GA Data Science NY Section 7

Welcome to the General Assembly Data Science Handout page. Here I’ll be assembling handouts, walkthroughs, and links so everyone has references to follow up on after class.

##### Key Dates
• Feb 6: Project Presentations
• Feb 20: Data Review and Processing Presentations
• Last Day of Class: Final Presentations

### Lesson 1: Introduction to Data Science & Basic Data Manipulation

#### Slides
- Lesson 1 Slides

#### Handouts
- Unix Basics: Intro to the Command Line

### Lesson 2: Data Storage and Extraction

#### Slides
- Lesson 2 Slides

#### Handouts
- MySQL Tutorial
- Introduction to Python
- Python Exercises

### Lesson 3: Python and Data Manipulation

#### Handouts
- Python Exercises

### Lesson 4: Data Visualization

#### Assignment 1: Due Jan 9

#### Handouts
- Pandas and Data Viz Notebook

#### Slides
- Lesson 4 Slides

### Lesson 5: Introduction to Machine Learning

#### Slides
- Lesson 5 Slides

#### Handouts
- Sklearn and KNearest Neighbors

### Lesson 6: Linear Regression

#### Assignment 1: Due Jan 24

#### Slides
- Lesson 6 Slides

#### Handouts
- Linear Regression

### Lesson 7: Logistic Regression and Regularization

#### Slides
- Lesson 7 Slides

### Lesson 9: Decision Trees and Random Forests

#### Slides
- Lesson 9 Slides

#### Handouts
- Random Forest on Text Data

### Lesson 10: Classification Review

#### Slides
- Lesson 10 - Review Slides

#### Handouts
- In Class Review

### Lesson 11: Ensemble Learning

#### Slides
- Lesson 11 - Ensemble Learning
- Lesson 11 - K Means

#### Handouts
- Random Forest on Text Data

### Lesson 13: PCA and Unsupervised Learning

#### Slides
- Lesson 13 - PCA and SVD

#### Links
- A Tutorial on PCA
- Stanford PCA Tutorial

### Lesson 15: Further Topics in Unsupervised Learning

#### Slides
- Lesson 15: More Unsupervised Learning

### Lesson 17: Distributed Data Processing

#### Slides
- Lesson 17: Spark