Data Science Foundations: Python Scientific Stack

Introducing this tutorial


What you should know
Mac setup
Windows setup
Linux setup
How to use the exercise files

1. Scientific Python Overview

Ramp up with Scientific Python

2. The Jupyter Notebook

Start the notebook server
Use code cells
Extensions to the Python language
Understand markdown cells
Edit notebooks

3. NumPy Basics

Overview: NumPy
NumPy arrays
Learn Boolean indexing
Understand broadcasting
Understand array operations
Understand ufuncs

4. Pandas

Pandas overview
Load CSV files
Parse time
Access rows and columns
Use pure Python packages
Calculate speed
Display a speed box plot

5. Conda

Introduction to Python packages
Manage environments

6. Folium and Geo

Create an initial map
Draw a track on the map
Use geo data with Shapely
Generate a report

7. NY Taxi Data

Examine data
Load data from CSV files
Work with categorical data
Work with data: Hourly trip rides
Work with data: Rides per hour
Work with data: Weather data

8. scikit-learn

Introduction: scikit-learn
Learn regression on the Boston dataset
Understand train/test splits
Preprocess data
Compose pipelines
Save and load models

9. Plotting

Overview: matplotlib
Use styles
Customize Pandas output
Use matplotlib
Tips and tricks
Understand Bokeh

10. Other Packages

Other packages overview
Go faster with Numba and Cython
Understand deep learning
Work with image processing
Understand NLP: NLTK
Understand NLP: spaCy
Bigger data with HDF5 and Dask

11. Development Process

Understand source control
Learn code review
Testing overview
Testing example


Next steps