# Artificial Intelligence: Reinforcement Learning in Python

\$10.99

Welcome :
Introduction
Course Outline and Big Picture
Where to get the Code
How to Succeed in this Course
Warmup

Return of the Multi-Armed Bandit :
Section Introduction: The Explore-Exploit Dilemma
Applications of the Explore-Exploit Dilemma
Epsilon-Greedy Theory
Calculating a Sample Mean (pt 1)
Epsilon-Greedy Beginner's Exercise Prompt
Epsilon-Greedy in Code
Comparing Different Epsilons
Optimistic Initial Values Theory
Optimistic Initial Values Beginner's Exercise Prompt
Optimistic Initial Values Code
UCB1 Theory
UCB1 Beginner's Exercise Prompt
UCB1 Code
Bayesian Bandits / Thompson Sampling Theory (pt 1)
Bayesian Bandits / Thompson Sampling Theory (pt 2)
Thompson Sampling Beginner's Exercise Prompt
Thompson Sampling Code
Thompson Sampling With Gaussian Reward Theory
Thompson Sampling With Gaussian Reward Code
Why don't we just use a library?
Nonstationary Bandits
Bandit Summary, Real Data, and Online Learning
(Optional) Alternative Bandit Designs
Suggestion Box

High Level Overview of Reinforcement Learning :
What is Reinforcement Learning?
From Bandits to Full Reinforcement Learning

Markov Decision Processes :
MDP Section Introduction
Gridworld
Choosing Rewards
The Markov Property
Markov Decision Processes (MDPs)
Future Rewards
Value Functions
The Bellman Equation (pt 1)
The Bellman Equation (pt 2)
The Bellman Equation (pt 3)
Bellman Examples
Optimal Policy and Optimal Value Function (pt 1)
Optimal Policy and Optimal Value Function (pt 2)
MDP Summary

Dynamic Programming :
Dynamic Programming Section Introduction
Iterative Policy Evaluation
Gridworld in Code
Iterative Policy Evaluation in Code
Windy Gridworld in Code
Iterative Policy Evaluation for Windy Gridworld in Code
Policy Improvement
Policy Iteration
Policy Iteration in Code
Policy Iteration in Windy Gridworld
Value Iteration
Value Iteration in Code
Dynamic Programming Summary

Monte Carlo :
Monte Carlo Intro
Monte Carlo Policy Evaluation
Monte Carlo Policy Evaluation in Code
Monte Carlo Control
Monte Carlo Control in Code
Monte Carlo Control without Exploring Starts
Monte Carlo Control without Exploring Starts in Code
Monte Carlo Summary

Temporal Difference Learning :
Temporal Difference Introduction
TD(0) Prediction
TD(0) Prediction in Code
SARSA
SARSA in Code
Q Learning
Q Learning in Code
TD Learning Section Summary

Approximation Methods :
Approximation Methods Section Introduction
Linear Models for Reinforcement Learning
Feature Engineering
Approximation Methods for Prediction
Approximation Methods for Prediction Code
Approximation Methods for Control
Approximation Methods for Control Code
CartPole
CartPole Code
Approximation Methods Exercise
Approximation Methods Section Summary

Interlude: Common Beginner Questions :
This Course vs. RL Book: What's the Difference?

Stock Trading Project with Reinforcement Learning :
Beginners, halt! Stop here if you skipped ahead
Data and Environment
How to Model Q for Q-Learning
Design of the Program
Code pt 1
Code pt 2
Code pt 3
Code pt 4

Setting Up Your Environment (FAQ by Student Request) :
Anaconda Environment Setup
How to install Numpy, Scipy, Matplotlib, Pandas, IPython, Theano, and TensorFlow

Extra Help With Python Coding for Beginners (FAQ by Student Request) :
How to Code by Yourself (part 1)
How to Code by Yourself (part 2)
Proof that using Jupyter Notebook is the same as not using it
Python 2 vs Python 3

Effective Learning Strategies for Machine Learning (FAQ by Student Request) :
How to Succeed in this Course (Long Version)
Is this for Beginners or Experts? Academic or Practical? Fast or slow-paced?
Machine Learning and AI Prerequisite Roadmap (pt 1)
Machine Learning and AI Prerequisite Roadmap (pt 2)

Appendix / FAQ Finale :
What is the Appendix?
BONUS: Where to get discount coupons and FREE deep learning material