Reinforcement learning is a general paradigm for learning to act under uncertainty, and it is applicable to a wide range of tasks, including robotics, game playing, user-interactive systems (e.g., recommender systems), and healthcare.
This course covers the fundamentals of reinforcement learning and its practice. It aims to provide hands-on experience with the algorithmic techniques of reinforcement learning (model-based, model-free, policy gradients, etc.). By the end of the course, students should be well-versed in both the fundamental principles of RL and the implementation of (deep) RL algorithms.
01. Course Overview and Reinforcement Learning Introduction
02. Multi-Armed Bandits
03. Markov Decision Processes
04. Dynamic Programming for Solving MDPs
05. Monte Carlo Methods
06. Temporal Difference Learning I
07. Temporal Difference Learning II
08. Planning and Learning I
09. Planning and Learning II
10. Prediction with Approximation
11. Control with Approximation
12. Off-policy Methods with Approximation
13. Policy Gradient Methods
14. Recent Advances in Deep Reinforcement Learning
15. Final Project Presentations
The course will mostly follow Sutton & Barto, which is available for free:
· Sutton & Barto, Reinforcement Learning: An Introduction, 2nd Edition.
· Attendance & Participation: 10%
· Assignment: 20%
· Midterm: 30%
· Final Project: 40%
Undergraduate-level statistics and familiarity with programming in Python.
There will be programming assignments based on Python, TensorFlow (or PyTorch), and OpenAI Gym. Familiarity with deep learning tools such as TensorFlow or PyTorch is helpful but not required.
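For a rough sense of the level of Python involved, the sketch below implements an epsilon-greedy agent for a multi-armed bandit, the topic of the second lecture. It is an illustrative example only, not an actual course assignment. The function name `run_epsilon_greedy`, the Gaussian reward model, and all parameter values are assumptions chosen for the illustration, and it uses only the Python standard library rather than TensorFlow, PyTorch, or Gym.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Sample-average epsilon-greedy on a Gaussian multi-armed bandit.

    true_means: hypothetical mean reward of each arm (unknown to the agent).
    Returns the value estimates, the pull counts, and the total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    q = [0.0] * n_arms  # estimated value of each arm
    n = [0] * n_arms    # number of times each arm was pulled
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            a = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimate
            # (ties go to the lowest index).
            a = max(range(n_arms), key=lambda i: q[i])

        # Assumed reward model: unit-variance Gaussian around the arm's mean.
        r = rng.gauss(true_means[a], 1.0)

        # Incremental update of the sample-average estimate.
        n[a] += 1
        q[a] += (r - q[a]) / n[a]
        total_reward += r

    return q, n, total_reward


if __name__ == "__main__":
    q, n, total = run_epsilon_greedy([0.1, 0.5, 0.9])
    print("estimates:", [round(v, 2) for v in q])
    print("pull counts:", n)
    print("average reward:", round(total / sum(n), 3))
```

With a small `epsilon`, most pulls should go to the arm with the highest estimated value, while the occasional random pull keeps the other estimates from going stale.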