Data Science & Reinforcement Learning
Min-hwan Oh (minoh@snu.ac.kr, Office: 942-419)
Goals
Reinforcement learning is a general paradigm for learning to act under uncertainty, and it is applicable to a wide range of tasks, including robotics, game playing, user-interactive systems (e.g., recommender systems), and healthcare.
This course covers the fundamentals of reinforcement learning and its practice. It aims to provide hands-on experience with the algorithmic techniques of reinforcement learning (model-based methods, model-free methods, policy gradients, etc.). By the end of the course, students should be well-versed in both the fundamental principles of RL and the implementation of (deep) RL algorithms.
Content
01. Course Overview and Reinforcement Learning Introduction
02. Multi-Armed Bandits
03. Markov Decision Processes
04. Dynamic Programming for Solving MDPs
05. Monte Carlo Methods
06. Temporal Difference Learning I
07. Temporal Difference Learning II
08. Planning and Learning I
09. Planning and Learning II
10. Prediction with Approximation
11. Control with Approximation
12. Off-policy Methods with Approximation
13. Policy Gradient Methods
14. Recent Advances in Deep Reinforcement Learning
15. Final Project Presentations
Textbook
The course will mostly follow Sutton & Barto, which is available for free:
- Sutton & Barto, Reinforcement Learning: An Introduction, 2nd Edition.
- [PDF link] http://incompleteideas.net/book/RLbook2020.pdf
Grading Policy
- Attendance & Participation: 10%
- Assignment: 20%
- Midterm: 30%
- Final Project: 40%
Prerequisite
Undergraduate-level statistics and familiarity with programming in Python.
Note
There will be programming assignments based on Python + TensorFlow (or PyTorch) + OpenAI Gym. Familiarity with deep learning tools (e.g., TensorFlow, PyTorch) can be helpful but is not necessary.
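To give a rough sense of the level of Python involved, here is a minimal sketch of an epsilon-greedy agent for a multi-armed bandit (topic 02 in the content list). It is an illustration only, not taken from the actual course assignments: the function name `run_epsilon_greedy`, the Gaussian reward model, and all parameter values are assumptions made for this example, and it uses only the Python standard library rather than TensorFlow, PyTorch, or Gym.

```python
import random


def run_epsilon_greedy(true_means, steps=2000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Gaussian bandit (hypothetical example setup).

    true_means: mean reward of each arm (unknown to the agent).
    Returns (estimates, counts): the learned value estimate per arm and
    the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running average reward per arm
    counts = [0] * n_arms       # number of pulls per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Assumed reward model: unit-variance Gaussian around the arm's mean.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental sample-average update of the chosen arm's estimate.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_epsilon_greedy([0.0, 1.0, 5.0])
    print("estimates:", [round(q, 2) for q in estimates])
    print("pull counts:", counts)
```

With well-separated arm means, the agent should end up pulling the best arm most of the time, while the epsilon fraction of random pulls keeps the estimates of the other arms from going stale.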