Data Science & Reinforcement Learning
Min-hwan Oh (minoh@snu.ac.kr, Office: 942-419)

Goals

Reinforcement learning is a general paradigm for learning to act under uncertainty, and it is applicable to a wide range of tasks, including robotics, game playing, user-interactive systems (e.g., recommender systems), and healthcare.

This course covers the fundamentals of reinforcement learning and its practice. It aims to provide hands-on experience with the algorithmic techniques of reinforcement learning (model-based methods, model-free methods, policy gradients, etc.). Students will become well-versed in both the fundamental principles of RL and the implementation of (deep) RL algorithms.

Content

01. Course Overview and Reinforcement Learning Introduction

02. Multi-Armed Bandits

03. Markov Decision Processes

04. Dynamic Programming for Solving MDPs

05. Monte Carlo Methods

06. Temporal Difference Learning I

07. Temporal Difference Learning II

08. Planning and Learning I

09. Planning and Learning II

10. Prediction with Approximation

11. Control with Approximation

12. Off-policy Methods with Approximation

13. Policy Gradient Methods

14. Recent Advances in Deep Reinforcement Learning

15. Final Project Presentations

Textbook

The course will mostly follow Sutton & Barto, which is available for free:

· Sutton & Barto, Reinforcement Learning: An Introduction, 2nd Edition.
[PDF link] http://incompleteideas.net/book/RLbook2020.pdf

Grading Policy

· Attendance & Participation: 10%
· Assignments: 20%
· Midterm: 30%
· Final Project: 40%

Prerequisites

Undergraduate-level statistics and familiarity with programming in Python.

Note

There will be programming assignments based on Python + TensorFlow (or PyTorch) + OpenAI Gym. Familiarity with deep learning tools such as TensorFlow or PyTorch is helpful but not required.
