Reinforcement Learning
Instructor ORCS 4529/IEOR 8100: Reinforcement Learning
Department of IEOR, Columbia University
Topic 1: Markov Decision Processes (MDP) and Dynamic Programming (DP) algorithms
· Slides
Topic 2: DP-based algorithms for Reinforcement Learning (RL)
· Slides
· Lecture Notes: Tabular Q-learning and TD-learning
· Lecture Notes: Q-learning with function approximation
Topic 3: Policy optimization methods
· Slides
· Lecture Notes: Policy gradient
· Lecture Notes: Conservative policy gradient and TRPO
Other topics (Slides only)
· Exploration-exploitation in RL
Please use this form to report an error and/or give comments/suggestions. Thank you for your contribution!