Reinforcement Learning

Lecture Notes by Shipra Agrawal

 

Instructor, ORCS 4529/IEOR 8100: Reinforcement Learning

Department of IEOR, Columbia University

 

Topic 1: Markov Decision Processes (MDP) and Dynamic Programming (DP) algorithms

· Slides

· Lecture Notes

Topic 2: DP-based algorithms for Reinforcement Learning (RL)

· Slides

· Lecture Notes: Tabular Q-learning and TD-learning

· Lecture Notes: Q-learning with function approximation

Topic 3: Policy optimization methods

· Slides

· Lecture Notes: Policy gradient

· Lecture Notes: Actor-critic

· Lecture Notes: Conservative policy gradient and TRPO

Other topics (Slides only)

· Exploration-exploitation in RL

· MCTS-based planning

· Multi-agent RL

 

Please use this form to report an error and/or give comments/suggestions. Thank you for your contribution!