Reinforcement Learning

Lecture Notes by Shipra Agrawal

Instructor, ORCS 4529/IEOR 8100: Reinforcement Learning

Department of IEOR, Columbia University

Topic 1: Markov Decision Processes (MDP) and Dynamic Programming (DP) algorithms

· Lecture Notes

Topic 2: DP-based algorithms for Reinforcement Learning (RL)

· Lecture Notes: Tabular Q-learning and TD-learning

· Lecture Notes: Q-learning with function approximation

Topic 3: Policy optimization methods

· Lecture Notes: Policy gradient

· Lecture Notes: Actor-critic

· Lecture Notes: Conservative policy gradient and TRPO

Other topics (Slides only)

· Exploration-exploitation in RL

· MCTS-based planning

· Multi-agent RL

Please use this form to report an error and/or give comments/suggestions. Thank you for your contribution!