MDPs and RL

Lectures

These are notes that I use to prepare for class; they are not intended to be official lecture notes. In particular, I don't proofread them carefully. However, if you do find typos, please let me know and I will correct them. You are expected to come to class and take notes. Please do not ask me to put up the notes before lectures, as I may not always be able to do this.

1. Summary of the Course

2. Markov Chains

3. MDPs with Discounted Cost

4. Value Iteration/Policy Iteration/LP Solution

5. Q-Learning

6. Function Approximation Using Neural Nets

7. Back-Propagation Algorithm

8. Convergence of Q-Learning

9. Stochastic Shortest Path Problem

10. Linear Function Approximation

11. The ODE Method

12. Average Cost MDPs

13. Q-Learning for Average-cost-per-stage MDPs

14. Policy Gradient Algorithm
