Topics in RL
Description:
This course will discuss the learning regret and sample complexity of reinforcement learning algorithms such as Q-learning and actor-critic algorithms. It will also cover the framework of interactive decision making and the Decision-Estimation Coefficient, a measure of statistical complexity that lower-bounds the optimal learning regret for interactive decision making.
Grading:
Class participation, midterm exam, and end-term project.
References:
- Stochastic Approximation: A Dynamical Systems Viewpoint by V. S. Borkar (Second Edition).
- Recent papers published in machine learning conferences.