Topics in RL

Description:

This course covers the regret and sample complexity of reinforcement learning algorithms such as Q-learning and actor-critic methods. It also covers the framework of interactive decision making and the Decision-Estimation Coefficient, a measure of statistical complexity that lower-bounds the optimal regret for interactive decision making.

Grading:

Class participation, midterm exam, and end-term project.

References:

Stochastic Approximation: A Dynamical Systems Viewpoint by V. S. Borkar (second edition).
Recent papers published in machine learning conferences.
