SC-652 Statistical Learning and Sequential Prediction

Instructor: Avishek Ghosh, Assistant Professor, SysCon and C-MInDS, IIT Bombay

Contact: Room 201, SysCon; Email: avishek_ghosh@iitb.ac.in 

Timing: Monday/Thursday, 2:00 - 3:25 pm

Classroom: LT 201, Lecture Hall Complex-2

TA: Bhavini Jeloka (Email: bhavini.jeloka@gmail.com)

Scribe Format: See here 

Scribe Schedule: See here

About the course: This course covers fundamental topics in online (or sequential) learning. It is roughly divided into four modules. In the first module, we look at prediction problems with experts and analyze algorithms such as weighted majority and exponential weights (Hedge). In the second module, we take up convex games played against the environment and obtain strategies with theoretical convergence guarantees. In the third module, we assume that the environment is stochastic and study the framework of Multi-Armed Bandits. Finally, in the fourth module, we dive into Reinforcement Learning (RL). Here, we aim to obtain sharp convergence bounds (both asymptotic and non-asymptotic, i.e., finite-time) for some standard RL algorithms such as TD learning and Q-learning.

Grading: 2 HWs (20%), 1 midterm (25%), Final (40%), Scribe (10%), and Class participation (5%)

References: We won't follow any standard textbook. The following are the references for the modules (PDFs of all of them are available online):


Other resources: Apart from these, course material will be drawn from the lecture notes of Prof. Arya Mazumdar's course on Applied Information Theory (UC San Diego), Prof. Aditya Gopalan's course on Online Learning and Prediction (IISc Bangalore), Prof. Kannan Ramchandran's course on Information Theory (UC Berkeley), and Prof. Peter Bartlett's course on Statistical Learning Theory (UC Berkeley), among others.

Lectures:

General Guidelines for Homeworks:

Homeworks: