ECE7995: Online Decision Making
(Fall 2021)
Instructor: Xingyu Zhou, Wayne State University.
Contents: An introduction to the mathematical foundations of online learning, including online convex optimization, bandits, and reinforcement learning, with an emphasis on theorems and proofs. The course also covers basic Python implementations of commonly used algorithms.
Grades: 3 homework assignments and 1 final project
Lecture Notes (* notes may contain typos or errors; feel free to let me know :-))
Lecture 1 -- Intro [note][annotated]
Part I -- Online Convex Optimization
Lecture 2 -- A gentle start [note][annotated]
Lecture 3 -- Convexity and Follow-the-Leader [note][annotated]
Lecture 4 -- Follow-the-Regularized-Leader [note][annotated]
Lecture 5 -- Online Gradient Descent [note][annotated]
Lecture 6 -- Strongly Convex Regularizers [note][annotated]
Lecture 7 -- Online Mirror Descent (OMD) [note][annotated]
Lecture 8 -- Online Mirror Descent via Duality [note][annotated]
Lecture 9 -- Lazy and Active OMD [note][annotated]
Lecture 10 -- More on Active OMD and Local Norm [note][annotated]
Lecture 11 -- Regret Lower Bound [note][annotated]
Part II -- Bandits
Lecture 12 -- Adversarial Multi-Armed Bandits (MAB) [note][annotated]
Lecture 13 -- Stochastic MAB [note][annotated]
Lecture 14 -- MAB Algorithms [note][annotated]
Lecture 15 -- Upper Confidence Bound (UCB) Algorithm [note][annotated]
Lecture 16 -- Lower Bound for MAB [note][annotated]
Lecture 17 -- Linear Bandits [note][annotated]
Lecture 18 -- LinUCB (OFUL) [note][annotated]
Lecture 19 -- Intro to Gaussian Processes [note][annotated]
Lecture 20 -- Gaussian Process Bandits [note][annotated]
Lecture 21 -- Bayesian Optimization [note][annotated]
Part III -- Reinforcement Learning
Lecture 22 -- Intro to RL and MDP Basics [note][annotated]
Lecture 23 -- MDP Planning [note][annotated]
Lecture 24 -- RL Algorithms [note][annotated]
Lecture 25 -- Lower Bound and Linear MDPs [note][annotated]
Final Review
Lecture 26 -- Course Summary [note][annotated]
Assignments
Reference Materials
Online Convex Optimization
Online Learning and Online Convex Optimization, by Shai Shalev-Shwartz [online access]
A Modern Introduction to Online Learning, by Francesco Orabona [online access]
Bandit Learning
Introduction to Multi-Armed Bandits, by Aleksandrs Slivkins [online access]
Bandit Algorithms, by Tor Lattimore and Csaba Szepesvári [online access]
Reinforcement Learning
Reinforcement Learning: Theory and Algorithms, by Alekh Agarwal, Nan Jiang, Sham M. Kakade, and Wen Sun [online access]
Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto [online access]
Reference Courses
Online Prediction and Learning, by Aditya Gopalan [course website]
Foundations of Reinforcement Learning, by Chi Jin [course website]
Online and Adaptive Methods for Machine Learning, by Kevin Jamieson [course website]