# ECE7995: Online Decision Making

(Fall 2021)

Instructor: Xingyu Zhou, Wayne State University.

Contents: An introduction to the mathematical foundations of online learning, including online convex optimization, bandits, and reinforcement learning. The focus is mostly on theorems and proofs, but the course also covers basic Python implementations of commonly used algorithms.

Grades: 3 homework assignments and 1 final project

**Lecture Notes** (the notes may contain typos or errors. Feel free to let me know :-))

Lecture 1 -- Intro [note][annotated]

**Part I -- Online Convex Optimization**

Lecture 2 -- A gentle start [note][annotated]

Lecture 3 -- Convexity and Follow-the-Leader [note][annotated]

Lecture 4 -- Follow-the-Regularized-Leader [note][annotated]

Lecture 5 -- Online Gradient Descent [note][annotated]

Lecture 6 -- Strongly Convex Regularizers [note][annotated]

Lecture 7 -- Online Mirror Descent (OMD) [note][annotated]

Lecture 8 -- Online Mirror Descent via Duality [note][annotated]

Lecture 9 -- Lazy and Active OMD [note][annotated]

Lecture 10 -- More on Active OMD and Local Norm [note][annotated]

Lecture 11 -- Regret Lower Bound [note][annotated]

**Part II -- Bandits**

Lecture 12 -- Adversarial Multi-Armed Bandits (MAB) [note][annotated]

Lecture 13 -- Stochastic MAB [note][annotated]

Lecture 14 -- MAB Algorithms [note][annotated]

Lecture 15 -- Upper Confidence Bound (UCB) Algorithm [note][annotated]

Lecture 16 -- Lower Bound for MAB [note][annotated]

Lecture 17 -- Linear Bandits [note][annotated]

Lecture 18 -- LinUCB (OFUL) [note][annotated]

Lecture 19 -- Intro to Gaussian Processes [note][annotated]

Lecture 20 -- Gaussian Process Bandits [note][annotated]

Lecture 21 -- Bayesian Optimization [note][annotated]

**Part III -- Reinforcement Learning**

Lecture 22 -- Intro to RL and MDP Basics [note][annotated]

Lecture 23 -- MDP Planning [note][annotated]

Lecture 24 -- RL Algorithms [note][annotated]

Lecture 25 -- Lower Bound and Linear MDPs [note][annotated]

**Final Review**

Lecture 26 -- Course Summary [note][annotated]

**Assignments**

**Reference Materials**

Online Convex Optimization

Online Learning and Online Convex Optimization, by Shai Shalev-Shwartz [online access]

A Modern Introduction to Online Learning, by Francesco Orabona [online access]

Bandit Learning

Introduction to Multi-Armed Bandits, by Aleksandrs Slivkins [online access]

Bandit Algorithms, by Tor Lattimore and Csaba Szepesvari [online access]

Reinforcement Learning

Reinforcement Learning: Theory and Algorithms, by Alekh Agarwal, Nan Jiang, Sham M. Kakade, and Wen Sun [online access]

Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto [online access]

**Reference Courses**

Online Prediction and Learning, by Aditya Gopalan [course website]

Foundations of Reinforcement Learning, by Chi Jin [course website]

Online and Adaptive Methods for Machine Learning, by Kevin Jamieson [course website]