Risk analytics and Robust Optimization - Graduate Course: Online Decision-Making

Online Decision-Making: The Utility of Fluid Heuristics & Optimism in the Face of Uncertainty

Topics to be covered in this six-week course

An introduction to Markov Decision Processes
Dynamic Resource-Constrained Reward Collection Problems
Online Allocation Problems
Multi-Armed Bandits & Bandits with Knapsacks
Contextual Bandits & Contextual Bandits with Knapsacks
Operationalising the principle of optimism in broader Reinforcement Learning problems

References:

Monograph by Alekh Agarwal, Nan Jiang, Sham Kakade, Wen Sun

Journal article by Santiago R. Balseiro, Omar Besbes, Dana Pizarro, Operations Research (In Press)

Journal article by Santiago R. Balseiro, Haihao Lu, Vahab Mirrokni, Operations Research 71(1), pp. 101-119

Monograph by Dylan J. Foster and Alexander Rakhlin

Lecture slides

Week 2: Dynamic Resource-Constrained Reward-Collection Problems: Introduction, Examples, Fluid approximation, and the CE heuristic (slides)

Week 3: The fluid approximation and the CE heuristic in problems (slides)

Please note: These slides are used only as a teaching aid for this class. They are not meant to be reproduced.