INDENG 290: Stochastic Optimization for Machine Learning
Course description: This course provides graduate students with a rigorous foundation in the theory, models, and algorithms of stochastic optimization, and illustrates how these methods apply to problems in contemporary machine learning and related disciplines. Along the way, students will learn about optimization under uncertainty and the analytical tools that support it. Mathematical rigor is emphasized throughout the course; students are expected to be interested in such rigor and willing to learn and practice it.
Prerequisites: Students should have completed a foundational course in optimization such as INDENG 262A or its equivalent. While not mandatory, students are strongly encouraged to concurrently enroll in INDENG 262B if they have not already completed it.
Lecture: (Spring 2025) Monday, Wednesday 10:00am - 11:30am, Etcheverry 1174
References:
A. Shapiro, D. Dentcheva, A. Ruszczynski. Lectures on Stochastic Programming: Modeling and Theory. SIAM 2009.
J. Birge and F. Louveaux. Introduction to Stochastic Programming. Springer 2011.
F. Bach. Learning Theory from First Principles. MIT Press 2024.
J. Duchi. Introductory Lectures on Stochastic Optimization. In The Mathematics of Data, IAS/Park City Mathematics Series, AMS 2018.
Outline:
Week 1:
Introduction to modeling uncertainty
Models for two-stage stochastic programs; Value of information
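Example (a minimal sketch in Python/NumPy): a scenario-based newsvendor model contrasting the here-and-now solution with the wait-and-see solution; the cost parameters c, p, h and the uniform demand scenarios are illustrative assumptions, not course data.

    import numpy as np

    rng = np.random.default_rng(0)
    c, p, h = 1.0, 4.0, 0.5                    # order cost, shortage penalty, holding cost
    demand = rng.uniform(50, 150, size=1000)   # equally likely demand scenarios

    def expected_cost(x, d):
        # first-stage order x; second-stage recourse pays for shortage or surplus
        return c * x + np.mean(p * np.maximum(d - x, 0) + h * np.maximum(x - d, 0))

    # here-and-now: one x chosen before demand is revealed (grid search for simplicity)
    xs = np.linspace(0, 200, 2001)
    here_and_now = min(expected_cost(x, demand) for x in xs)

    # wait-and-see: a clairvoyant orders exactly x = d in every scenario
    wait_and_see = np.mean(c * demand)

    evpi = here_and_now - wait_and_see         # expected value of perfect information
    print(f"here-and-now {here_and_now:.2f}, wait-and-see {wait_and_see:.2f}, EVPI {evpi:.2f}")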
Week 2:
Decomposable structures of two-stage stochastic programs
Decomposition methods for coupling constraints: Lagrangian relaxation and progressive hedging
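Example (a sketch of progressive hedging on the same newsvendor model; the penalty rho and the scenario set are illustrative, and SciPy's bounded scalar minimizer stands in for a proper subproblem solver):

    import numpy as np
    from scipy.optimize import minimize_scalar

    rng = np.random.default_rng(0)
    c, p, h, rho = 1.0, 4.0, 0.5, 1.0
    demand = rng.uniform(50, 150, size=20)     # scenarios, equal probability
    S = len(demand)

    def scenario_cost(x, d):
        return c * x + p * max(d - x, 0) + h * max(x - d, 0)

    x = demand.copy()                 # per-scenario first-stage decisions
    w = np.zeros(S)                   # multipliers on the nonanticipativity constraint
    for it in range(100):
        xbar = x.mean()
        for s in range(S):            # scenario subproblems with augmented-Lagrangian terms
            obj = lambda z, d=demand[s], ws=w[s]: (scenario_cost(z, d)
                    + ws * z + 0.5 * rho * (z - xbar) ** 2)
            x[s] = minimize_scalar(obj, bounds=(0, 300), method="bounded").x
        w += rho * (x - x.mean())     # dual update pushing the scenarios toward consensus
    print("consensus order quantity:", x.mean(), " spread:", x.max() - x.min())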
Week 3:
Decomposition methods for coupling variables: Benders decomposition
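Example (a sketch of the L-shaped method, i.e. Benders decomposition applied to the expected recourse function of the newsvendor model; the cuts come from closed-form subgradients, the master problem is a linear program solved with scipy.optimize.linprog, and all parameters are illustrative):

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    c, p, h = 1.0, 4.0, 0.5
    demand = rng.uniform(50, 150, size=1000)

    def recourse(x):                  # expected second-stage cost and a subgradient
        val = np.mean(p * np.maximum(demand - x, 0) + h * np.maximum(x - demand, 0))
        sub = np.mean(np.where(demand > x, -p, h))
        return val, sub

    cuts_A, cuts_b = [], []
    x, theta = 0.0, -np.inf
    for _ in range(100):
        val, g = recourse(x)
        if theta >= val - 1e-6:       # master's estimate theta matches Q(x): optimal
            break
        # optimality cut theta >= val + g*(x' - x), written as g*x' - theta <= g*x - val
        cuts_A.append([g, -1.0])
        cuts_b.append(g * x - val)
        res = linprog([c, 1.0], A_ub=cuts_A, b_ub=cuts_b,
                      bounds=[(0, 300), (0, None)], method="highs")
        x, theta = res.x
    print(f"first-stage decision x = {x:.2f} after {len(cuts_b)} cuts")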
Week 4:
Sample average approximations (i): optimistic bias and asymptotic performance
Week 5:
Sample average approximations (ii): non-asymptotic performance and generalization error
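Example (a Monte Carlo check of the optimistic bias E[min_x f_n(x)] <= min_x E[f(x)]; here f(x, xi) = (x - xi)^2 with standard normal xi, so the SAA minimizer is the sample mean and the SAA optimal value is the biased sample variance with mean (n-1)/n):

    import numpy as np

    rng = np.random.default_rng(0)
    n, reps = 10, 100_000             # sample size per SAA problem, number of replications
    saa_vals = np.empty(reps)
    for r in range(reps):
        xi = rng.standard_normal(n)
        x_hat = xi.mean()                         # SAA minimizer in closed form
        saa_vals[r] = np.mean((x_hat - xi) ** 2)  # SAA optimal value
    # true optimal value is min_x E[(x - xi)^2] = Var(xi) = 1, attained at x = 0
    print("E[SAA optimal value] ~", saa_vals.mean(), " vs true value 1, bias (n-1)/n =", (n - 1) / n)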
Week 6:
Stochastic algorithms (i): convergence and sample complexity
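Example (a sketch of SGD on a streaming least-squares problem with an O(1/sqrt(t)) stepsize schedule and Polyak-Ruppert averaging; the dimension, noise level, and stepsize constant are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    d = 5
    x_star = rng.standard_normal(d)

    def sample_grad(x):               # stochastic gradient from one fresh data point
        a = rng.standard_normal(d)
        b = a @ x_star + 0.1 * rng.standard_normal()
        return (a @ x - b) * a        # gradient of the loss 0.5 * (a @ x - b)^2

    x, avg = np.zeros(d), np.zeros(d)
    T = 100_000                       # one sample per step: T is also the sample complexity
    for t in range(1, T + 1):
        x -= (0.1 / np.sqrt(t)) * sample_grad(x)
        avg += (x - avg) / t          # running Polyak-Ruppert average of the iterates
    print("error of averaged iterate:", np.linalg.norm(avg - x_star))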
Week 7:
Stochastic algorithms (ii): adaptive stepsizes and variance reduction
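Example (a sketch of SVRG-style variance reduction on a finite-sum least-squares problem; the stepsize and inner-loop length are illustrative, not tuned):

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 200, 10
    A = rng.standard_normal((n, d))
    b = A @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)
    x_star = np.linalg.lstsq(A, b, rcond=None)[0]

    grad_i = lambda x, i: (A[i] @ x - b[i]) * A[i]   # gradient of 0.5*(A[i] @ x - b[i])^2
    full_grad = lambda x: A.T @ (A @ x - b) / n

    x = np.zeros(d)
    eta, m = 0.01, 2 * n              # stepsize and inner-loop length (illustrative)
    for epoch in range(30):
        snap, mu = x.copy(), full_grad(x)            # snapshot point and its full gradient
        for _ in range(m):
            i = rng.integers(n)
            # control-variate gradient: unbiased, with variance vanishing at the optimum
            g = grad_i(x, i) - grad_i(snap, i) + mu
            x = x - eta * g
    print("distance to least-squares solution:", np.linalg.norm(x - x_star))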
Week 8:
Stochastic algorithms (iii): geometry-aware design
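Example (a sketch of mirror descent with the entropy mirror map, i.e. exponentiated gradient, minimizing a linear function over the probability simplex; this geometry gives a sqrt(log d) rather than sqrt(d) dimension dependence in the standard regret bounds):

    import numpy as np

    rng = np.random.default_rng(0)
    d = 50
    c = rng.standard_normal(d)        # minimize <c, x> over the probability simplex
    x = np.full(d, 1.0 / d)           # uniform distribution: the entropy prox-center
    eta = 0.1                         # illustrative stepsize
    for _ in range(500):
        x = x * np.exp(-eta * c)      # mirror (exponentiated-gradient) step
        x /= x.sum()                  # normalization = Bregman projection onto the simplex
    print("min coordinate of c:", c.min(), " achieved value:", float(c @ x))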
Week 9:
Computer-assisted convergence analysis; performance estimation problems
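Example (a sketch of the performance estimation problem for N steps of gradient descent with stepsize 1/L on L-smooth convex functions, in the standard Gram-matrix SDP formulation; it assumes cvxpy with an SDP-capable solver, and the computed worst case should match the known tight bound L/(2(2N+1))):

    import numpy as np
    import cvxpy as cp

    L, N = 1.0, 5                     # smoothness constant, number of gradient steps
    dim = N + 2                       # Gram basis: [x0 - x*, g0, ..., gN]
    G = cp.Variable((dim, dim), PSD=True)   # Gram matrix of the basis vectors
    F = cp.Variable(N + 1)                  # F[i] = f(x_i) - f(x*)

    def x_coef(i):                    # coordinates of x_i - x* (i = -1 encodes x*)
        v = np.zeros(dim)
        if i >= 0:
            v[0] = 1.0
            v[1:i + 1] = -1.0 / L     # x_i = x0 - (1/L) * (g0 + ... + g_{i-1})
        return v

    def g_coef(i):                    # coordinates of the gradient g_i (g* = 0)
        v = np.zeros(dim)
        if i >= 0:
            v[i + 1] = 1.0
        return v

    fval = lambda i: 0 if i < 0 else F[i]

    pts = range(-1, N + 1)
    cons = [x_coef(0) @ G @ x_coef(0) <= 1]   # ||x0 - x*||^2 <= 1
    for i in pts:
        for j in pts:
            if i == j:
                continue
            dx = x_coef(i) - x_coef(j)
            dg = g_coef(i) - g_coef(j)
            # interpolation condition characterizing L-smooth convex functions
            cons.append(fval(i) >= fval(j) + g_coef(j) @ G @ dx + dg @ G @ dg / (2 * L))

    prob = cp.Problem(cp.Maximize(F[N]), cons)
    prob.solve()
    print("PEP worst case f(xN) - f*:", prob.value)
    print("known tight bound        :", L / (2 * (2 * N + 1)))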
Week 10:
Overparameterization (i)
Week 11:
Overparameterization (ii)
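Example (a sketch of the implicit bias of gradient descent in an overparameterized linear model: started from zero, it converges to the minimum-norm interpolant; the sizes and iteration count are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 20, 100                    # many more parameters than data points
    A = rng.standard_normal((n, d))
    b = rng.standard_normal(n)

    # gradient descent from zero on the underdetermined least-squares problem
    x = np.zeros(d)
    eta = 1.0 / np.linalg.norm(A, 2) ** 2      # 1 / (largest singular value)^2
    for _ in range(20_000):
        x -= eta * A.T @ (A @ x - b)

    x_min_norm = np.linalg.pinv(A) @ b         # minimum-norm interpolating solution
    print("training residual        :", np.linalg.norm(A @ x - b))
    print("distance to min-norm sol.:", np.linalg.norm(x - x_min_norm))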
Week 12:
Discrete algorithms from continuous time dynamics
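Example (forward Euler applied to the gradient flow dx/dt = -f'(x) recovers gradient descent with stepsize eta; for a quadratic both have closed forms, so the discretization gap is visible directly; all constants are illustrative):

    import numpy as np

    lam, eta, T = 2.0, 0.05, 4.0      # curvature, Euler stepsize, time horizon
    k = int(T / eta)                  # number of discrete steps covering [0, T]
    x0 = 1.0
    # gradient flow for f(x) = 0.5*lam*x^2 has solution x(t) = x0 * exp(-lam*t)
    exact = x0 * np.exp(-lam * T)
    # forward Euler: x_{j+1} = x_j - eta*lam*x_j, i.e. gradient descent with stepsize eta
    euler = x0 * (1 - eta * lam) ** k
    print(f"flow {exact:.6f}  gradient descent {euler:.6f}  gap {abs(exact - euler):.2e}")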
Week 13:
Sampling as optimization over probability spaces (i)
Week 14:
Sampling as optimization over probability spaces (ii)
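Example (a sketch of the unadjusted Langevin algorithm, the Euler discretization of the Langevin SDE, equivalently of the Wasserstein gradient flow of the KL divergence; the standard Gaussian target and the stepsize are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    # target density proportional to exp(-U(x)) with U(x) = x^2/2 (standard Gaussian)
    grad_U = lambda x: x
    eta, burn, n_steps = 0.01, 10_000, 200_000
    x, samples = 0.0, []
    for k in range(n_steps):
        # Langevin step: gradient descent on U plus injected Gaussian noise
        x = x - eta * grad_U(x) + np.sqrt(2 * eta) * rng.standard_normal()
        if k >= burn:
            samples.append(x)
    samples = np.array(samples)
    print("empirical mean (target 0):", samples.mean())
    print("empirical var  (target 1):", samples.var())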