ELE539/COS512: Optimization for Machine Learning
Spring 2021, Mon. Wed. 09:30 am - 10:50 am
Instructor: Chi Jin Office hour: Mon. 4:00 pm - 5:00 pm
TA: Qinghua Liu Office hour: Tue. 4:00 pm - 5:00 pm
Contents: Optimization theory---algorithms and complexity analyses
Grades: 5 problem sets (60%), 1 scribe note (10%), and 1 final exam (30%). The final will count for 40% for students who do not complete a scribe note.
No late homework.
Scribe note [sign up sheet] and [template].
*Special thanks to Tianyi Lin for his enormous help in developing the course material.
Lectures
Convex Optimization
2/1. Introduction. [Jamboard][video][note] (Ch 1.1, 1.3 in Bubeck)
2/3. Black-box model, basic properties of convexity. [Jamboard][video][note] (Ch 1.2, 1.4 in Bubeck)
2/8. The ellipsoid method. [Jamboard][video][note] (Ch 2.2 in Bubeck)
2/10. Gradient descent for Lipschitz functions. [Jamboard][video][note] (Ch 3.1, 3.4.1 in Bubeck)
2/15. Gradient descent for smooth functions. [Jamboard][video][note] (Ch 3.2, 3.4.2 in Bubeck)
2/17. Nesterov's accelerated gradient descent. [Jamboard][video][note] (Ch 3.7 in Bubeck)
2/22. Lower Bounds: linear span algorithms. [Jamboard][video][note] (Ch 3.5 in Bubeck)
2/24. Lower Bounds II: general randomized algorithms. [Jamboard][video][note] [Woodworth & Srebro, 2017]
3/3. Mirror Descent II. [Jamboard][video] (Ch 4.2, 4.3 in Bubeck)
3/8. Stochastic Gradient Descent. [Jamboard][video][note] (Ch 6.1, 6.2 in Bubeck)
3/10. Stochastic Variance Reduced Gradient. [Jamboard][video][note] [Johnson & Zhang, 2013]
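As a companion to the gradient-descent lectures above, a minimal sketch of gradient descent with the standard 1/L step size on a smooth, strongly convex quadratic (the test matrix, function, and all names are illustrative, not taken from the course notes):

```python
import numpy as np

# Gradient descent with step size 1/L on f(x) = 0.5 * x^T A x - b^T x,
# a smooth convex quadratic; L is the largest eigenvalue of A.
def gradient_descent(A, b, x0, num_iters=1000):
    L = np.linalg.eigvalsh(A).max()   # smoothness constant of the quadratic
    x = x0.astype(float)
    for _ in range(num_iters):
        grad = A @ x - b              # gradient of f at x
        x = x - grad / L              # classic 1/L step
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # positive definite, so f is strongly convex
b = np.array([1.0, 1.0])
x_star = np.linalg.solve(A, b)           # exact minimizer, for comparison
x_hat = gradient_descent(A, b, np.zeros(2))
```

On a strongly convex quadratic like this, the iterates contract toward the minimizer at a linear rate, so a thousand iterations reach machine precision.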
Nonconvex Optimization
3/17. Nonconvex optimization overview, power methods. [Jamboard][video][note]
3/31. Escaping saddle points II. [Jamboard][video][note] [Jin et al., 2019]
4/5. Examples, optimal rates, high-order saddle points. [Jamboard][video][note]
4/7. Lecture cancelled.
4/12. (Guest lecture by Cong Fang) Stochastic nonconvex optimization, SPIDER. [Jamboard][video][note] [Fang et al., 2018]
4/14. Nonsmooth nonconvex optimization. [Jamboard][video][note] [Davis & Drusvyatskiy, 2018]
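For the eigenvector problems treated in this unit, a toy power-method sketch for the top eigenvector of a symmetric matrix (the matrix and all names are illustrative, not from the lecture notes):

```python
import numpy as np

# Power method: repeatedly apply A and renormalize; the iterate aligns with
# the top eigenvector at a rate governed by the eigengap lambda_2 / lambda_1.
def power_method(A, num_iters=500, seed=0):
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)            # random unit-norm initialization
    for _ in range(num_iters):
        v = A @ v
        v /= np.linalg.norm(v)        # renormalize each step
    eigval = v @ A @ v                # Rayleigh quotient estimate of lambda_1
    return eigval, v

A = np.diag([4.0, 2.0, 1.0])          # top eigenpair: (4, +/- e_1)
lam, v = power_method(A)
```

Maximizing the Rayleigh quotient over the sphere is a simple nonconvex problem, which is why it makes a good warm-up for this unit.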
Minimax Optimization
4/19. Minimax Optimization, gradient descent ascent for Lipschitz functions. [Jamboard][video][note]
4/21. Gradient descent ascent for smooth functions. [Jamboard][video][note]
4/26. Extragradient method, last iterate convergence, and optimal rates. [Jamboard][video][note]
4/28. Nonconvex minimax optimization. [Jamboard][video][note]
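To accompany the extragradient lecture, a toy sketch on the bilinear saddle problem f(x, y) = x * y, whose unique saddle point is (0, 0); the step size and all names are illustrative, not from the course materials:

```python
# Extragradient on min_x max_y f(x, y) = x * y: take a gradient step to a
# midpoint, then update using the gradients evaluated at that midpoint.
def extragradient(x0, y0, eta=0.1, num_iters=5000):
    x, y = float(x0), float(y0)
    for _ in range(num_iters):
        # extrapolation step (grad_x f = y, grad_y f = x)
        x_half = x - eta * y
        y_half = y + eta * x
        # update step using gradients at the midpoint
        x_new = x - eta * y_half
        y_new = y + eta * x_half
        x, y = x_new, y_new
    return x, y

x, y = extragradient(1.0, 1.0)
```

On this bilinear problem, plain simultaneous gradient descent ascent spirals outward, while the extragradient iterates contract to the saddle point, which is the contrast the lectures above develop.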
Homework
Homework 1 (due on Mar. 3, 11:59 PM on Blackboard)
Homework 2 (due on Mar. 16, 11:59 PM on Blackboard)
Homework 3 (due on Mar. 30, 11:59 PM on Blackboard)
Homework 4 (due on Apr. 22, 11:59 PM on Blackboard)
Homework 5 (due on May 4, 11:59 PM on Blackboard)
Schedule (weekly basis)
Convex Optimization:
Intro, black-box model
Subgradient methods, gradient methods
Nesterov's acceleration, momentum
Lower bounds [Homework 1 due]
Mirror descent
Stochastic algorithms, variance reduction [Homework 2 due]
Nonconvex Optimization:
Eigenvector problems, power methods, Lanczos algorithm
Finding stationary points [Homework 3 due]
Escaping saddle points
Nonsmooth (weakly convex) optimization [Homework 4 due]
Minimax Optimization:
Convex-concave: gradient descent ascent, extragradient method
Gradient descent with max-oracle [Homework 5 due]
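For the stochastic-algorithms week in the schedule above, a toy SGD sketch on least squares with a decaying step size (the data, step-size schedule, and all names are illustrative assumptions, not from the course materials):

```python
import numpy as np

# Plain SGD on the least-squares objective (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2,
# taking one stochastic gradient step per shuffled sample.
def sgd(A, b, num_epochs=200, seed=0):
    n, d = A.shape
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    t = 0
    for _ in range(num_epochs):
        for i in rng.permutation(n):              # one pass over shuffled samples
            t += 1
            eta = 0.1 / (1.0 + 0.01 * t)          # decaying step size
            x = x - eta * (A[i] @ x - b[i]) * A[i]  # stochastic gradient step
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 2))
x_true = np.array([1.0, -1.0])
b = A @ x_true                 # consistent system: the optimum has zero residual
x_hat = sgd(A, b)
```

Because the system is consistent, every per-sample gradient vanishes at the optimum, so SGD converges to it rather than oscillating in a noise ball; with noisy labels one would instead see the O(1/sqrt(T)) behavior from the SGD lecture.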
Recommended Textbook
Convex Optimization: Algorithms and Complexity, by Sébastien Bubeck
Lecture Notes: Optimization for Machine Learning, by Elad Hazan
Convex Optimization, by Stephen Boyd and Lieven Vandenberghe
Reference Readings
Last-iterate analysis of GD/SGD for Lipschitz convex functions: [Shamir & Zhang, 2013].
Optimal algorithms in the stochastic gradient setting for strongly convex and smooth functions: [Ghadimi & Lan, 2013a] [Ghadimi & Lan, 2013b]
Optimal algorithms for the finite-sum problems and the matching lower bounds: [Lin et al., 2018] [Allen-Zhu, 2017] [Woodworth & Srebro, 2016]
OGDA and Extragradient for convex-concave optimization: [Mokhtari et al., 2019a] [Mokhtari et al., 2019b]
Optimal algorithm for strongly-convex-strongly-concave optimization: [Lin et al., 2020]
Related Courses
Elad Hazan, Optimization for Machine Learning
Yuxin Chen, Large-Scale Optimization for Data Science