Course: Introduction to Optimization
Course Information:
When: MW 4:30pm - 5:45pm (EST)
Where: Online (due to COVID-19)
Zoom: https://purdue-edu.zoom.us/j/9874334092?pwd=MVRVcHhjQXFac0VYUHcwUzdHK2tFUT09
Instructor: Pan Li
TA: Adarsh Barik, abarik@purdue.edu
Office Hour: Wed. 3:30pm-4:30pm, https://purdue-edu.zoom.us/j/99302728950
Content:
This course is an introductory course on optimization.
Courtesy warning: this year's lectures will place more emphasis on mathematical characterization and less on programming and implementation (though some remains).
Topics include (tentative):
Applications: Least squares (example: stock market), Least squares with regularization (example: gene regulatory networks), SVM (example: image, text classification), neural networks (example: image classification, self-driving cars, speech recognition), power allocation in wireless networks, Internet congestion control, Netflix recommendation problem
Inf vs min, sup vs max, global vs local optimality, existence of optimal solutions and the Weierstrass extreme value theorem, convex sets, convex and concave functions, global and local maxima for convex functions, uniqueness for strictly convex functions, tests for convex/concave functions and strongly convex/concave functions using Hessian matrix, gradient-based characterization of convex and concave functions
First-order necessary and second-order sufficient conditions for local minimum
Gradient descent, descent lemma, convergence analysis for a fixed step size for functions with Lipschitz-continuous derivatives, for convex functions and for strongly convex functions, the role of the condition number and a quadratic objective example (least squares is a special case); variable step sizes, Armijo’s rule and its convergence; applications to neural networks and the back propagation algorithm
Newton’s method and improvement in convergence rate locally; problems with implementing Newton’s method and some fixes
Constrained optimization, projected gradient descent, and its convergence for a fixed step size
Constrained optimization with equality and inequality constraints; Lagrange multipliers, the KKT necessary conditions and the KKT sufficient conditions; application to power control in wireless networks and the waterfilling solution; sensitivity and Lagrange multipliers
Penalty, barrier function methods, and their convergence
Duality: the primal and dual problems; strong and weak duality and the Slater conditions, minimax characterization, linear programming and the economic interpretation of the dual; applications of strong duality to SVM
Subgradients, subgradient descent and its convergence
Proximal gradient descent and application to sparse regression (leads to an algorithm called ISTA for the regularized least squares problem)
Augmented Lagrangian methods (aka method of multipliers)
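To illustrate the flavor of the topics above, here is a minimal sketch (not part of the official course materials) of gradient descent with a fixed step size applied to the least-squares objective f(x) = 0.5 * ||Ax - b||^2. The matrix A and vector b are synthetic; the step size 1/L, where L is the largest eigenvalue of A^T A (the Lipschitz constant of the gradient), is the choice analyzed via the descent lemma.

```python
import numpy as np

# Synthetic least-squares instance (illustrative only).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

# f(x) = 0.5 * ||Ax - b||^2 has gradient A^T (Ax - b), whose Lipschitz
# constant is the largest eigenvalue of A^T A; step = 1/L guarantees descent.
L = np.linalg.eigvalsh(A.T @ A).max()
step = 1.0 / L

x = np.zeros(5)
for _ in range(5000):
    x -= step * (A.T @ (A @ x - b))  # gradient step

# Compare against the closed-form least-squares solution.
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_star, atol=1e-6))
```

Because the objective is strongly convex here, the iterates contract toward the minimizer at a linear rate governed by the condition number of A^T A, as covered in the convergence lectures.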
Prerequisites
Students are expected to have the following background:
Evaluation
Homework: 4% * 9 = 36%
Programming: 12% * 2 = 24%
Scripting: 3% (bonus)
Two midterms (20% * 2 = 40%)
Materials
The most relevant course material:
Boyd & Vandenberghe. Convex Optimization. Cambridge University Press. 2004
D. Bertsekas. Nonlinear Programming, Athena Scientific, 2016.
Some of the course material is covered in the following books:
Ben-Tal & Nemirovski. Lectures on Optimization III: Convex Analysis, Nonlinear Programming Theory, Nonlinear Programming Algorithms, SIAM. 2013
Bertsekas, Nedich, & Ozdaglar. Convex Analysis and Optimization. Athena Scientific. 2003
Ben-Tal & Nemirovski. Lectures on Modern Convex Optimization, SIAM. 2011
Nesterov. Introductory Lectures on Convex Optimization: A Basic Course. Kluwer Academic. 2004
Lecture Notes
01/20 Lec1 Introduction and Applications of Optimization [typed-notes]
01/25 Lec2 Math Preliminaries (read matrix calculus)
01/27 Lec3 Optimality conditions [typed-notes]
02/01 Lec4 Convex Set [typed-notes]
02/03-02/08 Lec5-6: Convex Function [typed-notes1] [typed-notes2]
02/10 Lec7: Gradient Descent Basics and Descent Lemma [typed-notes]
02/15 Lec8: Gradient Descent Convergence I : Weak Assumption and Convex Assumption [typed-notes]
02/20 ---
02/22 Lec9: Gradient Descent Convergence II : Strong Convexity, Convergence Rate [typed-notes]
02/24 Lec10: Gradient Descent Convergence III : Adaptive Stepsize and Acceleration [typed-notes]
03/01 Lec11: Application I: Neural Networks and Backpropagation [typed-notes]
03/03 Lec12: Condition Number and Newton's Method [typed-notes]
03/08 --- Midterm I
03/10 Lec13: Projected Gradient Method [typed-notes]
03/15 Lec14: Lagrange Multiplier with Equality Constraints [typed-notes]
03/17 - 03/22 Lec15: Lagrange Multiplier with Inequality Constraints and KKT Condition [typed-notes]
03/24 Lec16: Weak and Strong Duality [typed-notes]
03/29 - 03/31 Lec17: Barrier (interior point method) and Penalty Methods
04/05 Lec18: Application II: Support Vector Machine [typed-notes]
04/07 Lec19: Application II: Augmented Lagrangian Method [typed-notes]
04/12 Lec20: Subgradient Descent Method [typed-notes]
04/14 Lec21: Frank-Wolfe Method [typed-notes]
04/19 Lec22: Smoothing and Proximal Point Method
04/21 Lec23: Proximal Gradient Method
04/26 Lec24: Guest Lecture: Learning to Optimize --- Dr. Atlas Wang
04/28 --- Midterm II
Finer written lecture notes courtesy of Siyuan Xu. Thanks!
[part 1, Lec 1 - Lec 9]
[part 2, Lec 10 - Lec 13]
[part 3, Lec 14 - Lec 17 first]
[part 4, Lec 17 second - Lec 20]
[part 5, Lec 21 - Lec 23]
Homeworks
[Homework 1] Math Preliminaries and Optimality Conditions. Due: 02/06;
[Homework 2] Convex Set and Convex Function I. Due: 02/13;
[Homework 3] Convex Function II and Descent Lemma. Due: 02/20;
[Homework 4] Convergence of Gradient Descent. Due: 03/06;
[Homework 5] Newton's Method and Projected Gradient. Due: 03/20;
[Homework 6] Lagrange Multipliers and KKT. Due: 03/27;
[Homework 7] KKT II and Strong Duality. Due: 04/03;
[Homework 8] Penalty Methods and Augmented Lagrangian Method. Due: 04/17;
[Homework 9] Subgradient Descent and Proximal Gradient. Due: 04/28