Course Information:
When: TTh 3:00 pm - 4:15 pm (EST)
Where: Lawson Computer Science Bldg 1106, and via Zoom (due to the pandemic):
https://purdue-edu.zoom.us/j/9874334092?pwd=MVRVcHhjQXFac0VYUHcwUzdHK2tFUT09
Instructor: Pan Li
Office Hours: Tuesdays, 3:30 pm - 4:30 pm, LWSN 3154L
TA: Shu Li, li3916@purdue.edu
Office Hours: Wednesdays, 2:00 pm - 3:00 pm, LWSN 3151 #9
Content:
This course is an introductory course on optimization.
Topics include (tentative):
Applications: Least squares (example: stock market), Least squares with regularization (example: gene regulatory networks), SVM (example: image, text classification), neural networks (example: image classification, self-driving cars, speech recognition), power allocation in wireless networks, Internet congestion control, Netflix recommendation problem
Inf vs min, sup vs max, global vs local optimality, existence of optimal solutions and the Weierstrass extreme value theorem, convex sets, convex and concave functions, global and local minima for convex functions, uniqueness for strictly convex functions, tests for convex/concave and strongly convex/concave functions using the Hessian matrix, gradient-based characterization of convex and concave functions
First-order necessary and second-order sufficient conditions for local minimum
Gradient descent, the descent lemma, convergence analysis for a fixed step size for functions with Lipschitz-continuous derivatives, for convex functions, and for strongly convex functions; the role of the condition number and a quadratic objective example (least squares is a special case); variable step sizes, Armijo’s rule and its convergence; applications to neural networks and the backpropagation algorithm (a small code sketch of fixed-step gradient descent appears after this list)
Newton’s method and its locally improved convergence rate; practical issues with implementing Newton’s method and some fixes (a small code sketch appears after this list)
Constrained optimization, projected gradient descent, and its convergence for a fixed step size
Constrained optimization with equality and inequality constraints; Lagrange multipliers, the KKT necessary conditions and the KKT sufficient conditions; application to power control in wireless networks and the waterfilling solution; sensitivity and Lagrange multipliers
Penalty and barrier function methods, and their convergence
Duality: the primal and dual problems; strong and weak duality and the Slater conditions, minimax characterization, linear programming and the economic interpretation of the dual; applications of strong duality to SVM
Subgradients, subgradient descent and its convergence
Proximal gradient descent and its application to sparse regression (this leads to an algorithm called ISTA for the l1-regularized least squares problem; a small code sketch appears after this list)
Augmented Lagrangian methods (aka method of multipliers)
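To make the gradient-descent topic above concrete, here is a minimal Python sketch (not part of the official course materials; the problem data, dimensions, and iteration count are arbitrary illustrative choices) of fixed-step gradient descent on the least-squares objective f(x) = 0.5 ||Ax - b||^2, with step size 1/L, where L is the largest eigenvalue of A^T A (the Lipschitz constant of the gradient):

    import numpy as np

    # Illustrative sketch: gradient descent on f(x) = 0.5 * ||A x - b||^2.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 10))
    b = rng.standard_normal(50)

    def grad(x):
        return A.T @ (A @ x - b)              # gradient of 0.5 * ||A x - b||^2

    L = np.linalg.eigvalsh(A.T @ A).max()     # Lipschitz constant of the gradient
    x = np.zeros(10)
    for _ in range(500):
        x = x - (1.0 / L) * grad(x)           # fixed step size 1/L

    # Compare with the closed-form least-squares solution.
    print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0], atol=1e-6))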
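In the same spirit, a minimal Newton’s-method sketch (illustrative only; the objective f(x) = sum_i (exp(x_i) - x_i) is chosen simply because its gradient and Hessian are easy to write down and its minimizer is x = 0):

    import numpy as np

    # Illustrative sketch: Newton's method on f(x) = sum(exp(x_i) - x_i).
    def grad(x):
        return np.exp(x) - 1.0

    def hess(x):
        return np.diag(np.exp(x))

    x = np.full(5, 2.0)
    for _ in range(10):
        x = x - np.linalg.solve(hess(x), grad(x))   # Newton step: solve H d = grad

    print(np.abs(x).max())   # local quadratic convergence drives the iterates to 0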
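Finally, a minimal sketch of the ISTA idea from the proximal-gradient item: one gradient step on the smooth least-squares term followed by soft-thresholding, which is the proximal operator of the l1 norm (the problem sizes and the regularization weight lam are arbitrary illustrative choices):

    import numpy as np

    # Illustrative ISTA sketch for min_x 0.5 * ||A x - b||^2 + lam * ||x||_1.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((50, 20))
    b = rng.standard_normal(50)
    lam = 0.5

    def soft_threshold(v, t):
        # proximal operator of t * ||.||_1
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    L = np.linalg.eigvalsh(A.T @ A).max()     # same 1/L step size as gradient descent
    x = np.zeros(20)
    for _ in range(1000):
        x = soft_threshold(x - (1.0 / L) * (A.T @ (A @ x - b)), lam / L)

    print("nonzero entries:", np.count_nonzero(x))   # the l1 term induces sparsity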
Prerequisites
Students are expected to have the following background:
Evaluation
Homework: 4% * 5 = 20%
Programming: 6% * 2 = 12%
Two midterms: 20% * 2 = 40%
One course project: 28% = 5% proposal (teaming, topic, reference) + 8% presentation (15 mins) + 15% report
Self-exercises: 3% (Bonus)
Materials
The most relevant course material:
Boyd & Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
D. Bertsekas. Nonlinear Programming. Athena Scientific, 2016.
Some of the course material is covered in the following books:
Ben-Tal & Nemirovski. Lectures on Optimization III: Convex Analysis, Nonlinear Programming Theory, Nonlinear Programming Algorithms. SIAM, 2013.
Bertsekas, Nedić, & Ozdaglar. Convex Analysis and Optimization. Athena Scientific, 2003.
Ben-Tal & Nemirovski. Lectures on Modern Convex Optimization. SIAM, 2011.
Nesterov. Introductory Lectures on Convex Optimization: A Basic Course. Kluwer Academic, 2004.
Lectures
Lec1 Introduction
Lec2 Applications of Optimization (read matrix calculus)
Lec3 Matrix Calculus and Norms
Lec4 Basic Concepts
Lec5 Convex Sets
Lec6 Convex Functions I
Lec8 Unconstrained OPT - Gradient Descent I
Lec9 Unconstrained OPT - Gradient Descent II
Lec10 Unconstrained OPT - Gradient Descent III
Lec11 Unconstrained OPT - Armijo's rule
Lec12 Unconstrained OPT - Condition Number and Newton
Lec13 Unconstrained OPT - Quasi-Newton
Lec14 Unconstrained OPT - Subgradient
Lec15 Unconstrained OPT - Backpropagation in Neural Networks
Lec16 Constrained OPT - Optimality conditions & Projected Gradient
Lec17 Constrained OPT - Lagrangian I - KKT conditions
Lec18 Constrained OPT - Lagrangian II - Sufficient conditions
Lec19 Constrained OPT - Duality
Lec20 Constrained OPT - Barrier and Penalty methods
Lec21 Constrained OPT - Augmented Lagrangian
03/03 --- Midterm I
03/15, 03/17 Spring Break
03/21 (Mon.) --- Project proposal due
04/21 --- Midterm II
The following linked files are from last year's course. They may be updated as the course progresses.
Lec1 Introduction and Applications of Optimization [typed-notes]
Lec2 Math Preliminaries
Lec3 Optimality conditions [typed-notes]
Lec4 Convex Set [typed-notes]
Lec5 Convex Function 1 [typed-notes]
Lec6 Convex Function 2 [typed-notes]
Lec7 Gradient Descent Basics and Descent Lemma [typed-notes]
Lec8 Gradient Descent Convergence I : Weak Assumption and Convex Assumption [typed-notes]
Lec9 Gradient Descent Convergence II : Strong Convexity , Convergence Rate [typed-notes]
Lec10 Gradient Descent Convergence III : Adaptive Stepsize and Acceleration [typed-notes]
Lec11 Application I: Neural Networks and Backpropagation [typed-notes]
Lec12 Condition Number and Newton's Method [typed-notes]
Lec13 Projected Gradient Method [typed-notes]
Lec14 Lagrange Multiplier with Equality Constraints [typed-notes]
Lec15 Lagrange Multiplier with Inequality Constraints and KKT Condition [typed-notes]
Lec16 Weak and Strong Duality [typed-notes]
Lec17 Barrier (interior point method) and Penalty Methods
Lec18 Application II: Support Vector Machine [typed-notes]
Lec19 Augmented Lagrangian Method [typed-notes]
Lec20 Frank-Wolfe Method [typed-notes]
Lec21 Subgradient Descent Method [typed-notes]
Lec22 Smoothing and Proximal Point Method
Lec23 Proximal Gradient Method
More detailed written lecture notes (from Spring 2021), courtesy of Siyuan Xu. Thanks!
[part 1, Lec 1 - Lec 9]
[part 2, Lec 10 - Lec 13]
[part 3, Lec 14 - Lec 16]
[part 4, Lec 17 - Lec 20]
[part 5, Lec 21 - Lec 23]
Homeworks
[Homework 1] Math Preliminaries, Optimality Conditions, and Convex Sets
[Homework 2] Convex Function
[Homework 3] Gradient Descent
[Homework 4] Newton's Method and Projected Gradient
[Homework 5] Lagrange Multipliers, KKT, and Duality
[Homework 6] Penalty Methods, Augmented Lagrangian Method, Subgradient Descent and Proximal Gradient