Course Information:

When: TTh 3:00pm - 4:15pm (EST)

Where: Lawson Computer Science Bldg 1106, or via Zoom (due to the pandemic):

https://purdue-edu.zoom.us/j/9874334092?pwd=MVRVcHhjQXFac0VYUHcwUzdHK2tFUT09

Instructor: Pan Li

Office Hours: Tuesday, 3:30 pm - 4:30 pm, LWSN 3154L

TA: Shu Li, li3916@purdue.edu

Office Hours: Wednesday, 2:00 pm - 3:00 pm, LWSN 3151 #9

Content:

This course is an introductory course on optimization.

Topics include (tentative):

  1. Applications: least squares (example: stock market), least squares with regularization (example: gene regulatory networks), SVMs (example: image and text classification), neural networks (examples: image classification, self-driving cars, speech recognition), power allocation in wireless networks, Internet congestion control, and the Netflix recommendation problem

  2. Inf vs. min, sup vs. max, global vs. local optimality, existence of optimal solutions and the Weierstrass extreme value theorem, convex sets, convex and concave functions, global and local minima for convex functions, uniqueness for strictly convex functions, tests for convex/concave and strongly convex/concave functions using the Hessian matrix, gradient-based characterization of convex and concave functions

  3. First-order necessary and second-order sufficient conditions for local minimum

  4. Gradient descent, the descent lemma, convergence analysis for a fixed step size for functions with Lipschitz-continuous gradients, for convex functions, and for strongly convex functions; the role of the condition number and a quadratic objective example (least squares is a special case); variable step sizes, Armijo's rule and its convergence; applications to neural networks and the backpropagation algorithm (see the gradient-descent sketch after this list)

  5. Newton’s method and improvement in convergence rate locally; problems with implementing Newton’s method and some fixes

  6. Constrained optimization, projected gradient descent, and its convergence for a fixed step size

  7. Constrained optimization with equality and inequality constraints; Lagrange multipliers, the KKT necessary conditions and the KKT sufficient conditions; application to power control in wireless networks and the waterfilling solution; sensitivity and Lagrange multipliers

  8. Penalty and barrier function methods, and their convergence

  9. Duality: the primal and dual problems; weak and strong duality and Slater's condition; the minimax characterization; linear programming and the economic interpretation of the dual; applications of strong duality to SVMs

  10. Subgradients, subgradient descent and its convergence

  11. Proximal gradient descent and its application to sparse regression (leading to the ISTA algorithm for the ℓ1-regularized least squares problem; see the sketch after this list)

  12. Augmented Lagrangian methods (aka method of multipliers)
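
To make topic 4 concrete, here is a minimal sketch (not course code; the function and variable names are illustrative only) of gradient descent with a fixed step size on the least-squares objective f(x) = 0.5*||Ax - b||^2. The step size 1/L, with L the largest eigenvalue of A^T A, is the Lipschitz-constant rule discussed in the lectures.

import numpy as np

def gradient_descent_least_squares(A, b, num_iters=500, tol=1e-8):
    # Fixed step size 1/L, where L = lambda_max(A^T A) is the Lipschitz
    # constant of grad f(x) = A^T (A x - b).
    L = np.linalg.eigvalsh(A.T @ A).max()
    step = 1.0 / L
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        grad = A.T @ (A @ x - b)
        if np.linalg.norm(grad) < tol:   # stop once the gradient is (nearly) zero
            break
        x = x - step * grad
    return x

On a small random instance this returns essentially the same minimizer as numpy.linalg.lstsq, since both solve the same least-squares problem.

Similarly, for topic 11, a minimal sketch (again with illustrative names) of the proximal gradient / ISTA iteration for min_x 0.5*||Ax - b||^2 + lam*||x||_1:

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1 (componentwise shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam, num_iters=1000):
    L = np.linalg.eigvalsh(A.T @ A).max()   # Lipschitz constant of the smooth part
    step = 1.0 / L
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        grad = A.T @ (A @ x - b)                         # gradient step on the smooth term
        x = soft_threshold(x - step * grad, lam * step)  # shrinkage (proximal) step
    return x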


Prerequisites

Students are expected to have the following background:

  • Knowledge of basic computer science principles, sufficient to write a reasonably non-trivial computer program (CS 15900, CS 18200, CS 25100);

  • Familiarity with basic probability theory (MA 41600);

  • Familiarity with linear algebra (MA 26200, MA 26500);

  • Familiarity with multivariable calculus (MA 26100).

Evaluation

Homework: 4% * 5 = 20%

Programming: 6% * 2 = 12%

Two midterms: 20% * 2 = 40%

One course project: 28% = 5% proposal (team, topic, references) + 8% presentation (15 minutes) + 15% report

Self-exercises: 3% (Bonus)

Materials

The most relevant course material:

Some of the course material is covered in the following books:

Lectures

Lec1 Introduction

Lec2 Applications of Optimization (read matrix calculus)

Lec3 Matrix Calculus and Norms

Lec4 Basic Concepts

Lec5 Convex Sets

Lec6 Convex Functions I

Lec7 Convex Functions II

Lec8 Unconstrained OPT - Gradient Descent I

Lec9 Unconstrained OPT - Gradient Descent II

Lec10 Unconstrained OPT - Gradient Descent III

Lec11 Unconstrained OPT - Armijo's rule

Lec12 Unconstrained OPT - Condition Number and Newton

Lec13 Unconstrained OPT - Quasi-Newton

Lec14 Unconstrained OPT - Subgradient

Lec15 Unconstrained OPT - Backpropagation in Neural Networks

Lec16 Constrained OPT - Optimality conditions & Projected Gradient

Lec17 Constrained OPT - Lagrangian I - KKT conditions

Lec18 Constrained OPT - Lagrangian II - Sufficient conditions

Lec19 Constrained OPT - Duality

Lec20 Constrained OPT - Barrier and Penalty methods

Lec21 Constrained OPT - Augmented Lagrangian

03/03 --- Midterm I

03/15, 03/17 --- Spring Break

03/21(Mon.) --- Project proposal due

04/21 --- Midterm II


The following linked files are from last year's course. They may be updated as the course progresses.

Lec1 Introduction and Applications of Optimization [typed-notes]

Lec2 Math Preliminaries

Lec3 Optimality conditions [typed-notes]

Lec4 Convex Set [typed-notes]

Lec5 Convex Function 1 [typed-notes]

Lec6 Convex Function 2 [typed-notes]

Lec7 Gradient Descent Basics and Descent Lemma [typed-notes]

Lec8 Gradient Descent Convergence I: Weak Assumption and Convex Assumption [typed-notes]

Lec9 Gradient Descent Convergence II: Strong Convexity, Convergence Rate [typed-notes]

Lec10 Gradient Descent Convergence III: Adaptive Stepsize and Acceleration [typed-notes]

Lec11 Application I: Neural Networks and Backpropagation [typed-notes]

Lec12 Condition Number and Newton's Method [typed-notes]

Lec13 Projected Gradient Method [typed-notes]

Lec14 Lagrange Multiplier with Equality Constraints [typed-notes]

Lec15 Lagrange Multiplier with Inequality Constraints and KKT Condition [typed-notes]

Lec16 Weak and Strong Duality [typed-notes]

Lec17 Barrier (interior point method) and Penalty Methods

Lec18 Application II: Support Vector Machine [typed-notes]

Lec19 Augmented Lagrangian Method [typed-notes]

Lec20 Frank-Wolfe Method [typed-notes]

Lec21 Subgradient Descent Method [typed-notes]

Lec22 Smoothing and Proximal Point Method

Lec23 Proximal Gradient Method



More polished written lecture notes (from Spring 2021), courtesy of Siyuan Xu. Thanks!

[part 1, Lec 1 - Lec 9]

[part 2, Lec 10 - Lec 13]

[part 3, Lec 14 - Lec 16]

[part 4, Lec 17 - Lec 20]

[part 5, Lec 21 - Lec 23]

Homeworks

[Homework 1] Math Preliminaries, Optimality Conditions, and Convex Sets

[Homework 2] Convex Functions

[Homework 3] Gradient Descent

[Homework 4] Newton's Method and Projected Gradient

[Homework 5] Lagrange Multipliers, KKT, and Duality

[Homework 6] Penalty Methods, Augmented Lagrangian Method, Subgradient Descent and Proximal Gradient