Optimization Techniques (M. Stat. 1 year, 2021)

Instructor Arijit Ghosh

Status Completed (The final class was on 12 May 2021)

Description To study the basics of Mathematical Programming.

Prerequisites Mathematical maturity of a finishing undergraduate student in mathematical sciences.

Class timings Mondays and Fridays 16:10 - 18:05 hrs.

Syllabus

Introduction to Mathematical Programming
Linear Programming (LP), Integer Linear Programming (ILP), and Semidefinite Programming (SDP)
Basic properties of LP
Basics of discrete convex geometry (Helly's Theorem and its relatives) and its applications
Vertices and extreme points of convex sets
Structural results on polytopes and polyhedra
Strong (and weak) duality of Linear Programming and its applications
The Simplex Method
Applications of LP
Support Vector Machine and Classification Problems
Perceptron Algorithm
A brief introduction to Complexity Theory: NP-completeness, NP-hardness, Approximation algorithms, Hardness of approximation, etc.
Rounding LP relaxations of ILP to get integral solutions
Lagrangian dual function and Lagrangian dual problem
Basics of convex geometry and convex functions
Convex Programming
Descent Methods
Center of Gravity Method
The Ellipsoid Method
Interior Point Method (not covered due to lack of time)
Introduction to Conic programming (not covered due to lack of time)
SDP and its applications (not covered due to lack of time)
Submodular optimization (not covered due to lack of time)
Multiplicative Weight Update method (not covered due to lack of time)
Online convex optimization (not covered due to lack of time)
Mirror Descent (not covered due to lack of time)
Stochastic Gradient Descent (not covered due to lack of time)

References

[AS16] Noga Alon and Joel Spencer, The Probabilistic Method, 4th Edition, Wiley, 2016.
[BO97] Imre Barany and Shmuel Onn, Caratheodory's Theorem: Colourful and Applicable, Bolyai Society Mathematical Studies, 6:11 - 21, 1997.
[B10] Alexander Barvinok, Combinatorics of Polytopes, Lecture Notes, Univ. Michigan, 2010.
[BT97] Dimitris Bertsimas and John N. Tsitsiklis, Introduction to Linear Optimization, Athena Scientific, Belmont, Massachusetts, 1997.
[BV04] Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
[B15] Sébastien Bubeck, Convex Optimization: Algorithms and Complexity, Foundations and Trends in Machine Learning, 2015.
[K06] Vladlen Koltun, Advanced Geometric Algorithms, Lecture Notes, Stanford University, 2006.
[LGM19] Jesus de Loera, Xavier Goaoc, Frederic Meunier and Nabil Mustafa, The Discrete Yet Ubiquitous Theorems of Caratheodory, Helly, Sperner, Tucker and Tverberg, Bulletin of the American Mathematical Society, 56: 415-511, 2019.
[LY08] David G. Luenberger and Yinyu Ye, Linear and Nonlinear Programming, 3rd Edition, Springer, 2008.
[MG07] Jiří Matoušek and Bernd Gärtner, Understanding and Using Linear Programming, Springer, 2007.
[M02] Jiří Matoušek, Lectures on Discrete Geometry, Springer, 2002.
[R97] R. Tyrrell Rockafellar, Convex Analysis, Princeton University Press, 1997.
[S93] Rolf Schneider, Convex Bodies: The Brunn-Minkowski Theory, Cambridge University Press, 1993.
[SB18] Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge Univ. Press, 2018.
[MU05] Michael Mitzenmacher and Eli Upfal, Probability and Computing, Cambridge University Press, 2005.
[WS11] David P. Williamson and David B. Shmoys, The Design of Approximation Algorithms, Cambridge University Press, 2011.
[TB97] Lloyd N. Trefethen and David Bau III, Numerical Linear Algebra, SIAM, 1997.
[V03] Vijay Vazirani, Approximation Algorithms, Springer, 2003.
[Y18] Yurii Nesterov, Lectures on Convex Optimization, Springer, 2018.
[Z95] Günter M. Ziegler, Lectures on Polytopes, Springer-Verlag, 1995.

Lecture details

Topics covered in Lectures 1 (25 Jan'21)

Introduction to Mathematical programming
Introduction to Linear Programming and Integer Linear Programming
Linear Programming formulation of the largest ball inside a Polytope Problem
Vertex Cover: ILP formulation, LP relaxation, and 2-factor approximation via rounding
Greedy 2-factor approximation scheme to Vertex Cover problem
Reading materials:
- - Chapter 1 from [LY08]
  - Chapter 1 from [MG07]
  - Section 2.1 from [V03]
  - Sections 2.5, 3.3 and 3.4 from [MG07]

Topics covered in Lectures 2 (29 Jan'21)

Introduction to Unconstrained Optimization
Some applications of Unconstrained Optimization:
- - Least square problems
  - Simple regression
  - Geometric median problem
  - Minimum enclosing ball problem
Introduction to Convex Programming
Some applications of Convex Programming:
- - Linear Programming
  - Linear regression with L0, L1, and L2 penalties
  - Support vector machine (SVM)
Reading materials:
- - Chapter 1 from [BV04]
  - Chapter 1 from [LY08]
  - Section 8.6 from [BV04]
  - Wikipedia entry on Linear regression: https://en.wikipedia.org/wiki/Linear_regression
  - Wikipedia entry on SVM: https://en.wikipedia.org/wiki/Support-vector_machine

Topics covered in Lectures 3 (1 Feb'21)

Equational form of LP
Procedure/algorithm for converting LP into an equivalent Equational form
Basic feasible solutions (BFS)
Simple properties of BFS
Fundamental Theorem of Linear Programming (version 1)
References:
- - Chapter 2 (except Section 2.5) from [LY08]
  - Sections 4.1 and 4.2 from [MG07]

Topics covered in Lecture 4 (5 Feb'21)

Fundamental Theorem of Linear Programming (version 2)
Vertices and extreme points of convex sets
Equivalence of BFS, extreme points, and vertices of LPs in equational form
Reading materials:
- - Chapter 4 of [MG07]
  - Chapter 2 of [LY08]

Topics covered in Lecture 5 (8 Feb'21)

Affine combination and affine independence
Convex set and convex hull
An equivalent definition of convex hulls
Radon's theorem
Helly's theorem
Finite intersection property, and infinite version of Helly's theorem
Definition of the center point, and the statement of Rado's theorem
Reading materials:
- - Chapter 1 from [K06]
  - Chapter 1 from [M02]
  - http://www.math.utah.edu/~treiberg/HellySlides.pdf

Topics covered in Lecture 6 (12 Feb'21)

Rado's center theorem
Carathéodory's theorem
Revisiting infinite version of Helly's theorem
Reading materials:
- - Chapter 1 from [K06]
  - Chapter 1 from [M02]
  - http://www.math.utah.edu/~treiberg/HellySlides.pdf

Topics covered in Lecture 7 (15 Feb'21)

Equivalence of BFS, extreme points, and vertices of polyhedrons
Definition of a face of a polyhedron
Extreme points of a face
Bounded polyhedrons are polytopes
Reading materials:
- - Chapter 2 of [LY08]
  - Chapter 2 from [BT97]
  - Chapter 2 from [Z95]

Topics covered in Lecture 8 (19 Feb'21)

Hyperplane Separation Theorem (point and convex body case)
Polar of a set and its properties
Polytopes are bounded polyhedron via polar of a convex set
Reading materials:
- - Chapter 2 from [BV04]
  - Chapter 2 from [BT97]
  - Chapter 2 from [Z95]

Topics covered in Lecture 9 (22 Feb'21)

Convex cone
Recession cone of a Polyhedron
Minkowski sum of two sets
Pointed Polyhedrons
Weyl-Minkowski Theorem for Polyhedrons
Fundamental Theorem for Linear Programming for Pointed Polyhedrons (version 3)
Reading:
- - Chapters 1 and 2 from [Z95]
  - Chapter 9 from [B10]

Topics covered in Lecture 10 (26 Feb'21)

Revisiting convex cone
Weak Duality for Linear Programming
Strong Alternatives from Gaussian Elimination
Farkas Lemma (variant 1) via Hyperplane Separation Theorem (point and convex body case)
Farkas Lemma (variant 2)
Strong Duality for Linear Programming using Farkas Lemma (variant 2)
Reading:
- - Chapter 6 (except Section 6.6) from [MG07]
  - Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]
  - Chapter 4 from [BT97]

Topics covered in Lecture 11 (27 Feb'21)

Revisiting Farkas Lemma (variant 2)
Equivalence between Strong Duality of LP and Farkas Lemma
Complementary slackness and optimality
Clark's Theorem via Farkas Lemma (variant 3)
Reading:
- - Chapter 6 (except Section 6.6) from [MG07]
  - Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]
  - Chapter 4 from [BT97]

Topics covered in Lectures 12 and 13 (1 March'21 and 5 March'21)

Introduction to Simplex Method via an example
The general mathematical framework behind the Simplex method
Things not yet covered:
- - Finding the starting BFS
  - Different pivot rules
  - Klee-Minty cubes
Reading:
- - Chapter 3 from [BT97]
  - Chapter 5 from [MG07]
  - Chapter 3 from [LY08]

Topics covered in Lectures 14 and 15 (2 April'21)

Recap Simplex Method
Proof of correctness
Finding the initial (BFS) to start the simplex method
Reading:
- - Chapter 3 from [BT97]
  - Chapter 5 from [MG07]
  - Chapter 3 from [LY08]

Topics covered in Lecture 16 (3 April'21)

Finding the initial (BFS) to start the simplex method
Different pivoting rules for the simplex method
Cycling in simplex method
Bland's rule for selecting a pivot
Proof of termination of Simplex method using Bland's rule
Reading:
- - Chapter 3 from [BT97]
  - Chapter 5 from [MG07]
  - Chapter 3 from [LY08]

Topics covered in Lecture 17 (5 April'21)

Lagrangian method
Dual function and its properties
Lagrange dual program
Weak duality
Complementary slackness of general mathematical programs
KKT conditions
Reading:
- - Chapter 5 from [BV04]

Topics covered in Lecture 18 (7 April'21)

Statement of hyperplane separation of convex sets
Statement of the first-order characterization of convex functions
KKT conditions imply strong duality for convex programs
Slater's conditions and strong duality of convex programs
Reading:
- - Chapter 5 from [BV04]

Topics covered in Lecture 19 (12 April'21)

Basic properties of convex functions
Statements of first-order and second-order characterizations of convex functions
Convexity preserving rules for convex functions
Basic properties of convex optimization programs
Characterization of optimal solutions to convex programs
Reading:
- - Chapter 2 from [BV04] (except Sections 2.4 and 2.6)
  - Sections 3.1 and 3.2 in Chapter 3 from [BV04]
  - Chapter 7 from [LY08]

Topics covered in Lecture 20 (18 April'21)

Convex programs on affine space
Maximizing convex functions over convex domains
Proofs of first-order and second-order characterizations of convex functions
Reading:
- - Chapter 2 from [BV04] (except Sections 2.4 and 2.6)
  - Sections 3.1 and 3.2 in Chapter 3 from [BV04]
  - Chapter 7 from [LY08]

Topics covered in Lecture 21 (19 April'21)

SVM through the lens of Farkas Lemma and Lagrangian Method
Readings:
- - Section 8.6 in Chapter 8 from [BV04]

Topics covered in Lecture 22 (26 April'21)

Revisiting SVM, and separating sets using low degree polynomials
Proof of supporting hyperplane theorem for convex bodies
Proof of hyperplane separating theorem for convex bodies
Readings:
- - Section 8.6 in Chapter 8 from [BV04]
  - Chapter 2 from [BV04]

Topics covered in Lecture 23 (28 April'21)

Introduction to subgradients
Center of gravity algorithm
Readings:
- - Chapter 1 from [B15]
  - Section 2.1 in Chapter 2 from [B15]
Additional readings:
- - Section 6.7 in Chapter 6 from [B15]
  - Bertsimas and Vempala's paper

Topics covered in Lecture 24 (30 April'21)

Existence of subgradients for convex functions
Definitions and properties of the following functions:
- - L-Lipschitz function
  - Beta-smooth function
  - Alpha-strongly convex function
Projected subgradient method
Rate of convergence of projected subgradient method (simple case)
Readings:
- - Chapters 1 and 3 from [B15]

Topics covered in Lecture 25 (7 May'21)

Some more properties of beta-smooth convex functions
Introduction to gradient descent method
Convergence of gradient descent method for beta-smooth convex functions
Introduction to Frank-Wolfe method
Convergence of Frank-Wolfe method for beta-smooth function
Readings:
- - Chapter 3 from [B15]
Additional readings:
- - Chapter 9 from [BV04]

Topics covered in Lecture 26 (8 May'21)

Convergence of projected gradient method (PGM) for beta-smooth convex functions
Some more properties of alpha-strongly convex functions
Convergence of PGM for beta-smooth alpha-strongly convex functions
Convergence of gradient descent method for beta-smooth alpha-strongly convex functions
Readings:
- - Chapter 3 from [B15]
Additional readings:
- - Chapter 9 from [BV04]

Topics covered in Lecture 27 (12 May'21)

Separating oracle for convex sets
Ellipsoid algorithm and its convergence
Computational cost of Ellipsoid algorithm for solving LPs
Perceptron algorithm and its convergence
Readings:
- - Section 2.2 in Chapter 2 from [B15]
  - Section 9.1 in Chapter 9 from [SB18]