Optimization Techniques (M.Tech CS, 2021-22)

Instructor: Arijit Ghosh

Status: Completed (final lecture was on 23 Jan 2022)

Description: To study the basics of Mathematical Programming

Prerequisites: Mathematical maturity of a finishing undergraduate student in mathematical sciences

Class timings: Mondays 9:25 - 11:10 hrs and Thursdays 11:15 - 13:00 hrs

Syllabus

  • Introduction to Mathematical Programming

  • Linear Programming (LP), Integer Linear Programming (ILP), and Semidefinite Programming (SDP)

  • Basic properties of LP

  • Basics of discrete convex geometry (Helly's Theorem and its relatives) and its applications

  • Vertices and extreme points of convex sets

  • Structural results on polytopes and polyhedra

  • Strong (and weak) duality of Linear Programming and its applications

  • Totally Unimodular Matrices

  • Applications of LP

  • Support Vector Machine and Classification Problems

  • The Simplex Method

  • Perceptron Algorithm (not covered due to lack of time)

  • Fourier–Motzkin Elimination Method (not covered due to lack of time)

  • A brief introduction to Complexity Theory: NP-completeness, NP-hardness, Approximation algorithms, Hardness of approximation, etc.

  • Rounding LP relaxations of ILP to get integral solutions

  • Lagrangian dual function and Lagrangian dual problem

  • Basics of convex geometry and convex functions

  • Convex Programming

  • Descent Methods

  • Gradient Descent Method

  • Projected Gradient Descent Method

  • Conditional Gradient Descent Method

  • Center of Gravity Method (not covered due to lack of time)

  • The Ellipsoid Method

  • Interior Point Method (not covered due to lack of time)

  • Introduction to Conic programming (not covered due to lack of time)

  • SDP and its applications (not covered due to lack of time)

  • Submodular Optimization (not covered due to lack of time)

  • Multiplicative Weight Update method (not covered due to lack of time)

  • Online convex optimization (not covered due to lack of time)

  • Mirror Descent (not covered due to lack of time)

  • Stochastic Gradient Descent (not covered due to lack of time)

References

  • [AS16] Noga Alon and Joel Spencer, The Probabilistic Method, 4th Edition, Wiley, 2016

  • [ALS15] Nina Amenta, Jesús A. De Loera, and Pablo Soberón, Helly's Theorem: New Variations and Applications, 2015

  • [BO97] Imre Bárány and Shmuel Onn, Carathéodory's Theorem: Colourful and Applicable, Bolyai Society Mathematical Studies, 6:11-21, 1997

  • [B10] Alexander Barvinok, Combinatorics of Polytopes, Lecture Notes, University of Michigan, 2010

  • [BT97] Dimitris Bertsimas and John N. Tsitsiklis, Introduction to Linear Optimization, Athena Scientific, 1997

  • [BV04] Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, 2004

  • [B15] Sébastien Bubeck, Convex Optimization: Algorithms and Complexity, Foundations and Trends in Machine Learning, 2015

  • [FV14] Robert M. Freund and Jorge Vera, Fundamentals of Nonlinear Optimization: A Constructive Approach, book in progress

  • [H18] Moritz Hardt, Convex Optimization and Approximation, Lecture Notes, UC Berkeley, 2018

  • [K06] Vladlen Koltun, Advanced Geometric Algorithms, Lecture Notes, Stanford University, 2006

  • [LGM19] Jesús A. De Loera, Xavier Goaoc, Frédéric Meunier, and Nabil H. Mustafa, The Discrete Yet Ubiquitous Theorems of Carathéodory, Helly, Sperner, Tucker, and Tverberg, Bulletin of the American Mathematical Society, 56:415-511, 2019

  • [LY08] David G. Luenberger and Yinyu Ye, Linear and Nonlinear Programming, 3rd Edition, Springer, 2008

  • [MG07] Jiří Matoušek and Bernd Gärtner, Understanding and Using Linear Programming, Springer, 2007

  • [M02] Jiří Matoušek, Lectures on Discrete Geometry, Springer, 2002

  • [MU05] Michael Mitzenmacher and Eli Upfal, Probability and Computing, Cambridge University Press, 2005

  • [Y18] Yurii Nesterov, Lectures on Convex Optimization, Springer, 2018

  • [R97] R. Tyrrell Rockafellar, Convex Analysis, Princeton University Press, 1997

  • [S93] Rolf Schneider, Convex Bodies: The Brunn-Minkowski Theory, Cambridge University Press, 1993

  • [SB18] Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2018

  • [TB97] Lloyd N. Trefethen and David Bau III, Numerical Linear Algebra, SIAM, 1997

  • [V03] Vijay Vazirani, Approximation Algorithms, Springer, 2003

  • [V21] Nisheeth K. Vishnoi, Algorithms for Convex Optimization, Cambridge University Press, 2021

  • [WS11] David P. Williamson and David B. Shmoys, The Design of Approximation Algorithms, Cambridge University Press, 2011

  • [Z95] Günter M. Ziegler, Lectures on Polytopes, Springer-Verlag, 1995

Lecture details

Topics covered in Lecture 1 (4 Oct.'21)

  • Course introduction

  • Introduction to Mathematical Programming

  • Introduction to Unconstrained Optimization

  • Some applications of Unconstrained Optimization:

      • Facility Location problem and Geometric Median problem

      • Minimum Enclosing Ball problem

      • Least Squares problem (see the sketch below)

  • Introduction to Linear Programs (LP) and Integer Linear Programs (ILP)

  • Some definitions used in LPs were discussed:

      • Hyperplanes and halfspaces

      • Convex sets

  • Reading materials:

      • Chapter 1 from [BV04]

      • Chapter 1 from [LY08]

      • Chapter 1 from [MG07]
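
The Least Squares problem above is the canonical unconstrained convex program and one of the few with a closed-form solution (the normal equations). A minimal numerical sketch, assuming numpy (the course itself prescribes no software):

    import numpy as np

    # Synthetic data: fit y ~ X w by minimizing ||X w - y||^2,
    # an unconstrained convex (in fact quadratic) program.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    w_true = np.array([1.0, -2.0, 0.5])
    y = X @ w_true + 0.1 * rng.normal(size=100)

    # The minimizer solves the normal equations X^T X w = X^T y;
    # np.linalg.lstsq computes it via a numerically stable factorization.
    w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(w_hat)  # close to w_true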

Topics covered in Lecture 2 (7 Oct.'21)

  • Recall some definitions related to Mathematical Programming

  • Feasible set, feasible solution, and feasibility question

  • Equivalence between solving LP and LP feasibility question

  • Reducing feasibility question to Unconstrained Optimization question

  • Vertex Cover (VC) problem

  • Maximal matching, and a greedy algorithm to construct a maximal matching

  • 2-factor approximation to the VC problem using maximal matching (see the sketch below)

  • Reading materials:

      • Chapter 1 from [BV04]

      • Chapter 1 from [LY08]

      • Chapter 1 from [V03]
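
A minimal sketch of the 2-factor approximation from this lecture, with illustrative names: greedily build a maximal matching and output all matched endpoints. Any vertex cover must contain at least one endpoint of each matched edge, so the output has size at most twice the optimum.

    def vertex_cover_2_approx(edges):
        """Greedy maximal matching: keep any edge whose endpoints are both
        still unmatched; the matched endpoints form a vertex cover of size
        at most 2 * OPT."""
        cover = set()
        for u, v in edges:
            if u not in cover and v not in cover:
                cover.update((u, v))  # edge (u, v) joins the maximal matching
        return cover

    print(vertex_cover_2_approx([(1, 2), (2, 3), (3, 4), (4, 1)]))  # {1, 2, 3, 4}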

Topics covered in Lecture 3 (18 Oct.'21)

  • ILP formulation for Vertex Cover (VC) problem

  • LP relaxation to the above ILP

  • An alternative 2-factor approximation algorithm for VC via LP relaxation and rounding (see the sketch below)

  • Introduction to Convex Programming

  • Reading materials:

      • Chapter 1 from [LY08]

      • Chapter 1 from [MG07]

      • Section 2.1 from [V03]

      • Section 3.3 from [MG07]
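
A sketch of the relax-and-round algorithm for Vertex Cover, assuming scipy's LP solver (the lecture itself is solver-agnostic):

    import numpy as np
    from scipy.optimize import linprog

    def vc_lp_rounding(n, edges):
        """Solve the LP relaxation  min sum_v x_v  s.t.  x_u + x_v >= 1 per edge,
        0 <= x <= 1,  then round up every x_v >= 1/2. Every edge keeps a covered
        endpoint and the cost at most doubles, giving a 2-approximation."""
        c = np.ones(n)
        A_ub = np.zeros((len(edges), n))
        for i, (u, v) in enumerate(edges):
            A_ub[i, u] = A_ub[i, v] = -1.0  # x_u + x_v >= 1  <=>  -(x_u + x_v) <= -1
        res = linprog(c, A_ub=A_ub, b_ub=-np.ones(len(edges)), bounds=[(0, 1)] * n)
        return {v for v in range(n) if res.x[v] >= 0.5}

    print(vc_lp_rounding(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))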

Topics covered in Lecture 4 (25 Oct.'21)

  • Linear Regression

  • Linear Regression with penalties

  • Support Vector Machine (SVM)

  • LP formulation of a Linear Classifier (see the sketch below)

  • Robust Linear Classifier and its different programming formulations

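A sketch of the LP formulation of a linear classifier from this lecture: separating two point sets is a pure feasibility LP. The fixed margin of 1 is only a normalization, all names are illustrative, and scipy is assumed.

    import numpy as np
    from scipy.optimize import linprog

    def linear_classifier_lp(X_pos, X_neg):
        """Find (w, b) with  w.x + b >= 1  on positives and  w.x + b <= -1  on
        negatives, as a feasibility LP with zero objective. Returns None when
        the two classes are not linearly separable."""
        d = X_pos.shape[1]
        # Variables: (w_1, ..., w_d, b), all free.
        A_ub = np.vstack([
            -np.hstack([X_pos, np.ones((len(X_pos), 1))]),  # -(w.x + b) <= -1
            np.hstack([X_neg, np.ones((len(X_neg), 1))]),   #   w.x + b  <= -1
        ])
        b_ub = -np.ones(len(X_pos) + len(X_neg))
        res = linprog(np.zeros(d + 1), A_ub=A_ub, b_ub=b_ub,
                      bounds=[(None, None)] * (d + 1))
        return (res.x[:d], res.x[d]) if res.success else None

    print(linear_classifier_lp(np.array([[2.0, 2.0]]), np.array([[0.0, 0.0]])))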

Topics covered in Lecture 5 (28 Oct.'21)

  • Revisiting Robust Linear Classifier

  • Equational form of LP

  • Procedure for converting an LP into an equivalent LP in equational form (see the sketch below)

  • Basic feasible solutions (BFS)

  • Reading materials:

      • Sections 4.1 and 4.2 from [MG07]

      • Chapter 2 (except Section 2.5) from [LY08]
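
A sketch of the conversion to equational form for the common case  min c.x  s.t.  A x <= b, x >= 0: append one slack variable per inequality. (Free variables, which must additionally be split as x = x+ - x-, are omitted for brevity.)

    import numpy as np

    def to_equational_form(c, A_ub, b_ub):
        """Rewrite  min c.x  s.t.  A_ub x <= b_ub, x >= 0  as the equational form
        min c'.y  s.t.  A y = b_ub, y >= 0,  via slacks: A x + s = b_ub, s >= 0."""
        m, n = A_ub.shape
        A_eq = np.hstack([A_ub, np.eye(m)])      # [A | I] y = b
        c_eq = np.concatenate([c, np.zeros(m)])  # slack variables cost nothing
        return c_eq, A_eq, b_ub

    c_eq, A_eq, b = to_equational_form(np.array([1.0, 2.0]),
                                       np.array([[1.0, 1.0]]), np.array([3.0]))
    print(A_eq)  # [[1. 1. 1.]]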

Topics covered in Lecture 6 (30 Oct.'21)

  • Revisiting the procedure for converting an LP into an equivalent LP in equational form

  • Simple properties of BFS

  • Fundamental Theorem of Linear Programming (version 1)

  • Reading materials:

      • Sections 4.1 and 4.2 from [MG07]

      • Chapter 2 (except Section 2.5) from [LY08]

Topics covered in Lecture 7 (1 Nov.'21)

  • Finished the proof of the Fundamental Theorem of Linear Programming (version 1)

  • Fundamental Theorem of Linear Programming (version 2)

  • Recall some useful definitions from discrete and convex geometry:

      • Convex sets and convex functions

      • Convex hulls

      • Polyhedra and polytopes

      • Vertices and extreme points of convex sets

  • Reading materials:

      • Chapter 4 from [MG07]

      • Chapter 2 from [LY08]

Topics covered in Lecture 8 (11 Nov.'21)

  • Equivalence of BFS, extreme points, and vertices of LPs in equational form

  • Some useful definitions and simple observations about

      • Convex combinations

      • Affine combinations and affine hull

      • Affine dependence

      • Affine space

  • Reading materials:

      • Chapter 4 from [MG07]

      • Chapter 2 from [LY08]

      • Chapter 1 from [K06]

      • Chapter 1 from [M02]

Topics covered in Lecture 9 (13 Nov.'21)

  • Equivalence between two different definitions of convex hulls (and affine hulls)

  • Radon's Theorem (see the sketch below)

  • Helly's Theorem (finite version)

  • Statement of Colorful Helly Theorem (and its stronger version)

  • Statement of the infinite version of Helly's Theorem

  • Reading materials:

      • Chapter 1 from [K06]

      • Chapter 1 from [M02]

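A numerical sketch of Radon's Theorem, assuming numpy: an affine dependence among d + 2 points in R^d (a null vector of the points stacked over a row of ones) splits them into two parts whose convex hulls intersect.

    import numpy as np

    def radon_partition(points):
        """Split d + 2 (or more) points in R^d into index sets I and J such
        that conv{p_i : i in I} and conv{p_j : j in J} intersect."""
        P = np.asarray(points, dtype=float)
        n, d = P.shape
        assert n >= d + 2, "Radon's Theorem needs at least d + 2 points"
        # Affine dependence: sum_i a_i p_i = 0, sum_i a_i = 0, a != 0;
        # any null vector of the (d+1) x n matrix below qualifies.
        M = np.vstack([P.T, np.ones(n)])
        a = np.linalg.svd(M)[2][-1]  # right-singular vector for the smallest singular value
        I = [i for i in range(n) if a[i] > 0]
        J = [i for i in range(n) if a[i] <= 0]
        return I, J

    print(radon_partition([[0, 0], [1, 0], [0, 1], [1, 1]]))  # the two diagonals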

Topics covered in Lecture 10 (15 Nov.'21)

  • Discussion on the infinite version of Helly's Theorem

  • Definition of Centerpoint, and Rado's Theorem (showing the existence of a centerpoint)

  • Carathéodory's Theorem about convex hulls (see the sketch below)

  • Reading materials:

      • Chapter 1 from [K06]

      • Chapter 1 from [M02]

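Carathéodory's Theorem in computational form, assuming scipy: if x lies in conv(P) in R^d, a basic feasible solution of the feasibility LP below uses at most d + 1 points, because the constraint matrix has d + 1 rows. A simplex-type method is requested so the solver returns a BFS.

    import numpy as np
    from scipy.optimize import linprog

    def caratheodory_weights(points, x):
        """Write x as a convex combination of at most d + 1 of the given points:
        a BFS of  {w >= 0 : sum_i w_i p_i = x, sum_i w_i = 1}  has at most
        d + 1 nonzero coordinates."""
        P = np.asarray(points, dtype=float).T  # d x n
        d, n = P.shape
        A_eq = np.vstack([P, np.ones(n)])
        b_eq = np.concatenate([x, [1.0]])
        res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                      bounds=[(0, None)] * n, method="highs-ds")
        return res.x  # at most d + 1 entries are nonzero

    w = caratheodory_weights([[0, 0], [2, 0], [0, 2], [2, 2]], np.array([1.0, 1.0]))
    print(np.round(w, 3))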

Topics covered in Lecture 11 (18 Nov.'21)

  • Hyperplane Separation Theorem (point and convex body case)

  • Reading materials:

      • Chapter 2 from [BV04]

      • Chapter 2 from [BT97]

Topics covered in Lecture 12 (20 Nov.'21)

  • Introduction to Strong Alternatives

  • Gaussian Elimination through the lens of strong alternatives

  • Definitions and simple properties of Cones and Conic Hulls

  • Farkas Lemma (Variant 1) from the Hyperplane Separation Theorem (point and convex body case); see the sketch below

  • Reading materials:

      • Chapter 6 (except Sections 6.3, 6.6, and 6.7) from [MG07]

      • Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]

      • Chapter 4 from [BT97]
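
Farkas Lemma (Variant 1) is a strong alternative and can be witnessed numerically: exactly one of the two systems below is solvable. A sketch assuming scipy; the box bound on y only keeps the certificate search bounded, since certificates scale freely.

    import numpy as np
    from scipy.optimize import linprog

    def farkas(A, b):
        """Either (i) some x >= 0 satisfies A x = b, or (ii) some y satisfies
        A^T y >= 0 and b.y < 0 -- never both. Return whichever witness exists."""
        m, n = A.shape
        feas = linprog(np.zeros(n), A_eq=A, b_eq=b, bounds=[(0, None)] * n)
        if feas.status == 0:  # system (i) is feasible
            return ("solution x", feas.x)
        # Otherwise search for a certificate: minimize b.y subject to A^T y >= 0.
        alt = linprog(b, A_ub=-A.T, b_ub=np.zeros(n), bounds=[(-1, 1)] * m)
        return ("certificate y", alt.x)  # b.y < 0 at this y

    print(farkas(np.eye(2), np.array([-1.0, 2.0])))  # (i) infeasible, yields y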

Topics covered in Lecture 13 (22 Nov.'21)

  • Recall some properties of conic hulls and their proofs

  • Linear Programming (LP) Duality

  • Weak Duality Theorem for LP

  • Equivalence between Strong Duality of LP and Farkas Lemma (Variant 1)

  • Reading materials:

      • Chapter 6 (except Sections 6.3, 6.6, and 6.7) from [MG07]

      • Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]

      • Chapter 4 from [BT97]

Topics covered in Lecture 14 (25 Nov.'21)

  • Statement of Farkas Lemma (Variant 2)

  • Strong Duality Theorem for LP (see the numerical check below)

  • Reading materials:

      • Chapter 6 (except Sections 6.3, 6.6, and 6.7) from [MG07]

      • Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]

      • Chapter 4 from [BT97]
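
Strong duality can be illustrated numerically on a small primal-dual pair (a sketch assuming scipy; linprog minimizes, so the dual objective is negated):

    import numpy as np
    from scipy.optimize import linprog

    # Primal:  min c.x  s.t.  A x >= b, x >= 0
    # Dual:    max b.y  s.t.  A^T y <= c, y >= 0
    c = np.array([2.0, 3.0])
    A = np.array([[1.0, 1.0], [1.0, 2.0]])
    b = np.array([3.0, 4.0])

    primal = linprog(c, A_ub=-A, b_ub=-b)  # signs flipped to encode A x >= b
    dual = linprog(-b, A_ub=A.T, b_ub=c)
    print(primal.fun, -dual.fun)           # both 7.0: no duality gap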

Topics covered in Lecture 15 (11 Dec.'21)

  • Discussed mid-sem question paper

Topics covered in Lecture 16 (11 Dec.'21)

  • Complementary slackness and optimality

  • Clark's Theorem via Farkas Lemma

  • Reading materials:

      • Chapter 6 (except Section 6.6) from [MG07]

      • Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]

      • Chapter 4 from [BT97]

Topics covered in Lectures 17 and 18 (13 Dec.'21)

  • Introduction to Simplex Method via an example

  • Simplex method: handling unboundedness

  • Simplex method: handling "degeneracy" and "cycling"

  • Optimality conditions for LP in equational form

  • Reading materials:

      • Chapter 5 from [MG07]

      • Chapter 3 from [LY08]

      • Chapter 3 from [BT97]

Topics covered in Lecture 19 (16 Dec.'21)

  • The general mathematical framework behind the Simplex Method

  • Reading materials:

      • Chapter 5 from [MG07]

      • Chapter 3 from [LY08]

      • Chapter 3 from [BT97]

Topics covered in Lecture 20 (20 Dec.'21)

  • Recap of the Simplex Method

  • Proof of correctness of the Simplex Method

  • Reading materials:

      • Chapter 5 from [MG07]

      • Chapter 3 from [LY08]

      • Chapter 3 from [BT97]

Topics covered in Lecture 21 (27 Dec.'21)

  • Finding a BFS using the Simplex Method

  • Time complexity of performing a single pivot step

  • Different pivoting rules

  • Reading materials:

      • Chapter 5 from [MG07]

      • Chapter 3 from [LY08]

      • Chapter 3 from [BT97]

Topics covered in Lecture 22 (6 Jan.'22)

  • Bland's rule for selecting a pivot in the Simplex Method (see the sketch below)

  • Necessary and sufficient criteria for the existence of a vertex in a polyhedron

  • Fundamental Theorem of Linear Programming (version 3)

  • Relations between different variants of the Fundamental Theorem of Linear Programming

  • Relation between Matching and Vertex Cover in general graphs

  • König's Theorem for bipartite graphs (only statement for now)

  • Reading materials:

      • Chapter 5 from [MG07]

      • Chapter 3 from [LY08]

      • Chapter 3 from [BT97]

      • Chapter 4 from [BT97]
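
A compact tableau implementation of the Simplex Method with Bland's rule, tying together Lectures 17-22. It assumes the easy case b >= 0, where the slack variables provide an initial BFS; a teaching sketch, not a robust solver.

    import numpy as np

    def simplex(c, A, b):
        """Tableau simplex for  max c.x  s.t.  A x <= b, x >= 0  with b >= 0.
        Bland's smallest-index rule picks the entering and leaving variables,
        which provably prevents cycling."""
        m, n = A.shape
        T = np.hstack([A, np.eye(m), b.reshape(-1, 1)]).astype(float)
        z = np.concatenate([-c, np.zeros(m + 1)])  # reduced costs + objective value
        basis = list(range(n, n + m))              # start from the slack basis
        while True:
            entering = [j for j in range(n + m) if z[j] < -1e-9]
            if not entering:
                break                              # no negative reduced cost: optimal
            j = min(entering)                      # Bland's rule: smallest index
            ratios = [(T[i, -1] / T[i, j], basis[i], i)
                      for i in range(m) if T[i, j] > 1e-9]
            if not ratios:
                raise ValueError("LP is unbounded")
            _, _, r = min(ratios)                  # min ratio, ties by smallest variable
            T[r] /= T[r, j]                        # pivot on (r, j)
            for i in range(m):
                if i != r:
                    T[i] -= T[i, j] * T[r]
            z -= z[j] * T[r]
            basis[r] = j
        x = np.zeros(n + m)
        for i, bi in enumerate(basis):
            x[bi] = T[i, -1]
        return x[:n], z[-1]

    # max x1 + x2  s.t.  x1 + 2 x2 <= 4,  3 x1 + x2 <= 6: optimum 14/5 at (8/5, 6/5)
    print(simplex(np.array([1.0, 1.0]),
                  np.array([[1.0, 2.0], [3.0, 1.0]]),
                  np.array([4.0, 6.0])))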

Topics covered in Lecture 23 (8 Jan.'22)

  • Integral Polyhedra

  • Totally Unimodular Matrices (TUM) and their connection to Integral Polyhedra (see the check below)

  • Fractional Vertex Cover and Maximum Matching

  • Reading materials:

      • Section 8.2 from [MG07]

      • Wikipedia article on König's Theorem
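
Total unimodularity can be verified by brute force on small examples, since a matrix is TUM exactly when every square submatrix has determinant -1, 0, or 1. A sketch (exponential time, so only for tiny teaching instances):

    import numpy as np
    from itertools import combinations

    def is_totally_unimodular(A):
        """Check every square submatrix of A for a determinant in {-1, 0, 1}."""
        m, n = A.shape
        for k in range(1, min(m, n) + 1):
            for rows in combinations(range(m), k):
                for cols in combinations(range(n), k):
                    if round(np.linalg.det(A[np.ix_(rows, cols)])) not in (-1, 0, 1):
                        return False
        return True

    # Edge-vertex incidence matrix of the 4-cycle, a bipartite graph: TUM,
    # which is why its matching and vertex cover polyhedra are integral.
    A = np.array([[1, 1, 0, 0],
                  [0, 1, 1, 0],
                  [0, 0, 1, 1],
                  [1, 0, 0, 1]])
    print(is_totally_unimodular(A))  # True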

Topics covered in Lecture 24 (8 Jan.'22)

  • Proof of König's Theorem via Fundamental Theorem of Linear Programming and TUM

  • Lagrangian Multipliers

  • Dual function and its properties (see the worked example below)

  • Weak duality

  • Reading materials:

      • Section 8.2 from [MG07]

      • Wikipedia article on König's Theorem

      • Chapter 5 from [BV04]

      • Chapter 5 from [V21]
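
A worked example of the dual function for the equational-form LP studied earlier (standard material, cf. Chapter 5 of [BV04]): for  min c^T x  s.t.  A x = b, x >= 0,  the Lagrangian and dual function are

    L(x, y) = c^{\top} x + y^{\top} (b - A x)

    g(y) = \inf_{x \ge 0} L(x, y)
         = b^{\top} y + \inf_{x \ge 0} \, (c - A^{\top} y)^{\top} x
         = \begin{cases} b^{\top} y & \text{if } A^{\top} y \le c, \\ -\infty & \text{otherwise.} \end{cases}

Maximizing g recovers the familiar LP dual  max { b^T y : A^T y <= c },  and weak duality is immediate: for any primal-feasible x the term y^T (b - A x) vanishes, so g(y) <= L(x, y) = c^T x.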

Topics covered in Lecture 25 (10 Jan.'22)

  • Lagrangian dual program

  • Complementary slackness

  • Duality gap, and KKT conditions

  • Reading materials:

      • Chapter 5 from [BV04]

      • Chapter 5 from [V21]

Topics covered in Lecture 26 (10 Jan.'22)

  • Statement of the first-order characterization of convex functions

  • KKT conditions and strong duality of convex programs

  • Statement of hyperplane separation of convex sets

  • Statement of Slater's Theorem on the strong duality of convex programs

  • Reading materials:

      • Chapter 5 from [BV04]

      • Chapter 5 from [V21]

      • Chapter 2 from [BV04] (except Sections 2.4 and 2.6)

      • Sections 3.1 and 3.2 in Chapter 3 from [BV04]

Topics covered in Lecture 27 (14 Jan.'22)

  • Recall first-order characterization of convex functions

  • Statement of second-order characterization of convex functions

  • Proof of Slater's Theorem

  • Reading materials:

      • Chapter 5 from [BV04]

      • Chapter 5 from [V21]

Topics covered in Lecture 28 (15 Jan.'22)

  • Basic properties of convex functions

  • First-order and second-order characterizations of strictly and strongly convex functions

  • Characterization of optimal solutions to convex programs

  • Convex programs with affine constraint

  • Introduction to Descent Methods

  • Reading materials:

      • Sections 3.1 and 3.2 in Chapter 3 from [BV04]

      • Chapter 4 from [BV04]

      • Section 9.2 from [BV04]

  • Additional reading materials:

      • Chapter 9 from [BV04]

Topics covered in Lecture 29 (17 Jan.'22)

  • Introduction to Gradient Descent Method

  • Step size and stopping criteria

  • L-Lipschitz function and its properties

  • Projected Gradient Descent Method for constrained programs (see the sketch below)

  • Convergence rate of gradient descent method for L-Lipschitz objective functions

  • Reading materials:

      • Section 3.1 from [B15]

      • Sections 9.1, 9.2, and 9.3 from [BV04]

  • Additional reading materials:

      • Chapter 9 from [BV04]

      • Chapter 3 from [B15]
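
A sketch of the Projected Gradient Descent Method from this lecture; the example, names, and constant step size are illustrative (a step of at most 1/β matches the smooth analysis in [B15]).

    import numpy as np

    def projected_gradient_descent(grad, project, x0, step, iters=500):
        """Iterate  x_{k+1} = P_C(x_k - step * grad(x_k)),  where `project`
        is the Euclidean projection onto the feasible set C."""
        x = np.asarray(x0, dtype=float)
        for _ in range(iters):
            x = project(x - step * grad(x))
        return x

    # Example: minimize ||x - a||^2 over the unit ball; projection rescales.
    a = np.array([2.0, 2.0])
    grad = lambda x: 2 * (x - a)
    project = lambda x: x / max(1.0, np.linalg.norm(x))
    print(projected_gradient_descent(grad, project, np.zeros(2), step=0.25))
    # converges to a / ||a|| = (0.707..., 0.707...)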

Topics covered in Lecture 30 (19 Jan.'22)

  • β-smooth functions and their properties

  • α-strongly convex functions and their properties

  • Convergence rate of gradient descent method for β-smooth convex functions

  • Reading materials:

      • Section 3.2 from [B15]

      • Section 9.3 from [BV04]

  • Additional reading materials:

      • Chapter 9 from [BV04]

      • Chapter 3 from [B15]

Topics covered in Lecture 31 (20 Jan.'22)

  • Recall properties of β-smooth functions and α-strongly convex functions

  • Convergence rate of gradient descent method for β-smooth and α-strongly convex functions

  • Introduction to the Conditional Gradient Descent Method for constrained programs (see the sketch below)

  • Reading materials:

      • Sections 3.3 and 3.4 from [B15]

      • Section 9.3 from [BV04]

  • Additional reading materials:

      • Chapter 9 from [BV04]

      • Chapter 3 from [B15]
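
A sketch of the Conditional Gradient (Frank-Wolfe) Method introduced above, assuming numpy; the linear minimization oracle and the step size 2/(t+2) follow the treatment in Section 3.3 of [B15].

    import numpy as np

    def frank_wolfe(grad, lmo, x0, iters=200):
        """Conditional gradient descent: at step t, minimize the linearization
        over the feasible set via the linear minimization oracle `lmo`, then
        move toward that vertex. Projection-free, unlike projected GD."""
        x = np.asarray(x0, dtype=float)
        for t in range(iters):
            s = lmo(grad(x))                 # argmin over C of <grad(x), s>
            x += (2.0 / (t + 2)) * (s - x)
        return x

    # Example: minimize ||x - a||^2 over the probability simplex.
    a = np.array([0.2, 0.5, 0.3])
    grad = lambda x: 2 * (x - a)
    def lmo(g):
        s = np.zeros_like(g)
        s[np.argmin(g)] = 1.0                # best vertex of the simplex
        return s
    print(frank_wolfe(grad, lmo, np.ones(3) / 3))  # converges to a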

Topics covered in Lecture 32 (23 Jan.'22)

  • Further discussion on the convergence rate of gradient descent method for β-smooth and α-strongly convex functions

  • Convergence rate of conditional gradient descent method for constrained programs with β-smooth convex objective function

  • Reading materials:

      • Sections 3.3 and 3.4 from [B15]

      • Section 9.3 from [BV04]

  • Additional reading materials:

      • Chapter 9 from [BV04]

      • Chapter 3 from [B15]

Topics covered in Lecture 33 (23 Jan.'22)

  • Separating oracle for convex sets

  • Computational cost of implementing separating oracles for LPs

  • Ellipsoid algorithm and its convergence rate (see the sketch below)

  • Reading materials:

      • Section 2.2 from [B15]

  • Additional reading materials:

      • Section 7.1 from [MG07]

      • Chapter 8 from [BT97]

      • Section 5.3 from [LY08]
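
A sketch of the ellipsoid method for feasibility via a separating oracle, as discussed in this final lecture (the instance, the starting ball, and the iteration cap are illustrative; assumes numpy):

    import numpy as np

    def ellipsoid_feasibility(oracle, c, P, iters=200):
        """`oracle(x)` returns None if x is feasible, else a vector a with the
        feasible set contained in {y : a.y <= a.x}. Starting from the ellipsoid
        {x : (x - c)^T P^{-1} (x - c) <= 1}, each central cut shrinks the
        volume by a factor of about e^{-1/(2(n+1))}."""
        n = len(c)
        for _ in range(iters):
            a = oracle(c)
            if a is None:
                return c                                   # feasible center found
            Pa = P @ a
            c = c - Pa / ((n + 1) * np.sqrt(a @ Pa))       # move into the kept half
            P = (n * n / (n * n - 1.0)) * (P - (2.0 / (n + 1)) * np.outer(Pa, Pa) / (a @ Pa))
        return None

    # Example: find a point with x1 + x2 >= 1 inside the box [0, 2]^2.
    def oracle(x):
        if x[0] + x[1] < 1: return np.array([-1.0, -1.0])
        if x[0] < 0:        return np.array([-1.0, 0.0])
        if x[1] < 0:        return np.array([0.0, -1.0])
        if x[0] > 2:        return np.array([1.0, 0.0])
        if x[1] > 2:        return np.array([0.0, 1.0])
        return None

    print(ellipsoid_feasibility(oracle, c=np.zeros(2), P=25.0 * np.eye(2)))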