Optimization Techniques (M.Stat, 2022)
Instructor Arijit Ghosh
Status Completed (Final lecture was on 19 May 2022)
Description To study the basics of Mathematical Programming
Prerequisites Mathematical maturity of a finishing undergraduate student in mathematical sciences
Class timings Mondays 10:30 - 12:25 hrs and Thursdays 10:30 - 12:25 hrs
Syllabus
Introduction to Mathematical Programming
Linear Programming (LP), Integer Linear Programming (ILP), and Semidefinite Programming (SDP)
Basic properties of LP
Basics of discrete convex geometry (Helly's Theorem and its relatives) and its applications
Vertices and extreme points of convex sets
Structural results on polytopes and polyhedra
Strong (and weak) duality of Linear Programming and its applications
Unimodular Matrix
Applications of LP
Support Vector Machine and Classification Problems
The Simplex Method
Perceptron Algorithm
Fourier–Motzkin Elimination Method
A brief introduction to Complexity Theory: NP-completeness, NP-hardness, Approximation algorithms, Hardness of approximation, etc.
Rounding LP relaxations of ILP to get integral solutions
Lagrangian dual function and Lagrangian dual problem
Basics of convex geometry and convex functions
Convex Programming
Descent Methods
Gradient Descent Method
Projected Gradient Descent Method
Conditional Gradient Descent Method
Center of Gravity Method
The Ellipsoid Method
Interior Point Method
Introduction to Conic programming (if time permits)
SDP and its applications (if time permits)
Submodular Optimization (if time permits)
Multiplicative Weight Update method (if time permits)
Online convex optimization (if time permits)
Mirror Descent (if time permits)
Stochastic Gradient Descent (if time permits)
References
[AS16] Noga Alon and Joel Spencer, The Probabilistic Method, 4th Edition, Wiley, 2016
[ALS15] Nina Amenta, Jesús A. De Loera, and Pablo Soberón, Helly's Theorem: New Variations and Applications, 2015
[BO97] Imre Barany and Shmuel Onn, Caratheodory's Theorem: Colourful and Applicable, Bolyai Society Mathematical Studies, 6:11 - 21, 1997
[B10] Alexander Barvinok, Combinatorics of Polytopes, Lecture Notes, Univ. Michigan, 2010
[BT97] Dimitris Bertsimas and John N. Tsitsiklis, Introduction to Linear Optimization, Athena Scientific, 1997
[BV04] Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, 2004
[B15] Sébastien Bubeck, Convex Optimization: Algorithms and Complexity, Foundations and Trends in Machine Learning, 2015
[F09] Friedrich Eisenbrand, Introduction to Discrete Optimization, Lecture Notes, EPFL, 2009
[FV14] Robert M. Freund and Jorge Vera, Fundamentals of Nonlinear Optimization: A Constructive Approach, book in progress
[H18] Moritz Hardt, Convex Optimization and Approximation, Lecture Notes, UC Berkeley, 2018
[K06] Vladlen Koltun, Advanced Geometric Algorithms, Lecture Notes, Stanford University, 2006
[LGM19] Jesus de Loera, Xavier Goaoc, Frederic Meunier and Nabil Mustafa, The Discrete Yet Ubiquitous Theorems of Caratheodory, Helly, Sperner, Tucker and Tverberg, Bulletin of the American Mathematical Society, 56: 415-511, 2019
[LY08] David G. Luenberger and Yinyu Ye, Linear and Nonlinear Programming, 3rd Edition, Springer, 2008
[Mpi-10] Nicole Megow, Kurt Mehlhorn and Julián Mestre, Optimization, Lecture Notes, Max Planck Institute for Informatics, 2010.
[MG07] Jiří Matoušek and Bernd Gärtner, Understanding and Using Linear Programming, Springer, 2007
[M02] Jiří Matoušek, Lectures on Discrete Geometry, Springer, 2002
[R97] R. Tyrrell Rockafellar, Convex Analysis, Princeton University Press, 1997
[R20] Thomas Rothvoss, Discrete Optimization, Lecture Notes, University of Washington, 2020
[S93] Rolf Schneider, Convex Bodies: The Brunn-Minkowski Theory, Cambridge University Press, 1993
[SB18] Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge Univ. Press, 2018
[MU05] Michael Mitzenmacher and Eli Upfal, Probability and Computing, Cambridge University Press, 2005
[WS11] David P. Williamson and David B. Shmoys, The Design of Approximation Algorithms, Cambridge University Press, 2011
[W14] David P. Williamson, Mathematical Programming, Lecture Notes, Cornell University, 2014.
[TB97] Lloyd N. Trefethen and David Bau III, Numerical Linear Algebra, SIAM, 1997
[V21] Nisheeth K. Vishnoi, Algorithms for Convex Optimization, Cambridge University Press, 2021
[V03] Vijay Vazirani, Approximation Algorithms, Springer, 2003
[Y18] Yurii Nesterov, Lectures on Convex Optimization, Springer, 2018
[Z95] Günter M. Ziegler, Lectures on Polytopes, Springer-Verlag, 1995
Lecture details
Topics covered in Lecture 1 (14 Feb.'22)
Course introduction
Introduction to Mathematical Programming
Definitions related to mathematical programming
Optimal value and optimal solution
Feasible solution and feasible region/set
Introduction to Unconstrained Optimization
Some applications of unconstrained optimization:
Facility Location problem and Geometric Median problem
Minimum Enclosing Ball problem
Least Squares problem
Introduction to Linear Programs (LP) and Integer Linear Programs (ILP)
Some definitions used in LPs were discussed:
Hyperplanes and halfspaces
LP relaxation to ILP
Vertex Cover (VC) problem
2-factor approximation to VC via LP relaxation to ILP
Reading materials:
Topics covered in Lecture 2 (17 Feb.'22)
Convex sets and convex functions
Sublevel sets of convex functions, and quasiconvex functions
Introduction to Convex Programs
Examples of convex programs
Linear Regression
Linear regression with penalties
Support Vector Machine (SVM)
LP formulation of Linear Classifier
Reading materials:
Topics covered in Lecture 3 (21 Feb.'22)
Robust Linear Classifier and its different programming formulations
Convex programs with a quadratic objective function
Non-convex programs with a linear objective function
Convex programs with a linear objective function
Lifting Technique and its application in
Support Vector Machine
Nonlinear Regression
General discussion on mathematical programs
Connection between feasibility and optimality questions
Feasibility via unconstraint optimization
Computational complexity
Reading materials:
Chapter 1 from [BV04]
Chapter 1 from [LY08]
Section 8.6 from [BV04]
Wikipedia entry on Linear Regression
Wikipedia entry on Nonlinear Regression
Wikipedia entry on SVM
Wikipedia entry on LASSO
Topics covered in Lecture 4 (23 Feb.'22)
LP in general and equational forms
Procedure/algorithm for converting LP into an equivalent LP in equational form
Basic feasible solutions (BFS)
Simple properties of BFS
Statement of the Fundamental Theorem of Linear Programming (Version 1: LP in equational form)
Reading materials:
Topics covered in Lecture 5 (24 Feb.'22)
Proof of Fundamental Theorem of Linear Programming
Vertices and Extreme Points of convex sets
Equivalence of BFS, extreme points, and vertices of the feasibility regions of LPs in equational form
Reading materials:
Topics covered in Lecture 6 (1 March'22)
Some useful definitions and simple observations from about
Convex combinations
Affine combinations, affine space and affine hull
Linear and affine dependence
Radon's Theorem
Reading materials:
Topics covered in Lecture 7 (5 March'22)
Helly's Theorem (finite version) for convex sets
Carathéodory's Theorem about convex hulls
Connection between LP and the above two theorems
Proving Carathéodory's Theorem using the Fundamental Theorem for LP
Proving Helly's Theorem using the special case for halfspaces and using the fact that polytopes are also polyhedrons
Radon's Theorem
Reading materials:
Topics covered in Lecture 8 (7 March'22)
Finite intersection property, and compact sets
Infinite version of Helly's theorem for convex sets
Definition of the Center Point of a point set
Rado's Center Point Theorem
Hyperplane Supporting and Separation Theorems
Reading materials:
Chapter 1 from [K06]
Chapter 1 from [M02]
Wikipedia entry on Finite intersection property
Wikipedia entry on Separation Theorems
Topics covered in Lecture 9 (9 March'22)
Fundamental Theorem of Linear Programs (versions 2 and 3)
Existence of optimal solutions of Linear Programs
Definitions and simple properties of Cones and Conic Hulls
Gaussian Elimination through the lens of strong alternatives
Strong Alternatives and Farkas Lemma
Proof of Farkas Lemma (version 1) from Hyperplane Separation Theorem (point and closed convex body case)
Reading materials:
Chapter 6 (except Sections 6.3, 6.6, and 6.7) from [MG07]
Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]
Chapter 4 from [BT97]
Wikipedia entry on Separation Theorems
Topics covered in Lecture 10 (10 March'22)
Linear Programming (LP) Duality
Weak Duality Theorem for LP
Equivalence between Strong Duality of LP and Farkas Lemma (version 1)
Complementary slackness condition for LP
Statement of Farkas Lemma (version 2)
Strong Duality Theorem for LP via Farkas Lemma (version 2)
Reading materials:
Topics covered in Lecture 11 (23 March'22)
Clark's Theorem via Farkas Lemma
Equivalence between BFS, extreme points, and vertices of polyhedra
Basic definitions and properties of polyhedra:
Face of a polyhedron
Extreme points of a face
Existence of interior points in a polyhedron with full affine dimension
Existence of a ray and unbounded polyhedron
Bounded polyhedra are convex hulls of their extreme points
Fundamental Theorem of LP with a bounded feasible region
Reading materials:
Additional reading materials:
Chapters 1, 2, and 3 from [Z95]
Topics covered in Lecture 12 (24 March'22)
Recall the proof of ``bounded polyhedra are convex hulls of their extreme points''
Fourier-Motzkin's Elimination Technique for solving LP
Linear transformations of Polyhedra and Fourier-Motzkin's technique
Basic definitions:
Polyhedral cones
Recession cones and its equivalent definition
Minkowski Sum
Minkowski sum of two polyhedrons is also a polyhedron
Representation of Polyhedra (Weyl-Minkowski's Theorem)
Case of pointed polyhedra
General polyhedra
Reading materials:
Additional reading materials:
Chapters 1, 2, and 3 from [Z95]
Topics covered in Lecture 13 (4 April'22)
Recall complementary slackness condition for LP
Degenerate BFS
Optimality conditions for LP in equational form
Introduction to Simplex Method via an example
Simplex method: handling unboundedness
Simplex method: handling degeneracy and cycling
Mathematical framework behind the simplex method
Reading materials:
Topics covered in Lecture 14 (6 April'22)
Mathematical framework behind the simplex method
Correctness of simplex method
Pivoting rules
Reading materials:
Topics covered in Lecture 15 (11 April'22)
Removing degeneracy by adding noise
Termination of simplex method using Bland's rule
Klee-Minty cube and convergence of simplex method
Finding starting BFS for simplex method using simplex method
Revisting ILP formulation of vertex cover (VC) and its LP relaxation (fractional VC)
Integrality gap of vertex cover
Extreme points of the feasibility region of fractional VC
Reading materials:
Topics covered in Lecture 16 (20 April'22)
Integral polyhedra
Totally Unimodular Matrix (TUM), and its properties
Relation between vertex cover (VC), fractional VC, fractional Maximum Mathing (MM), and MM
König's Theorem and its proof using properties of LP
Extreme point structure of fractional VC
Reading materials:
Topics covered in Lecture 17 (21April'22)
Graphs and Cut metrics
LP approach to solving Minimum s-t Cut via cut metrics and randomized rounding
Lagrangian framework
Dual function and Dual program
Weak Duality
Reading materials:
Additional reading materials:
Chapters 8 and 15 from [WS11]
Topics covered in Lecture 18 (27 April'22)
Lagrangian framework and Complementary slackness condition
KKT conditions
Statements of first-order and second-order characterizations of convex functions
KKT conditions, and strong duality of convex programs
Properties of convex programs:
Characterization of optimal solutions to convex programs
Convex programs with affine constraints
Optimal solutions with non-zero gradient
Reading materials:
Topics covered in Lecture 19 (28 April'22)
Slater's condition for strong duality
Reading materials:
Chapter 5 from [BV04]
Topics covered in Lectures 20 (2 May'22)
Slater's weaker condition for strong duality
Support vector machines (SVM) and Farkas Lemma
Robust SVM and strong duality of convex problem
Reading materials:
Topics covered in Lecture 21 (2 May'22)
Continuation of robust SVM and strong duality of convex problems
Black-box framework for convergence analysis of convex programs
Center of gravity method for solving convex programs
Reading materials:
Topics covered in Lecture 22 (5 May'22)
Continuation of the center of gravity method for solving convex programs
Properties of Ellipsoids
Finding feasible solutions to convex programs using Ellipsoid Method
Reading materials:
Sections 2.1 and 2.2 in Chapter 2 from [B15]
Topics covered in Lecture 23 (9 May'22)
Properties of perturbed LP
Recalled the equivalence between LP feasibility and LP optimization questions
Polynomial time algorithm for solving LP using Ellipsoid method
Reading materials:
Topics covered in Lecture 24 (11 May'22)
Ellipsoid algorithm for convex programs, and its convergence rate
Projected Gradient Descent Method for constrained programs
L-Lipschitz function and its properties
Convergence rate of projected gradient descent method for L-Lipschitz objective functions
Reading materials:
Topics covered in Lecture 25 (12 May'22)
Introduction to General Descent Methods
Introduction to Gradient Descent Method
β-smooth functions and its properties
Convergence rate of gradient descent method for β-smooth convex functions
Reading materials:
Additional reading materials:
Topics covered in Lecture 26 (19 May'22)
α-strongly convex functions and its properties
Convergence rate of gradient descent method for β-smooth α-strongly convex functions
Introduction to Conditional Gradient Descent Method for constrained programs
Convergence rate of conditional gradient descent method for constrained programs with a β-smooth convex objective function
Reading materials:
Additional reading materials: