Optimization Techniques (M.Stat, 2022)

Instructor Arijit Ghosh

Status Completed (Final lecture was on 19 May 2022)

Description To study the basics of Mathematical Programming

Prerequisites Mathematical maturity of a finishing undergraduate student in mathematical sciences

Class timings Mondays 10:30 - 12:25 hrs and Thursdays 10:30 - 12:25 hrs

Syllabus

  • Introduction to Mathematical Programming

  • Linear Programming (LP), Integer Linear Programming (ILP), and Semidefinite Programming (SDP)

  • Basic properties of LP

  • Basics of discrete convex geometry (Helly's Theorem and its relatives) and its applications

  • Vertices and extreme points of convex sets

  • Structural results on polytopes and polyhedra

  • Strong (and weak) duality of Linear Programming and its applications

  • Unimodular Matrix

  • Applications of LP

  • Support Vector Machine and Classification Problems

  • The Simplex Method

  • Perceptron Algorithm

  • Fourier–Motzkin Elimination Method

  • A brief introduction to Complexity Theory: NP-completeness, NP-hardness, Approximation algorithms, Hardness of approximation, etc.

  • Rounding LP relaxations of ILP to get integral solutions

  • Lagrangian dual function and Lagrangian dual problem

  • Basics of convex geometry and convex functions

  • Convex Programming

  • Descent Methods

  • Gradient Descent Method

  • Projected Gradient Descent Method

  • Conditional Gradient Descent Method

  • Center of Gravity Method

  • The Ellipsoid Method

  • Interior Point Method

  • Introduction to Conic programming (if time permits)

  • SDP and its applications (if time permits)

  • Submodular Optimization (if time permits)

  • Multiplicative Weight Update method (if time permits)

  • Online convex optimization (if time permits)

  • Mirror Descent (if time permits)

  • Stochastic Gradient Descent (if time permits)

References

  • [AS16] Noga Alon and Joel Spencer, The Probabilistic Method, 4th Edition, Wiley, 2016

  • [ALS15] Nina Amenta, Jesús A. De Loera, and Pablo Soberón, Helly's Theorem: New Variations and Applications, 2015

  • [BO97] Imre Barany and Shmuel Onn, Caratheodory's Theorem: Colourful and Applicable, Bolyai Society Mathematical Studies, 6:11 - 21, 1997

  • [B10] Alexander Barvinok, Combinatorics of Polytopes, Lecture Notes, Univ. Michigan, 2010

  • [BT97] Dimitris Bertsimas and John N. Tsitsiklis, Introduction to Linear Optimization, Athena Scientific, 1997

  • [BV04] Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, 2004

  • [B15] Sébastien Bubeck, Convex Optimization: Algorithms and Complexity, Foundations and Trends in Machine Learning, 2015

  • [F09] Friedrich Eisenbrand, Introduction to Discrete Optimization, Lecture Notes, EPFL, 2009

  • [FV14] Robert M. Freund and Jorge Vera, Fundamentals of Nonlinear Optimization: A Constructive Approach, book in progress

  • [H18] Moritz Hardt, Convex Optimization and Approximation, Lecture Notes, UC Berkeley, 2018

  • [K06] Vladlen Koltun, Advanced Geometric Algorithms, Lecture Notes, Stanford University, 2006

  • [LGM19] Jesus de Loera, Xavier Goaoc, Frederic Meunier and Nabil Mustafa, The Discrete Yet Ubiquitous Theorems of Caratheodory, Helly, Sperner, Tucker and Tverberg, Bulletin of the American Mathematical Society, 56: 415-511, 2019

  • [LY08] David G. Luenberger and Yinyu Ye, Linear and Nonlinear Programming, 3rd Edition, Springer, 2008

  • [Mpi-10] Nicole Megow, Kurt Mehlhorn and Julián Mestre, Optimization, Lecture Notes, Max Planck Institute for Informatics, 2010.

  • [MG07] Jiří Matoušek and Bernd Gärtner, Understanding and Using Linear Programming, Springer, 2007

  • [M02] Jiří Matoušek, Lectures on Discrete Geometry, Springer, 2002

  • [R97] R. Tyrrell Rockafellar, Convex Analysis, Princeton University Press, 1997

  • [R20] Thomas Rothvoss, Discrete Optimization, Lecture Notes, University of Washington, 2020

  • [S93] Rolf Schneider, Convex Bodies: The Brunn-Minkowski Theory, Cambridge University Press, 1993

  • [SB18] Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge Univ. Press, 2018

  • [MU05] Michael Mitzenmacher and Eli Upfal, Probability and Computing, Cambridge University Press, 2005

  • [WS11] David P. Williamson and David B. Shmoys, The Design of Approximation Algorithms, Cambridge University Press, 2011

  • [W14] David P. Williamson, Mathematical Programming, Lecture Notes, Cornell University, 2014.

  • [TB97] Lloyd N. Trefethen and David Bau III, Numerical Linear Algebra, SIAM, 1997

  • [V21] Nisheeth K. Vishnoi, Algorithms for Convex Optimization, Cambridge University Press, 2021

  • [V03] Vijay Vazirani, Approximation Algorithms, Springer, 2003

  • [Y18] Yurii Nesterov, Lectures on Convex Optimization, Springer, 2018

  • [Z95] Günter M. Ziegler, Lectures on Polytopes, Springer-Verlag, 1995

Lecture details

Topics covered in Lecture 1 (14 Feb.'22)

  • Course introduction

  • Introduction to Mathematical Programming

  • Definitions related to mathematical programming

        • Optimal value and optimal solution

        • Feasible solution and feasible region/set

  • Introduction to Unconstrained Optimization

  • Some applications of unconstrained optimization:

        • Facility Location problem and Geometric Median problem

        • Minimum Enclosing Ball problem

        • Least Squares problem

  • Introduction to Linear Programs (LP) and Integer Linear Programs (ILP)

  • Some definitions used in LPs were discussed:

        • Hyperplanes and halfspaces

  • LP relaxation to ILP

  • Vertex Cover (VC) problem

  • 2-factor approximation to VC via LP relaxation to ILP

  • Reading materials:

        • Chapter 1 from [BV04]

        • Chapter 1 from [LY08]

        • Chapter 1 from [MG07]

        • Section 2.1 from [V03]

        • Section 3.3 from [MG07]

Topics covered in Lecture 2 (17 Feb.'22)

  • Convex sets and convex functions

  • Sublevel sets of convex functions, and quasiconvex functions

  • Introduction to Convex Programs

  • Examples of convex programs

  • Linear Regression

  • Linear regression with penalties

  • Support Vector Machine (SVM)

  • LP formulation of Linear Classifier

  • Reading materials:

Topics covered in Lecture 3 (21 Feb.'22)

  • Robust Linear Classifier and its different programming formulations

        • Convex programs with a quadratic objective function

        • Non-convex programs with a linear objective function

        • Convex programs with a linear objective function

  • Lifting Technique and its application in

        • Support Vector Machine

        • Nonlinear Regression

  • General discussion on mathematical programs

        • Connection between feasibility and optimality questions

        • Feasibility via unconstraint optimization

        • Computational complexity

  • Reading materials:

Topics covered in Lecture 4 (23 Feb.'22)

  • LP in general and equational forms

  • Procedure/algorithm for converting LP into an equivalent LP in equational form

  • Basic feasible solutions (BFS)

  • Simple properties of BFS

  • Statement of the Fundamental Theorem of Linear Programming (Version 1: LP in equational form)

  • Reading materials:

        • Sections 4.1 and 4.2 from [MG07]

        • Chapter 2 (except Section 2.5) from [LY08]

Topics covered in Lecture 5 (24 Feb.'22)

  • Proof of Fundamental Theorem of Linear Programming

  • Vertices and Extreme Points of convex sets

  • Equivalence of BFS, extreme points, and vertices of the feasibility regions of LPs in equational form

  • Reading materials:

        • Chapter 4 from [MG07]

        • Chapter 2 from [LY08]

Topics covered in Lecture 6 (1 March'22)

  • Some useful definitions and simple observations from about

        • Convex combinations

        • Affine combinations, affine space and affine hull

        • Linear and affine dependence

  • Radon's Theorem

  • Reading materials:

        • Chapter 1 from [K06]

        • Chapter 1 from [M02]

Topics covered in Lecture 7 (5 March'22)

  • Helly's Theorem (finite version) for convex sets

  • Carathéodory's Theorem about convex hulls

  • Connection between LP and the above two theorems

        • Proving Carathéodory's Theorem using the Fundamental Theorem for LP

        • Proving Helly's Theorem using the special case for halfspaces and using the fact that polytopes are also polyhedrons

  • Radon's Theorem

  • Reading materials:

        • Chapter 1 from [K06]

        • Chapter 1 from [M02]

Topics covered in Lecture 8 (7 March'22)

Topics covered in Lecture 9 (9 March'22)

  • Fundamental Theorem of Linear Programs (versions 2 and 3)

  • Existence of optimal solutions of Linear Programs

  • Definitions and simple properties of Cones and Conic Hulls

  • Gaussian Elimination through the lens of strong alternatives

  • Strong Alternatives and Farkas Lemma

  • Proof of Farkas Lemma (version 1) from Hyperplane Separation Theorem (point and closed convex body case)

  • Reading materials:

        • Chapter 6 (except Sections 6.3, 6.6, and 6.7) from [MG07]

        • Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]

        • Chapter 4 from [BT97]

        • Wikipedia entry on Separation Theorems

Topics covered in Lecture 10 (10 March'22)

  • Linear Programming (LP) Duality

  • Weak Duality Theorem for LP

  • Equivalence between Strong Duality of LP and Farkas Lemma (version 1)

  • Complementary slackness condition for LP

  • Statement of Farkas Lemma (version 2)

  • Strong Duality Theorem for LP via Farkas Lemma (version 2)

  • Reading materials:

        • Chapter 6 (except Sections 6.3, 6.6, and 6.7) from [MG07]

        • Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]

        • Chapter 4 from [BT97]

Topics covered in Lecture 11 (23 March'22)

  • Clark's Theorem via Farkas Lemma

  • Equivalence between BFS, extreme points, and vertices of polyhedra

  • Basic definitions and properties of polyhedra:

        • Face of a polyhedron

        • Extreme points of a face

        • Existence of interior points in a polyhedron with full affine dimension

        • Existence of a ray and unbounded polyhedron

  • Bounded polyhedra are convex hulls of their extreme points

  • Fundamental Theorem of LP with a bounded feasible region

  • Reading materials:

        • Chapters 2 and 4 from [BT97]

        • Chapter 9 from [B10]

  • Additional reading materials:

        • Chapters 1, 2, and 3 from [Z95]

Topics covered in Lecture 12 (24 March'22)

  • Recall the proof of ``bounded polyhedra are convex hulls of their extreme points''

  • Fourier-Motzkin's Elimination Technique for solving LP

  • Linear transformations of Polyhedra and Fourier-Motzkin's technique

  • Basic definitions:

        • Polyhedral cones

        • Recession cones and its equivalent definition

        • Minkowski Sum

        • Minkowski sum of two polyhedrons is also a polyhedron

  • Representation of Polyhedra (Weyl-Minkowski's Theorem)

        • Case of pointed polyhedra

        • General polyhedra

  • Reading materials:

        • Chapters 2 and 4 from [BT97]

        • Chapter 9 from [B10]

  • Additional reading materials:

        • Chapters 1, 2, and 3 from [Z95]

Topics covered in Lecture 13 (4 April'22)

  • Recall complementary slackness condition for LP

  • Degenerate BFS

  • Optimality conditions for LP in equational form

  • Introduction to Simplex Method via an example

  • Simplex method: handling unboundedness

  • Simplex method: handling degeneracy and cycling

  • Mathematical framework behind the simplex method

  • Reading materials:

        • Chapter 3 from [BT97]

        • Chapter 5 from [MG07]

        • Chapter 3 from [LY08]

Topics covered in Lecture 14 (6 April'22)

  • Mathematical framework behind the simplex method

  • Correctness of simplex method

  • Pivoting rules

  • Reading materials:

        • Chapter 3 from [BT97]

        • Chapter 5 from [MG07]

        • Chapter 3 from [LY08]

Topics covered in Lecture 15 (11 April'22)

  • Removing degeneracy by adding noise

  • Termination of simplex method using Bland's rule

  • Klee-Minty cube and convergence of simplex method

  • Finding starting BFS for simplex method using simplex method

  • Revisting ILP formulation of vertex cover (VC) and its LP relaxation (fractional VC)

  • Integrality gap of vertex cover

  • Extreme points of the feasibility region of fractional VC

  • Reading materials:

        • Chapter 3 from [BT97]

        • Chapter 5 from [MG07]

        • Chapter 3 from [LY08]

        • Section 3.3 from [MG07]

        • Chapter 1 from [V03]

Topics covered in Lecture 16 (20 April'22)

  • Integral polyhedra

  • Totally Unimodular Matrix (TUM), and its properties

  • Relation between vertex cover (VC), fractional VC, fractional Maximum Mathing (MM), and MM

  • König's Theorem and its proof using properties of LP

  • Extreme point structure of fractional VC

  • Reading materials:

        • Section 8.2 from [MG07]

        • Wikipedia article on König's Theorem

Topics covered in Lecture 17 (21April'22)

  • Graphs and Cut metrics

  • LP approach to solving Minimum s-t Cut via cut metrics and randomized rounding

  • Lagrangian framework

  • Dual function and Dual program

  • Weak Duality

  • Reading materials:

        • Chapter 5 from [BV04]

        • Section 8.1 from [WS11]

  • Additional reading materials:

        • Chapters 8 and 15 from [WS11]

Topics covered in Lecture 18 (27 April'22)

  • Lagrangian framework and Complementary slackness condition

  • KKT conditions

  • Statements of first-order and second-order characterizations of convex functions

  • KKT conditions, and strong duality of convex programs

  • Properties of convex programs:

        • Characterization of optimal solutions to convex programs

        • Convex programs with affine constraints

        • Optimal solutions with non-zero gradient

  • Reading materials:

        • Chapters 4 and 5 from [BV04]

        • Sections 3.1 and 3.2 in Chapter 3 from [BV04]

Topics covered in Lecture 19 (28 April'22)

  • Slater's condition for strong duality

  • Reading materials:

        • Chapter 5 from [BV04]

Topics covered in Lectures 20 (2 May'22)

  • Slater's weaker condition for strong duality

  • Support vector machines (SVM) and Farkas Lemma

  • Robust SVM and strong duality of convex problem

  • Reading materials:

        • Chapter 5 from [BV04]

        • Section 8.6 in Chapter 8 from [BV04]

Topics covered in Lecture 21 (2 May'22)

  • Continuation of robust SVM and strong duality of convex problems

  • Black-box framework for convergence analysis of convex programs

  • Center of gravity method for solving convex programs

  • Reading materials:

        • Chapter 5 from [BV04]

        • Section 8.6 in Chapter 8 from [BV04]

        • Section 2.1 in Chapter 2 from [B15]

Topics covered in Lecture 22 (5 May'22)

  • Continuation of the center of gravity method for solving convex programs

  • Properties of Ellipsoids

  • Finding feasible solutions to convex programs using Ellipsoid Method

  • Reading materials:

        • Sections 2.1 and 2.2 in Chapter 2 from [B15]

Topics covered in Lecture 23 (9 May'22)

  • Properties of perturbed LP

  • Recalled the equivalence between LP feasibility and LP optimization questions

  • Polynomial time algorithm for solving LP using Ellipsoid method

  • Reading materials:

        • Lecture 14 from [Mpi-10]

        • Lectures 19 and 20 from [W14]

        • Chapter 8 from [BT97]

Topics covered in Lecture 24 (11 May'22)

  • Ellipsoid algorithm for convex programs, and its convergence rate

  • Projected Gradient Descent Method for constrained programs

  • L-Lipschitz function and its properties

  • Convergence rate of projected gradient descent method for L-Lipschitz objective functions

  • Reading materials:

        • Section 2.2 from Chapter 2 in [B15]

        • Section 3.1 from Chapter 3 in [B15]

Topics covered in Lecture 25 (12 May'22)

  • Introduction to General Descent Methods

  • Introduction to Gradient Descent Method

  • β-smooth functions and its properties

  • Convergence rate of gradient descent method for β-smooth convex functions

  • Reading materials:

        • Section 3.2 from Chapter 3 in [B15]

        • Sections 9.1, 9.2, and 9.3 from Chapter 9 in [BV04]

  • Additional reading materials:

        • Chapter 3 from [B15]

        • Chapter 9 in [BV04]

Topics covered in Lecture 26 (19 May'22)

  • α-strongly convex functions and its properties

  • Convergence rate of gradient descent method for β-smooth α-strongly convex functions

  • Introduction to Conditional Gradient Descent Method for constrained programs

  • Convergence rate of conditional gradient descent method for constrained programs with a β-smooth convex objective function

  • Reading materials:

        • Sections 3.3 and 3.4 from [B15]

        • Section 9.3 from [BV04]

  • Additional reading materials:

        • Chapter 3 from [B15]

        • Chapter 9 in [BV04]