Optimization Techniques (M.Tech CS, 2021-22)

Instructor: Arijit Ghosh

Status: Completed (final lecture was on 23 Jan 2022)

Description: To study the basics of Mathematical Programming

Prerequisites: Mathematical maturity of a finishing undergraduate student in mathematical sciences

Class timings: Mondays 9:25 - 11:10 hrs and Thursdays 11:15 - 13:00 hrs

Syllabus

  • Introduction to Mathematical Programming

  • Linear Programming (LP), Integer Linear Programming (ILP), and Semidefinite Programming (SDP)

  • Basic properties of LP

  • Basics of discrete convex geometry (Helly's Theorem and its relatives) and its applications

  • Vertices and extreme points of convex sets

  • Structural results on polytopes and polyhedra

  • Strong (and weak) duality of Linear Programming and its applications

  • Totally Unimodular Matrices

  • Applications of LP

  • Support Vector Machine and Classification Problems

  • The Simplex Method

  • Perceptron Algorithm (not covered due to lack of time)

  • Fourier–Motzkin Elimination Method (not covered due to lack of time)

  • A brief introduction to Complexity Theory: NP-completeness, NP-hardness, Approximation algorithms, Hardness of approximation, etc.

  • Rounding LP relaxations of ILP to get integral solutions

  • Lagrangian dual function and Lagrangian dual problem

  • Basics of convex geometry and convex functions

  • Convex Programming

  • Descent Methods

  • Gradient Descent Method

  • Projected Gradient Descent Method

  • Conditional Gradient Descent Method

  • Center of Gravity Method (not covered due to lack of time)

  • The Ellipsoid Method

  • Interior Point Method (not covered due to lack of time)

  • Introduction to Conic programming (not covered due to lack of time)

  • SDP and its applications (not covered due to lack of time)

  • Submodular Optimization (not covered due to lack of time)

  • Multiplicative Weight Update method (not covered due to lack of time)

  • Online convex optimization (not covered due to lack of time)

  • Mirror Descent (not covered due to lack of time)

  • Stochastic Gradient Descent (not covered due to lack of time)

References

  • [AS16] Noga Alon and Joel Spencer, The Probabilistic Method, 4th Edition, Wiley, 2016

  • [ALS15] Nina Amenta, Jesús A. De Loera, and Pablo Soberón, Helly's Theorem: New Variations and Applications, 2015

  • [BO97] Imre Bárány and Shmuel Onn, Carathéodory's Theorem: Colourful and Applicable, Bolyai Society Mathematical Studies, 6:11-21, 1997

  • [B10] Alexander Barvinok, Combinatorics of Polytopes, Lecture Notes, University of Michigan, 2010

  • [BT97] Dimitris Bertsimas and John N. Tsitsiklis, Introduction to Linear Optimization, Athena Scientific, 1997

  • [BV04] Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, 2004

  • [B15] Sébastien Bubeck, Convex Optimization: Algorithms and Complexity, Foundations and Trends in Machine Learning, 2015

  • [FV14] Robert M. Freund and Jorge Vera, Fundamentals of Nonlinear Optimization: A Constructive Approach, book in progress

  • [H18] Moritz Hardt, Convex Optimization and Approximation, Lecture Notes, UC Berkeley, 2018

  • [K06] Vladlen Koltun, Advanced Geometric Algorithms, Lecture Notes, Stanford University, 2006

  • [LGM19] Jesús A. De Loera, Xavier Goaoc, Frédéric Meunier, and Nabil H. Mustafa, The Discrete Yet Ubiquitous Theorems of Carathéodory, Helly, Sperner, Tucker, and Tverberg, Bulletin of the American Mathematical Society, 56:415-511, 2019

  • [LY08] David G. Luenberger and Yinyu Ye, Linear and Nonlinear Programming, 3rd Edition, Springer, 2008

  • [MG07] Jiří Matoušek and Bernd Gärtner, Understanding and Using Linear Programming, Springer, 2007

  • [M02] Jiří Matoušek, Lectures on Discrete Geometry, Springer, 2002

  • [MU05] Michael Mitzenmacher and Eli Upfal, Probability and Computing, Cambridge University Press, 2005

  • [Y18] Yurii Nesterov, Lectures on Convex Optimization, Springer, 2018

  • [R97] R. Tyrrell Rockafellar, Convex Analysis, Princeton University Press, 1997

  • [S93] Rolf Schneider, Convex Bodies: The Brunn-Minkowski Theory, Cambridge University Press, 1993

  • [SB18] Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2018

  • [TB97] Lloyd N. Trefethen and David Bau III, Numerical Linear Algebra, SIAM, 1997

  • [V03] Vijay Vazirani, Approximation Algorithms, Springer, 2003

  • [V21] Nisheeth K. Vishnoi, Algorithms for Convex Optimization, Cambridge University Press, 2021

  • [WS11] David P. Williamson and David B. Shmoys, The Design of Approximation Algorithms, Cambridge University Press, 2011

  • [Z95] Günter M. Ziegler, Lectures on Polytopes, Springer-Verlag, 1995

Lecture details

Topics covered in Lecture 1 (4 Oct.'21)

  • Course introduction

  • Introduction to Mathematical Programming

  • Introduction to Unconstrained Optimization

  • Some applications of Unconstrained Optimization:

      • Facility Location problem and Geometric Median problem

      • Minimum Enclosing Ball problem

      • Least Squares problem (see the sketch below)

  • Introduction to Linear Programs (LP) and Integer Linear Programs (ILP)

  • Some definitions used in LPs were discussed:

      • Hyperplanes and halfspaces

      • Convex sets

  • Reading materials:

      • Chapter 1 from [BV04]

      • Chapter 1 from [LY08]

      • Chapter 1 from [MG07]
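
The Least Squares problem above is the canonical unconstrained convex program and one of the few with a closed-form solution (the normal equations). A minimal numerical sketch, assuming numpy (the course itself prescribes no software):

    import numpy as np

    # Synthetic data: fit y ~ X w by minimizing ||X w - y||^2,
    # an unconstrained convex (in fact quadratic) program.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    w_true = np.array([1.0, -2.0, 0.5])
    y = X @ w_true + 0.1 * rng.normal(size=100)

    # The minimizer solves the normal equations X^T X w = X^T y;
    # np.linalg.lstsq computes it via a numerically stable factorization.
    w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(w_hat)  # close to w_true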

Topics covered in Lecture 2 (7 Oct.'21)

  • Recall some definitions related to Mathematical Programming

  • Feasible set, feasible solution, and feasibility question

  • Equivalence between solving LP and LP feasibility question

  • Reducing feasibility question to Unconstrained Optimization question

  • Vertex Cover (VC) problem

  • Maximal matching, and a greedy algorithm to construct a maximal matching

  • 2-factor approximation to the VC problem using maximal matching (see the sketch below)

  • Reading materials:

      • Chapter 1 from [BV04]

      • Chapter 1 from [LY08]

      • Chapter 1 from [V03]
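
A minimal sketch of the 2-factor approximation from this lecture, with illustrative names: greedily build a maximal matching and output all matched endpoints. Any vertex cover must contain at least one endpoint of each matched edge, so the output has size at most twice the optimum.

    def vertex_cover_2_approx(edges):
        """Greedy maximal matching: keep any edge whose endpoints are both
        still unmatched; the matched endpoints form a vertex cover of size
        at most 2 * OPT."""
        cover = set()
        for u, v in edges:
            if u not in cover and v not in cover:
                cover.update((u, v))  # edge (u, v) joins the maximal matching
        return cover

    print(vertex_cover_2_approx([(1, 2), (2, 3), (3, 4), (4, 1)]))  # {1, 2, 3, 4}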

Topics covered in Lecture 3 (18 Oct.'21)

  • ILP formulation for Vertex Cover (VC) problem

  • LP relaxation to the above ILP

  • An alternative 2-factor approximation algorithm for VC via LP relaxation and rounding (see the sketch below)

  • Introduction to Convex Programming

  • Reading materials:

      • Chapter 1 from [LY08]

      • Chapter 1 from [MG07]

      • Section 2.1 from [V03]

      • Section 3.3 from [MG07]
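
A sketch of the relax-and-round algorithm for Vertex Cover, assuming scipy's LP solver (the lecture itself is solver-agnostic):

    import numpy as np
    from scipy.optimize import linprog

    def vc_lp_rounding(n, edges):
        """Solve the LP relaxation  min sum_v x_v  s.t.  x_u + x_v >= 1 per edge,
        0 <= x <= 1,  then round up every x_v >= 1/2. Every edge keeps a covered
        endpoint and the cost at most doubles, giving a 2-approximation."""
        c = np.ones(n)
        A_ub = np.zeros((len(edges), n))
        for i, (u, v) in enumerate(edges):
            A_ub[i, u] = A_ub[i, v] = -1.0  # x_u + x_v >= 1  <=>  -(x_u + x_v) <= -1
        res = linprog(c, A_ub=A_ub, b_ub=-np.ones(len(edges)), bounds=[(0, 1)] * n)
        return {v for v in range(n) if res.x[v] >= 0.5}

    print(vc_lp_rounding(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))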

Topics covered in Lecture 4 (25 Oct.'21)

  • Linear Regression

  • Linear Regression with penalties

  • Support Vector Machine (SVM)

  • LP formulation of a Linear Classifier (see the sketch below)

  • Robust Linear Classifier and its different programming formulations

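A sketch of the LP formulation of a linear classifier from this lecture: separating two point sets is a pure feasibility LP. The fixed margin of 1 is only a normalization, all names are illustrative, and scipy is assumed.

    import numpy as np
    from scipy.optimize import linprog

    def linear_classifier_lp(X_pos, X_neg):
        """Find (w, b) with  w.x + b >= 1  on positives and  w.x + b <= -1  on
        negatives, as a feasibility LP with zero objective. Returns None when
        the two classes are not linearly separable."""
        d = X_pos.shape[1]
        # Variables: (w_1, ..., w_d, b), all free.
        A_ub = np.vstack([
            -np.hstack([X_pos, np.ones((len(X_pos), 1))]),  # -(w.x + b) <= -1
            np.hstack([X_neg, np.ones((len(X_neg), 1))]),   #   w.x + b  <= -1
        ])
        b_ub = -np.ones(len(X_pos) + len(X_neg))
        res = linprog(np.zeros(d + 1), A_ub=A_ub, b_ub=b_ub,
                      bounds=[(None, None)] * (d + 1))
        return (res.x[:d], res.x[d]) if res.success else None

    print(linear_classifier_lp(np.array([[2.0, 2.0]]), np.array([[0.0, 0.0]])))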

Topics covered in Lecture 5 (28 Oct.'21)

  • Revisiting Robust Linear Classifier

  • Equational form of LP

  • Procedure for converting an LP into an equivalent LP in equational form (see the sketch below)

  • Basic feasible solutions (BFS)

  • Reading materials:

      • Sections 4.1 and 4.2 from [MG07]

      • Chapter 2 (except Section 2.5) from [LY08]
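
A sketch of the conversion to equational form for the common case  min c.x  s.t.  A x <= b, x >= 0: append one slack variable per inequality. (Free variables, which must additionally be split as x = x+ - x-, are omitted for brevity.)

    import numpy as np

    def to_equational_form(c, A_ub, b_ub):
        """Rewrite  min c.x  s.t.  A_ub x <= b_ub, x >= 0  as the equational form
        min c'.y  s.t.  A y = b_ub, y >= 0,  via slacks: A x + s = b_ub, s >= 0."""
        m, n = A_ub.shape
        A_eq = np.hstack([A_ub, np.eye(m)])      # [A | I] y = b
        c_eq = np.concatenate([c, np.zeros(m)])  # slack variables cost nothing
        return c_eq, A_eq, b_ub

    c_eq, A_eq, b = to_equational_form(np.array([1.0, 2.0]),
                                       np.array([[1.0, 1.0]]), np.array([3.0]))
    print(A_eq)  # [[1. 1. 1.]]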

Topics covered in Lecture 6 (30 Oct.'21)

  • Revisiting the procedure for converting an LP into an equivalent LP in equational form

  • Simple properties of BFS

  • Fundamental Theorem of Linear Programming (version 1)

  • Reading materials:

      • Sections 4.1 and 4.2 from [MG07]

      • Chapter 2 (except Section 2.5) from [LY08]

Topics covered in Lecture 7 (1 Nov.'21)

  • Finished the proof of the Fundamental Theorem of Linear Programming (version 1)

  • Fundamental Theorem of Linear Programming (version 2)

  • Recall some useful definitions from discrete and convex geometry:

      • Convex sets and convex functions

      • Convex hulls

      • Polyhedra and polytopes

      • Vertices and extreme points of convex sets

  • Reading materials:

      • Chapter 4 from [MG07]

      • Chapter 2 from [LY08]

Topics covered in Lecture 8 (11 Nov.'21)

  • Equivalence of BFS, extreme points, and vertices of LPs in equational form

  • Some useful definitions and simple observations about

      • Convex combinations

      • Affine combinations and affine hull

      • Affine dependence

      • Affine space

  • Reading materials:

      • Chapter 4 from [MG07]

      • Chapter 2 from [LY08]

      • Chapter 1 from [K06]

      • Chapter 1 from [M02]

Topics covered in Lecture 9 (13 Nov.'21)

  • Equivalence between two different definitions of convex hulls (and affine hulls)

  • Radon's Theorem (see the sketch below)

  • Helly's Theorem (finite version)

  • Statement of Colorful Helly Theorem (and its stronger version)

  • Statement of the infinite version of Helly's Theorem

  • Reading materials:

      • Chapter 1 from [K06]

      • Chapter 1 from [M02]

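A numerical sketch of Radon's Theorem, assuming numpy: an affine dependence among d + 2 points in R^d (a null vector of the points stacked over a row of ones) splits them into two parts whose convex hulls intersect.

    import numpy as np

    def radon_partition(points):
        """Split d + 2 (or more) points in R^d into index sets I and J such
        that conv{p_i : i in I} and conv{p_j : j in J} intersect."""
        P = np.asarray(points, dtype=float)
        n, d = P.shape
        assert n >= d + 2, "Radon's Theorem needs at least d + 2 points"
        # Affine dependence: sum_i a_i p_i = 0, sum_i a_i = 0, a != 0;
        # any null vector of the (d+1) x n matrix below qualifies.
        M = np.vstack([P.T, np.ones(n)])
        a = np.linalg.svd(M)[2][-1]  # right-singular vector for the smallest singular value
        I = [i for i in range(n) if a[i] > 0]
        J = [i for i in range(n) if a[i] <= 0]
        return I, J

    print(radon_partition([[0, 0], [1, 0], [0, 1], [1, 1]]))  # the two diagonals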

Topics covered in Lecture 10 (15 Nov.'21)

  • Discussion on the infinite version of Helly's Theorem

  • Definition of Centerpoint, and Rado's Theorem (showing the existence of a centerpoint)

  • Carathéodory's Theorem about convex hulls (see the sketch below)

  • Reading materials:

      • Chapter 1 from [K06]

      • Chapter 1 from [M02]

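Carathéodory's Theorem in computational form, assuming scipy: if x lies in conv(P) in R^d, a basic feasible solution of the feasibility LP below uses at most d + 1 points, because the constraint matrix has d + 1 rows. A simplex-type method is requested so the solver returns a BFS.

    import numpy as np
    from scipy.optimize import linprog

    def caratheodory_weights(points, x):
        """Write x as a convex combination of at most d + 1 of the given points:
        a BFS of  {w >= 0 : sum_i w_i p_i = x, sum_i w_i = 1}  has at most
        d + 1 nonzero coordinates."""
        P = np.asarray(points, dtype=float).T  # d x n
        d, n = P.shape
        A_eq = np.vstack([P, np.ones(n)])
        b_eq = np.concatenate([x, [1.0]])
        res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                      bounds=[(0, None)] * n, method="highs-ds")
        return res.x  # at most d + 1 entries are nonzero

    w = caratheodory_weights([[0, 0], [2, 0], [0, 2], [2, 2]], np.array([1.0, 1.0]))
    print(np.round(w, 3))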

Topics covered in Lecture 11 (18 Nov.'21)

  • Hyperplane Separation Theorem (point and convex body case)

  • Reading materials:

      • Chapter 2 from [BV04]

      • Chapter 2 from [BT97]

Topics covered in Lecture 12 (20 Nov.'21)

  • Introduction to Strong Alternatives

  • Gaussian Elimination through the lens of strong alternatives

  • Definitions and simple properties of Cones and Conic Hulls

  • Farkas Lemma (Variant 1) from the Hyperplane Separation Theorem (point and convex body case); see the sketch below

  • Reading materials:

      • Chapter 6 (except Sections 6.3, 6.6, and 6.7) from [MG07]

      • Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]

      • Chapter 4 from [BT97]
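
Farkas Lemma (Variant 1) is a strong alternative and can be witnessed numerically: exactly one of the two systems below is solvable. A sketch assuming scipy; the box bound on y only keeps the certificate search bounded, since certificates scale freely.

    import numpy as np
    from scipy.optimize import linprog

    def farkas(A, b):
        """Either (i) some x >= 0 satisfies A x = b, or (ii) some y satisfies
        A^T y >= 0 and b.y < 0 -- never both. Return whichever witness exists."""
        m, n = A.shape
        feas = linprog(np.zeros(n), A_eq=A, b_eq=b, bounds=[(0, None)] * n)
        if feas.status == 0:  # system (i) is feasible
            return ("solution x", feas.x)
        # Otherwise search for a certificate: minimize b.y subject to A^T y >= 0.
        alt = linprog(b, A_ub=-A.T, b_ub=np.zeros(n), bounds=[(-1, 1)] * m)
        return ("certificate y", alt.x)  # b.y < 0 at this y

    print(farkas(np.eye(2), np.array([-1.0, 2.0])))  # (i) infeasible, yields y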

Topics covered in Lecture 13 (22 Nov.'21)

  • Recall some properties of conic hulls and their proofs

  • Linear Programming (LP) Duality

  • Weak Duality Theorem for LP

  • Equivalence between Strong Duality of LP and Farkas Lemma (Variant 1)

  • Reading materials:

      • Chapter 6 (except Sections 6.3, 6.6, and 6.7) from [MG07]

      • Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]

      • Chapter 4 from [BT97]

Topics covered in Lecture 14 (25 Nov.'21)

  • Statement of Farkas Lemma (Variant 2)

  • Strong Duality Theorem for LP (see the numerical check below)

  • Reading materials:

      • Chapter 6 (except Sections 6.3, 6.6, and 6.7) from [MG07]

      • Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]

      • Chapter 4 from [BT97]
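
Strong duality can be illustrated numerically on a small primal-dual pair (a sketch assuming scipy; linprog minimizes, so the dual objective is negated):

    import numpy as np
    from scipy.optimize import linprog

    # Primal:  min c.x  s.t.  A x >= b, x >= 0
    # Dual:    max b.y  s.t.  A^T y <= c, y >= 0
    c = np.array([2.0, 3.0])
    A = np.array([[1.0, 1.0], [1.0, 2.0]])
    b = np.array([3.0, 4.0])

    primal = linprog(c, A_ub=-A, b_ub=-b)  # signs flipped to encode A x >= b
    dual = linprog(-b, A_ub=A.T, b_ub=c)
    print(primal.fun, -dual.fun)           # both 7.0: no duality gap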

Topics covered in Lecture 15 (11 Dec.'21)

  • Discussed mid-sem question paper

Topics covered in Lecture 16 (11 Dec.'21)

  • Complementary slackness and optimality

  • Clark's Theorem via Farkas Lemma

  • Reading materials:

      • Chapter 6 (except Section 6.6) from [MG07]

      • Chapter 4 (except Sections 4.5, 4.6, and 4.7) from [LY08]

      • Chapter 4 from [BT97]

Topics covered in Lectures 17 and 18 (13 Dec.'21)

  • Introduction to Simplex Method via an example

  • Simplex method: handling unboundedness

  • Simplex method: handling "degeneracy" and "cycling"

  • Optimality conditions for LP in equational form

  • Reading materials:

      • Chapter 5 from [MG07]

      • Chapter 3 from [LY08]

      • Chapter 3 from [BT97]

Topics covered in Lecture 19 (16 Dec.'21)

  • The general mathematical framework behind the Simplex Method

  • Reading materials:

      • Chapter 5 from [MG07]

      • Chapter 3 from [LY08]

      • Chapter 3 from [BT97]

Topics covered in Lecture 20 (20 Dec.'21)

  • Recap of the Simplex Method

  • Proof of correctness of the Simplex Method

  • Reading materials:

      • Chapter 5 from [MG07]

      • Chapter 3 from [LY08]

      • Chapter 3 from [BT97]

Topics covered in Lecture 21 (27 Dec.'21)

  • Finding a BFS using the Simplex Method

  • Time complexity of performing a single pivot step

  • Different pivoting rules

  • Reading materials:

      • Chapter 5 from [MG07]

      • Chapter 3 from [LY08]

      • Chapter 3 from [BT97]

Topics covered in Lecture 22 (6 Jan.'22)

  • Bland's rule for selecting a pivot in the Simplex Method (see the sketch below)

  • Necessary and sufficient criteria for the existence of a vertex in a polyhedron

  • Fundamental Theorem of Linear Programming (version 3)

  • Relations between different variants of the Fundamental Theorem of Linear Programming

  • Relation between Matching and Vertex Cover in general graphs

  • König's Theorem for bipartite graphs (only statement for now)

  • Reading materials:

      • Chapter 5 from [MG07]

      • Chapter 3 from [LY08]

      • Chapter 3 from [BT97]

      • Chapter 4 from [BT97]
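
A compact tableau implementation of the Simplex Method with Bland's rule, tying together Lectures 17-22. It assumes the easy case b >= 0, where the slack variables provide an initial BFS; a teaching sketch, not a robust solver.

    import numpy as np

    def simplex(c, A, b):
        """Tableau simplex for  max c.x  s.t.  A x <= b, x >= 0  with b >= 0.
        Bland's smallest-index rule picks the entering and leaving variables,
        which provably prevents cycling."""
        m, n = A.shape
        T = np.hstack([A, np.eye(m), b.reshape(-1, 1)]).astype(float)
        z = np.concatenate([-c, np.zeros(m + 1)])  # reduced costs + objective value
        basis = list(range(n, n + m))              # start from the slack basis
        while True:
            entering = [j for j in range(n + m) if z[j] < -1e-9]
            if not entering:
                break                              # no negative reduced cost: optimal
            j = min(entering)                      # Bland's rule: smallest index
            ratios = [(T[i, -1] / T[i, j], basis[i], i)
                      for i in range(m) if T[i, j] > 1e-9]
            if not ratios:
                raise ValueError("LP is unbounded")
            _, _, r = min(ratios)                  # min ratio, ties by smallest variable
            T[r] /= T[r, j]                        # pivot on (r, j)
            for i in range(m):
                if i != r:
                    T[i] -= T[i, j] * T[r]
            z -= z[j] * T[r]
            basis[r] = j
        x = np.zeros(n + m)
        for i, bi in enumerate(basis):
            x[bi] = T[i, -1]
        return x[:n], z[-1]

    # max x1 + x2  s.t.  x1 + 2 x2 <= 4,  3 x1 + x2 <= 6: optimum 14/5 at (8/5, 6/5)
    print(simplex(np.array([1.0, 1.0]),
                  np.array([[1.0, 2.0], [3.0, 1.0]]),
                  np.array([4.0, 6.0])))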

Topics covered in Lecture 23 (8 Jan.'22)

  • Integral Polyhedra

  • Totally Unimodular Matrices (TUM) and their connection to Integral Polyhedra (see the check below)

  • Fractional Vertex Cover and Maximum Matching

  • Reading materials:

      • Section 8.2 from [MG07]

      • Wikipedia article on König's Theorem
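
Total unimodularity can be verified by brute force on small examples, since a matrix is TUM exactly when every square submatrix has determinant -1, 0, or 1. A sketch (exponential time, so only for tiny teaching instances):

    import numpy as np
    from itertools import combinations

    def is_totally_unimodular(A):
        """Check every square submatrix of A for a determinant in {-1, 0, 1}."""
        m, n = A.shape
        for k in range(1, min(m, n) + 1):
            for rows in combinations(range(m), k):
                for cols in combinations(range(n), k):
                    if round(np.linalg.det(A[np.ix_(rows, cols)])) not in (-1, 0, 1):
                        return False
        return True

    # Edge-vertex incidence matrix of the 4-cycle, a bipartite graph: TUM,
    # which is why its matching and vertex cover polyhedra are integral.
    A = np.array([[1, 1, 0, 0],
                  [0, 1, 1, 0],
                  [0, 0, 1, 1],
                  [1, 0, 0, 1]])
    print(is_totally_unimodular(A))  # True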

Topics covered in Lecture 24 (8 Jan.'22)

  • Proof of König's Theorem via Fundamental Theorem of Linear Programming and TUM

  • Lagrangian Multipliers

  • Dual function and its properties (see the worked example below)

  • Weak duality

  • Reading materials:

      • Section 8.2 from [MG07]

      • Wikipedia article on König's Theorem

      • Chapter 5 from [BV04]

      • Chapter 5 from [V21]
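
A worked example of the dual function for the equational-form LP studied earlier (standard material, cf. Chapter 5 of [BV04]): for  min c^T x  s.t.  A x = b, x >= 0,  the Lagrangian and dual function are

    L(x, y) = c^{\top} x + y^{\top} (b - A x)

    g(y) = \inf_{x \ge 0} L(x, y)
         = b^{\top} y + \inf_{x \ge 0} \, (c - A^{\top} y)^{\top} x
         = \begin{cases} b^{\top} y & \text{if } A^{\top} y \le c, \\ -\infty & \text{otherwise.} \end{cases}

Maximizing g recovers the familiar LP dual  max { b^T y : A^T y <= c },  and weak duality is immediate: for any primal-feasible x the term y^T (b - A x) vanishes, so g(y) <= L(x, y) = c^T x.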

Topics covered in Lecture 25 (10 Jan.'22)

  • Lagrangian dual program

  • Complementary slackness

  • Duality gap, and KKT conditions

  • Reading materials:

      • Chapter 5 from [BV04]

      • Chapter 5 from [V21]

Topics covered in Lecture 26 (10 Jan.'22)

  • Statement of the first-order characterization of convex functions

  • KKT conditions and strong duality of convex programs

  • Statement of hyperplane separation of convex sets

  • Statement of Slater's Theorem on the strong duality of convex programs

  • Reading materials:

      • Chapter 5 from [BV04]

      • Chapter 5 from [V21]

      • Chapter 2 from [BV04] (except Sections 2.4 and 2.6)

      • Sections 3.1 and 3.2 in Chapter 3 from [BV04]

Topics covered in Lecture 27 (14 Jan.'22)

  • Recall first-order characterization of convex functions

  • Statement of second-order characterization of convex functions

  • Proof of Slater's Theorem

  • Reading materials:

      • Chapter 5 from [BV04]

      • Chapter 5 from [V21]

Topics covered in Lecture 28 (15 Jan.'22)

  • Basic properties of convex functions

  • First-order and second-order characterizations of strictly and strongly convex functions

  • Characterization of optimal solutions to convex programs

  • Convex programs with affine constraint

  • Introduction to Descent Methods

  • Reading materials:

      • Sections 3.1 and 3.2 in Chapter 3 from [BV04]

      • Chapter 4 from [BV04]

      • Section 9.2 from [BV04]

  • Additional reading materials:

      • Chapter 9 from [BV04]

Topics covered in Lecture 29 (17 Jan.'22)

  • Introduction to Gradient Descent Method

  • Step size and stopping criteria

  • L-Lipschitz function and its properties

  • Projected Gradient Descent Method for constrained programs (see the sketch below)

  • Convergence rate of gradient descent method for L-Lipschitz objective functions

  • Reading materials:

      • Section 3.1 from [B15]

      • Sections 9.1, 9.2, and 9.3 from [BV04]

  • Additional reading materials:

      • Chapter 9 from [BV04]

      • Chapter 3 from [B15]
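
A sketch of the Projected Gradient Descent Method from this lecture; the example, names, and constant step size are illustrative (a step of at most 1/β matches the smooth analysis in [B15]).

    import numpy as np

    def projected_gradient_descent(grad, project, x0, step, iters=500):
        """Iterate  x_{k+1} = P_C(x_k - step * grad(x_k)),  where `project`
        is the Euclidean projection onto the feasible set C."""
        x = np.asarray(x0, dtype=float)
        for _ in range(iters):
            x = project(x - step * grad(x))
        return x

    # Example: minimize ||x - a||^2 over the unit ball; projection rescales.
    a = np.array([2.0, 2.0])
    grad = lambda x: 2 * (x - a)
    project = lambda x: x / max(1.0, np.linalg.norm(x))
    print(projected_gradient_descent(grad, project, np.zeros(2), step=0.25))
    # converges to a / ||a|| = (0.707..., 0.707...)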

Topics covered in Lecture 30 (19 Jan.'22)

  • β-smooth functions and their properties

  • α-strongly convex functions and their properties

  • Convergence rate of gradient descent method for β-smooth convex functions

  • Reading materials:

      • Section 3.2 from [B15]

      • Section 9.3 from [BV04]

  • Additional reading materials:

      • Chapter 9 from [BV04]

      • Chapter 3 from [B15]

Topics covered in Lecture 31 (20 Jan.'22)

  • Recall properties of β-smooth functions and α-strongly convex functions

  • Convergence rate of gradient descent method for β-smooth and α-strongly convex functions

  • Introduction to the Conditional Gradient Descent Method for constrained programs (see the sketch below)

  • Reading materials:

      • Sections 3.3 and 3.4 from [B15]

      • Section 9.3 from [BV04]

  • Additional reading materials:

      • Chapter 9 from [BV04]

      • Chapter 3 from [B15]
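
A sketch of the Conditional Gradient (Frank-Wolfe) Method introduced above, assuming numpy; the linear minimization oracle and the step size 2/(t+2) follow the treatment in Section 3.3 of [B15].

    import numpy as np

    def frank_wolfe(grad, lmo, x0, iters=200):
        """Conditional gradient descent: at step t, minimize the linearization
        over the feasible set via the linear minimization oracle `lmo`, then
        move toward that vertex. Projection-free, unlike projected GD."""
        x = np.asarray(x0, dtype=float)
        for t in range(iters):
            s = lmo(grad(x))                 # argmin over C of <grad(x), s>
            x += (2.0 / (t + 2)) * (s - x)
        return x

    # Example: minimize ||x - a||^2 over the probability simplex.
    a = np.array([0.2, 0.5, 0.3])
    grad = lambda x: 2 * (x - a)
    def lmo(g):
        s = np.zeros_like(g)
        s[np.argmin(g)] = 1.0                # best vertex of the simplex
        return s
    print(frank_wolfe(grad, lmo, np.ones(3) / 3))  # converges to a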

Topics covered in Lecture 32 (23 Jan.'22)

  • Further discussion on the convergence rate of gradient descent method for β-smooth and α-strongly convex functions

  • Convergence rate of conditional gradient descent method for constrained programs with β-smooth convex objective function

  • Reading materials:

      • Sections 3.3 and 3.4 from [B15]

      • Section 9.3 from [BV04]

  • Additional reading materials:

      • Chapter 9 from [BV04]

      • Chapter 3 from [B15]

Topics covered in Lecture 33 (23 Jan.'22)

  • Separating oracle for convex sets

  • Computational cost of implementing separating oracles for LPs

  • Ellipsoid algorithm and its convergence rate (see the sketch below)

  • Reading materials:

      • Section 2.2 from [B15]

  • Additional reading materials:

      • Section 7.1 from [MG07]

      • Chapter 8 from [BT97]

      • Section 5.3 from [LY08]
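
A sketch of the ellipsoid method for feasibility via a separating oracle, as discussed in this final lecture (the instance, the starting ball, and the iteration cap are illustrative; assumes numpy):

    import numpy as np

    def ellipsoid_feasibility(oracle, c, P, iters=200):
        """`oracle(x)` returns None if x is feasible, else a vector a with the
        feasible set contained in {y : a.y <= a.x}. Starting from the ellipsoid
        {x : (x - c)^T P^{-1} (x - c) <= 1}, each central cut shrinks the
        volume by a factor of about e^{-1/(2(n+1))}."""
        n = len(c)
        for _ in range(iters):
            a = oracle(c)
            if a is None:
                return c                                   # feasible center found
            Pa = P @ a
            c = c - Pa / ((n + 1) * np.sqrt(a @ Pa))       # move into the kept half
            P = (n * n / (n * n - 1.0)) * (P - (2.0 / (n + 1)) * np.outer(Pa, Pa) / (a @ Pa))
        return None

    # Example: find a point with x1 + x2 >= 1 inside the box [0, 2]^2.
    def oracle(x):
        if x[0] + x[1] < 1: return np.array([-1.0, -1.0])
        if x[0] < 0:        return np.array([-1.0, 0.0])
        if x[1] < 0:        return np.array([0.0, -1.0])
        if x[0] > 2:        return np.array([1.0, 0.0])
        if x[1] > 2:        return np.array([0.0, 1.0])
        return None

    print(ellipsoid_feasibility(oracle, c=np.zeros(2), P=25.0 * np.eye(2)))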