Nonlinear optimisation

July to Nov 2019

Important dates:

Quiz 1: Sep 3

Drop date: October 7

Quiz 2: October 15

Final: Nov 22

Mini-quizzes: As scheduled in class.

Schedule/Logistics:

Classes will be held in CS26 in the F-slot.

Class timings:

Tuesday: 4 -- 4:50 PM

Wednesday: 11 -- 11:50 AM

Thursday: 9 -- 9:50 AM

Friday: 8 -- 8:50 AM

Office hours:

Wednesday: 10 -- 11

Thursday: 10 -- 11

First day handout pdf.

Prerequisites:

1. Undergraduate multivariate calculus.

2. Linear algebra. (Students doing such a course in parallel are also welcome.)

3. Basic Numpy/python programming. (Students should be able to pick it up along the way. A tutorial on numpy/python can be arranged based on demand.)
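As a rough indicator of the expected programming level, a short sketch like the one below should be readable (the particular quadratic is illustrative, not course material):

```python
import numpy as np

# Evaluate f(x) = 0.5 * x^T A x - b^T x and its gradient A x - b
# for a small symmetric positive-definite A.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, 0.0])
x = np.zeros(2)

f = 0.5 * x @ A @ x - b @ x   # function value at x
grad = A @ x - b              # gradient at x
print(f, grad)
```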

The first few classes will mainly focus on a recap of the above topics; here are some links for further reading/watching/studying:

1a. https://ocw.mit.edu/courses/mathematics/18-01sc-single-variable-calculus-fall-2010/unit-2-applications-of-differentiation/ (Part A)

1b. https://ocw.mit.edu/courses/mathematics/18-02sc-multivariable-calculus-fall-2010/2.-partial-derivatives/ (Part A and B)

2a. https://youtu.be/kjBOesZCoqc (Essence of linear algebra, youtube series)

2b. https://youtu.be/ZK3O402wf1c (Gilbert Strang course on Linear algebra.)

Evaluation:

Final: 30

Quiz 1: 20

Quiz 2: 20

Mini-quiz: 20 (Best 4 out of 6)

Project: 10

Reference Books:

[BV] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004. Link.

[LY] David G. Luenberger and Yinyu Ye. Linear and Nonlinear Programming. 4th edition. Springer, 2015.

[NW] Jorge Nocedal and Stephen Wright. Numerical Optimization. Springer, 1999.

[N] Yu Nesterov. Introductory lectures on convex programming. 1999. Link.

[B] Sebastien Bubeck. Convex Optimization: Algorithms and Complexity. Link.


Other reference material:

NPTEL course on numerical optimisation by Prof. Shevade: https://youtu.be/biwjg9tpOvM

Convex optimisation course by Prof. Stephen Boyd: https://youtu.be/McLq1hEq3UY

Convex optimisation course by Moritz Hardt. https://ee227c.github.io/

Miscellaneous Notes:

The course EE5121 overlaps significantly with this course; students who have taken that course cannot credit this course.

Class details:

  1. July 30. Introduction to optimisation.
  2. July 31. Sets, functions, metric spaces, continuity and differentiability of single variable functions.
  3. Aug 1. Lines, planes, tangent-lines and planes. Linear approximations for multi-variable functions.
  4. Aug 2. Interpretations for gradient.
  5. Aug 6: Subspace, affine-space, cones, and convex sets.
  6. Aug 7: Examples. Hulls.
  7. Aug 8: Halfspace and vertex representations of polyhedra: cubes and octahedra.
  8. Aug 9: Mini-Quiz 1
  9. Aug 13: PSD cones, Second order cones, H and V representations of ellipsoids, Minkowski sum, Affine images and inverses.
  10. Aug 14: Ellipsoids contd., separating hyperplanes.
  11. Aug 20: Separating hyperplanes contd. Supporting hyperplanes and convex functions.
  12. Aug 21: First order characterisation of convex functions.
  13. Aug 22: Second order characterisation of convex functions. Examples.
  14. Aug 23: Addition, maximisation and composition of convex functions. Jensen's inequality.
  15. Aug 27: Mini-Quiz 2
  16. Aug 28: Sub-gradients and sub-differentials. Affine transformations of convex functions.
  17. Aug 29: Optimisation problems. Convexity and local minima.
  18. Aug 30: Equivalent optimisation problems.
  19. Sep 3: Quiz 1.
  20. Sep 4. Lipschitz continuity.
  21. Sep 5. The brute force algorithm for zero-order oracles and Lipschitz continuous objectives.
  22. Sep 6. Lipschitz smoothness.
  23. Sep 10. Linear Algebra recap. Spectral theorem. Gradient descent on Quadratics.
  24. Sep 12. Tutorial
  25. Sep 13. Mini-Quiz 3
  26. Sep 17. Gradient Descent for convex smooth functions.
  27. Sep 18. Gradient descent for smooth quadratics- error analysis.
  28. Sep 19. Dynamic step-size approaches. Full relaxation and Armijo-Goldstein approaches.
  29. Sep 20. Gradient descent for convex Lipschitz functions -- O(1/sqrt(T)) rate.
  30. Sep 24. Conjugate directions method.
  31. Sep 25. Conjugate gradient method.
  32. Sep 26. Convergence of conjugate gradient method.
  33. Sep 27. Mini-Quiz 4.
  34. Oct 1. Conjugate gradient descent, wrap up.
  35. Oct 3. Discussion/Demo/Tutorial
  36. Oct 4: Class cancelled by popular request.
  37. Oct 9: Newton and Quasi-Newton method
  38. Oct 10: Quasi-Newton methods contd.
  39. Oct 11: Tutorial.
  40. Oct 15: Quiz 2.
  41. Oct 16: Projected gradient descent
  42. Oct 17: Frank-Wolfe Algorithm.
  43. Oct 18: Constrained optimisation algorithms: wrap up
  44. Oct 22: Constrained optimisation theory
  45. Oct 23: KKT conditions
  46. Oct 24: KKT continued
  47. Oct 25: Programming assignment discussion
  48. Oct 29: Mini-quiz 5
  49. Oct 30: Lagrangian Duality
  50. Oct 31: Weak and Strong duality
  51. Nov 1: Dual ascent.
  52. Nov 5: Duality wrap-up.
  53. Nov 6: Semi-definite programs with applications. Approximate max-cut.
  54. Nov 7: CVXOPT usage.
  55. Nov 8: CVXOPT examples.
  56. Nov 13: Mini-quiz 6
  57. Nov 22: Final exam.

Material:

Disclaimer: the notes are not guaranteed to be complete or free of errors. Use at your own risk.

July 31: Notes on analysis, and single variable calculus.

Aug 1-2: Notes on multivariable calculus.

Aug 6-7: Notes on subspaces, affine spaces, cones and convex sets.

Aug 6 -14: Boyd and Vandenberghe: Chapter 2 (except section on generalised inequalities, dual cones and subsection on perspective functions)

Aug 21-28: Boyd and Vandenberghe: Chapter 3.1, 3.2. Wiki page for sub-gradients.

Aug 29 - Sep 6: Notes on solution concepts and Lipschitzness.
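For quick reference, the two notions covered in these notes, each stated for a constant L > 0:

```latex
\begin{aligned}
&|f(x) - f(y)| \le L\,\|x - y\| && \text{($L$-Lipschitz continuity)}\\
&\|\nabla f(x) - \nabla f(y)\| \le L\,\|x - y\| && \text{($L$-Lipschitz smoothness)}
\end{aligned}
```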

Sep 10 - Sep 19: Notes on gradient descent. Refer Bubeck for extra details.
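The notes themselves are not reproduced here, but a minimal numpy sketch of gradient descent with the constant step size 1/L, run on an illustrative quadratic (where L is the largest eigenvalue of the Hessian), looks like this:

```python
import numpy as np

# Minimise f(x) = 0.5 * x^T A x - b^T x by gradient descent.
# For this quadratic, grad f(x) = A x - b and the smoothness
# constant L is the largest eigenvalue of A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
L = np.linalg.eigvalsh(A).max()

x = np.zeros(2)
for _ in range(500):
    x = x - (1.0 / L) * (A @ x - b)  # constant step size 1/L

print(x)                      # approaches the minimiser
print(np.linalg.solve(A, b))  # exact minimiser A^{-1} b
```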

Sep 24 - Sep 26: Notes on conjugate gradient. Painless conjugate gradient: Reference. Refer Nocedal & Wright for extra details.
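A compact sketch of the standard conjugate gradient iteration for solving A x = b with A symmetric positive definite (the 2x2 system below is illustrative):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10):
    """Solve A x = b for symmetric positive-definite A.
    In exact arithmetic, CG terminates in at most n steps,
    where n is the dimension."""
    x = np.zeros_like(b)
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    rs = r @ r
    for _ in range(len(b)):
        Ap = A @ p
        alpha = rs / (p @ Ap)       # exact line search step
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p   # keep directions A-conjugate
        rs = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))  # matches np.linalg.solve(A, b)
```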

Oct 9-10: Notes on Newton and Quasi-Newton. Refer Nocedal and Wright for extra details.
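A one-line illustration of the Newton step x_{k+1} = x_k - H^{-1} grad f(x_k): on a quadratic the Hessian is constant, so a single step from any starting point lands exactly on the minimiser (the data below is illustrative):

```python
import numpy as np

# Newton step on f(x) = 0.5 x^T A x - b^T x, whose Hessian is A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])

x = np.array([10.0, -7.0])             # arbitrary starting point
x = x - np.linalg.solve(A, A @ x - b)  # one Newton step
print(x)                               # equals the minimiser A^{-1} b
```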

Oct 16-18: Notes on constrained optimisation algorithms. Refer Bubeck for details.
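A minimal sketch of projected gradient descent: take a gradient step, then project back onto the feasible set. For a box constraint the projection is coordinate-wise clipping (the objective and box below are illustrative):

```python
import numpy as np

# Minimise f(x) = ||x - c||^2 over the box [0, 1]^2, with c
# chosen outside the box so the constraint is active at the optimum.
c = np.array([2.0, -0.5])
lo, hi = 0.0, 1.0

x = np.array([0.5, 0.5])
for _ in range(100):
    grad = 2 * (x - c)
    x = np.clip(x - 0.1 * grad, lo, hi)  # gradient step + projection

print(x)  # the projection of c onto the box: (1.0, 0.0)
```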

Oct 22-24: Notes on KKT conditions. References: Nocedal & Wright, Boyd & Vandenberghe.
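As a quick reference for this stretch, the KKT conditions for minimising f(x) subject to g_i(x) <= 0 and h_j(x) = 0 (notation as in Boyd & Vandenberghe):

```latex
\begin{aligned}
&\nabla f(x^\star) + \sum_i \mu_i \nabla g_i(x^\star)
  + \sum_j \lambda_j \nabla h_j(x^\star) = 0 && \text{(stationarity)}\\
&g_i(x^\star) \le 0, \quad h_j(x^\star) = 0 && \text{(primal feasibility)}\\
&\mu_i \ge 0 && \text{(dual feasibility)}\\
&\mu_i \, g_i(x^\star) = 0 && \text{(complementary slackness)}
\end{aligned}
```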

Oct 30 - Nov 5: Notes on Lagrangian duality. References: Nocedal & Wright, Boyd & Vandenberghe.
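A small sketch of dual ascent on an illustrative equality-constrained quadratic: minimise 0.5 ||x||^2 subject to A x = b. The Lagrangian minimiser in x is x(lam) = -A^T lam, and the gradient of the dual is the constraint residual A x - b, so we ascend on it:

```python
import numpy as np

A = np.array([[1.0, 1.0]])   # one constraint: x1 + x2 = 1
b = np.array([1.0])

lam = np.zeros(1)
for _ in range(200):
    x = -A.T @ lam                 # argmin_x of the Lagrangian
    lam = lam + 0.5 * (A @ x - b)  # gradient ascent on the dual

print(x)  # the projection of 0 onto {x : A x = b}: (0.5, 0.5)
```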

Nov 6: SDP Relaxation for maxcut. CMU lecture notes.

Nov 7,8: Class notes on CVXOPT reduction. User guide to cone programs.

Worksheets:

Worksheet 1: Calculus.

Worksheet 2 is problems from Chapter 2 of Boyd and Vandenberghe. In particular: 2.5, 2.6, 2.7, 2.8, 2.9, 2.12, 2.14, 2.15, 2.16, 2.20

Worksheet 3 is problems from chapter 3 of Boyd and Vandenberghe, corresponding to sections 3.1 and 3.2. In particular: 3.1, 3.2, 3.3, 3.6, 3.7, 3.8, 3.11*, 3.12, 3.13, 3.14*, 3.15*, 3.16, 3.17*, 3.18*, 3.21, 3.24, 3.29 (assume f is convex), 3.32, 3.34*.

Worksheet 4: Solution concepts and Lipschitzness.

Worksheet 5: Gradient Descent

Worksheet 6: Conjugate gradient descent.

Worksheet 7: Newton and Quasi-Newton methods.

Worksheet 8: KKT conditions, Constrained optimisation algos

Worksheet 9: Lagrangian duality, CVXOPT reductions

Jupyter Notebooks:

Notebook Illustrating Linear algebra concepts and Gradient descent for one-dimensional functions.

Notebook illustrating convergence of Gradient descent for a 2D-linear regression loss.