Introduction to Optimization

Prerequisites: Basic Linear Algebra and Basic Programming Skills (MATH 415 and ECE 190)
Instructor: Prof. R. Srikant
Prof. Srikant's Office Hours: 2:15-3:30 pm Tue, 107 CSL.
TAs: Khaled Alshehri and Philip Pare
Khaled's Office Hours: 4:30-5:30 pm Wed, 4034 ECEB
Philip's Office Hours: 1:30-2:30 pm Wed, 3036 ECEB
Lectures: 11-12:20 TuTh in Room 2017 ECEB
Spring Break: March 18-March 26
Last day of instruction: May 3
Grading:

  • Weekly Quizzes: 10% (Quizzes will be held every Thursday in class. Occasionally a quiz may be canceled, in which case you will be notified ahead of time by email. Each homework will be posted with solutions, so you do not have to turn in the homework; each quiz will be based on the latest homework posted on the course website. If you miss a quiz without a valid reason, you will receive zero points. If you miss a quiz for a valid reason, the average of your remaining quizzes will be used as the score for the missed quiz; no make-up quizzes will be given. Examples of valid reasons include illness, job interviews, and travel to conferences to present research papers. In all such cases, you must provide proof that you missed the quiz for a valid reason.)
  • Programming assignments: 20% (All programming assignments must be done in Python. You can use any version of Python, though I prefer Anaconda.)
  • Two Midterm Exams: 20% each (Tuesday Feb. 28, 7:00-8:30 pm and Monday April 17, 7:00-8:30 pm, Rooms 1013, 1015 ECEB)
    •  Midterm Exam I: You are allowed to use notes handwritten on one 8.5"x11" sheet of paper (you can write on the front and back of the paper). Topics, Solutions
    •  Midterm Exam II: You are allowed to use notes handwritten on one 8.5"x11" sheet of paper (you can write on the front and back of the paper). Topics, Solutions
  • Final Exam: 30%, May 11, 7:00-10:00 pm, Rooms 2017 and 3015 ECEB. You are allowed to use notes handwritten on three 8.5"x11" sheets of paper (you can write on the front and back of each sheet). Topics
  • Grading will be done separately for graduate and undergraduate students. The programming assignments, quizzes, and exams may be slightly different for graduate and undergraduate students.

Topics (as time permits): 

·       Applications: Least squares (example: stock market), Least squares with regularization (example: gene regulatory networks), SVM (example: image, text classification), neural networks (example: image classification, self-driving cars, speech recognition), power allocation in wireless networks, Internet congestion control, Netflix recommendation problem
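
As a quick illustration of the first application (the data below is made up, not course material), a least-squares problem min ||Ax - b||^2 can be solved via the normal equations:

```python
import numpy as np

# Least-squares sketch: solve min ||Ax - b||^2 via the normal equations.
# Toy data; in practice np.linalg.lstsq is preferred for numerical stability.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))     # 50 observations, 3 features
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true                       # noiseless data, so x_true is recoverable

# Normal equations: (A^T A) x = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
```

With noiseless data the fit recovers x_true up to rounding; with noisy data, least squares returns the best linear fit in the squared-error sense.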

·       Inf vs min, sup vs max, global vs local optimality, existence of optimal solutions and the Weierstrass extreme value theorem, convex sets, convex and concave functions, global and local minima of convex functions, uniqueness of the minimizer for strictly convex functions, tests for convex/concave and strongly convex/concave functions using the Hessian matrix, gradient-based characterization of convex and concave functions

·       First-order necessary and second-order sufficient conditions for local minimum

·       Gradient descent, descent lemma, convergence analysis for a fixed step size for functions with Lipschitz-continuous derivatives, for convex functions and for strongly convex functions, the role of the condition number and a quadratic objective example (least squares is a special case); variable step sizes, Armijo’s rule and its convergence; applications to neural networks and the back propagation algorithm
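
A minimal sketch of gradient descent with the fixed step size 1/L on a quadratic objective (least squares, as noted above; the matrix and vector are illustrative):

```python
import numpy as np

# Gradient descent with fixed step 1/L on f(x) = 0.5*||Ax - b||^2.
# The gradient A^T(Ax - b) is Lipschitz with constant L = lambda_max(A^T A).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 0.0])
grad = lambda x: A.T @ (A @ x - b)
L = np.linalg.eigvalsh(A.T @ A).max()    # Lipschitz constant of the gradient

x = np.zeros(2)
for _ in range(500):
    x = x - (1.0 / L) * grad(x)          # fixed step size 1/L

x_star = np.linalg.solve(A, b)           # exact minimizer (A is invertible)
```

Because this quadratic is strongly convex, the error contracts by roughly a factor 1 - mu/L per iteration, where mu = lambda_min(A^T A); the ratio L/mu is the condition number.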

·       Newton’s method and improvement in convergence rate locally; problems with implementing Newton’s method and some fixes
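
A sketch of Newton's method on a small smooth convex function (chosen only for illustration); note that the Newton step solves a linear system with the Hessian rather than forming its inverse:

```python
import numpy as np

# Newton's method on f(x) = exp(x1) + exp(-x1) + 0.5*x2^2, minimized at (0, 0).
x = np.array([1.0, 2.0])
for _ in range(8):
    g = np.array([np.exp(x[0]) - np.exp(-x[0]), x[1]])    # gradient
    H = np.diag([np.exp(x[0]) + np.exp(-x[0]), 1.0])      # Hessian
    x = x - np.linalg.solve(H, g)                         # Newton step
```

Near the minimizer the error is roughly squared at every iteration, which is the local improvement in convergence rate over gradient descent; far from the minimizer the method can misbehave, motivating the fixes mentioned above.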

·       Constrained optimization, projected gradient descent, and its convergence for a fixed step size
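
Projected gradient descent takes only a few lines when the projection is cheap; in this illustrative sketch the feasible set is a box, whose Euclidean projection is coordinate-wise clipping:

```python
import numpy as np

# Projected gradient descent for min ||x - c||^2 subject to x in [0, 1]^3.
c = np.array([1.5, -0.3, 0.4])
proj = lambda x: np.clip(x, 0.0, 1.0)    # Euclidean projection onto the box
grad = lambda x: 2.0 * (x - c)           # gradient is Lipschitz with L = 2

x = np.zeros(3)
for _ in range(100):
    x = proj(x - 0.25 * grad(x))         # fixed step size < 1/L
```

The iterates converge to the projection of c onto the box: the infeasible coordinates of c are clipped to the boundary and the feasible one is kept.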

·       Constrained optimization with equality and inequality constraints; Lagrange multipliers, the KKT necessary conditions and the KKT sufficient conditions; application to power control in wireless networks and the waterfilling solution; sensitivity and Lagrange multipliers
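
The waterfilling solution follows directly from the KKT conditions; the sketch below (noise levels and power budget are made up) finds the water level by bisection on the total-power constraint:

```python
import numpy as np

# Waterfilling sketch: maximize sum_i log(1 + p_i / n_i) subject to
# sum_i p_i = P and p_i >= 0. The KKT conditions give p_i = max(0, w - n_i),
# where the water level w is set so the powers use the whole budget.
n = np.array([0.5, 1.0, 2.0])        # illustrative noise levels
P = 3.0                              # total power budget

lo, hi = 0.0, P + n.max()            # bracket for the water level
for _ in range(100):
    w = 0.5 * (lo + hi)
    if np.maximum(w - n, 0.0).sum() > P:
        hi = w
    else:
        lo = w
p = np.maximum(w - n, 0.0)           # optimal powers; very noisy channels
                                     # may receive zero power
```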

·       Penalty, barrier function methods, and their convergence

·       Duality: the primal and dual problems; strong and weak duality and the Slater conditions, minimax characterization, linear programming and the economic interpretation of the dual; applications of strong duality to SVM

·       Subgradients, subgradient descent and its convergence; application of subgradient descent to the dual problem; decomposition and its application to Internet congestion control
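
The subgradient method looks exactly like gradient descent but uses a subgradient and a diminishing step size; a one-dimensional sketch on a nondifferentiable function (chosen for illustration):

```python
import numpy as np

# Subgradient method for f(x) = |x - 3|, which is not differentiable at its
# minimizer x* = 3. sign(x - 3) is a subgradient (with 0 taken at x = 3).
x = 0.0
for k in range(1, 10001):
    g = np.sign(x - 3.0)             # a subgradient of f at x
    x = x - (1.0 / k) * g            # diminishing step size 1/k
```

Unlike gradient descent, the objective need not decrease at every step; convergence relies on the step sizes satisfying sum s_k = infinity and sum s_k^2 < infinity, which 1/k does.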

·       Conjugate gradient method to avoid inverting the Hessian
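
A sketch of the conjugate gradient method for min 0.5*x^T A x - b^T x with A symmetric positive definite (equivalently, solving Ax = b): it needs only matrix-vector products, and in exact arithmetic finishes in n steps:

```python
import numpy as np

# Conjugate gradient for Ax = b with A symmetric positive definite.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])

x = np.zeros(2)
r = b - A @ x                        # residual (negative gradient)
p = r.copy()                         # first search direction
for _ in range(2):                   # exact in n = 2 steps here
    Ap = A @ p
    alpha = (r @ r) / (p @ Ap)       # exact line search along p
    x = x + alpha * p
    r_new = r - alpha * Ap
    beta = (r_new @ r_new) / (r @ r)
    p = r_new + beta * p             # next direction, conjugate to the previous ones
    r = r_new
```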

·       Proximal gradient descent and application to sparse regression (leads to an algorithm called ISTA for the regularized least squares problem)
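
ISTA alternates a gradient step on the smooth least-squares term with the proximal operator of the l1 term, which is soft-thresholding; a sketch on synthetic sparse data (all numbers illustrative):

```python
import numpy as np

# ISTA for min 0.5*||Ax - b||^2 + lam*||x||_1 (regularized least squares).
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 10))
x_sparse = np.zeros(10)
x_sparse[[1, 4]] = [3.0, -2.0]       # only two nonzero coefficients
b = A @ x_sparse
lam = 0.01

soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)  # prox of t*||.||_1
L = np.linalg.eigvalsh(A.T @ A).max()

x = np.zeros(10)
for _ in range(1000):
    x = soft(x - (1.0 / L) * A.T @ (A @ x - b), lam / L)
```

The recovered x is sparse, concentrated on the two true coefficients.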

·       Coordinate descent for a differentiable convex plus a convex function, and its convergence. Application to LASSO
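
Coordinate descent for LASSO minimizes over one coordinate at a time; each one-dimensional subproblem has a closed-form soft-thresholding solution (the data below is synthetic):

```python
import numpy as np

# Coordinate descent for min 0.5*||Ax - b||^2 + lam*||x||_1 (LASSO).
rng = np.random.default_rng(2)
A = rng.standard_normal((40, 5))
x_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0])
b = A @ x_true
lam = 0.1

soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
col_sq = (A ** 2).sum(axis=0)        # ||A_j||^2 for each column
x = np.zeros(5)
for _ in range(100):                 # full sweeps over the coordinates
    for j in range(5):
        r = b - A @ x + A[:, j] * x[j]             # residual without coordinate j
        x[j] = soft(A[:, j] @ r, lam) / col_sq[j]  # exact 1-D minimization
```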

·       Augmented Lagrangian methods (aka method of multipliers), ADMM

·       Sequential Quadratic Programming

·       Nesterov’s accelerated gradient method and FISTA
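
A sketch of Nesterov's accelerated gradient method using the FISTA step-size sequence, shown in the smooth case with no proximal term (problem data is illustrative):

```python
import numpy as np

# Accelerated gradient descent on f(x) = 0.5*||Ax - b||^2: a gradient step
# from an extrapolated point y, with momentum weights from the FISTA t-sequence.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 0.0])
grad = lambda x: A.T @ (A @ x - b)
L = np.linalg.eigvalsh(A.T @ A).max()

x = np.zeros(2)
y = x.copy()
t = 1.0
for _ in range(1000):
    x_new = y - (1.0 / L) * grad(y)                  # gradient step at y
    t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
    y = x_new + ((t - 1.0) / t_new) * (x_new - x)    # momentum extrapolation
    x, t = x_new, t_new
```

This improves the O(1/k) rate of plain gradient descent to O(1/k^2) for smooth convex objectives; FISTA applies the same acceleration to the proximal gradient (ISTA) iteration.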