Prerequisites: Basic Linear Algebra and Basic Programming Skills (MATH 415 and ECE 190)
Instructor: Prof. R. Srikant, rsrikant@illinois.edu
Prof. Srikant's Office Hours: 2:15-3:30 pm Tue, 107 CSL
TAs: Khaled Alshehri (kalsheh2@illinois.edu) and Philip Pare (philpare@illinois.edu)
Khaled's Office Hours: 4:30-5:30 pm Wed, 4034 ECEB
Philip's Office Hours: 1:30-2:30 pm Wed, 3036 ECEB
Lectures: 11-12:20 TuTh in Room 2017 ECEB
Website: https://sites.google.com/site/ece490spring2017/
Spring Break: March 18-March 26
Last day of instruction: May 3
Final exam schedule: http://registrar.illinois.edu/spring2017schedulingguidelinespublic
Grading:
Topics (as time permits):
· Applications: Least squares (example: stock market), least squares with regularization (example: gene regulatory networks), SVM (example: image and text classification), neural networks (example: image classification, self-driving cars, speech recognition), power allocation in wireless networks, Internet congestion control, the Netflix recommendation problem
· Inf vs. min, sup vs. max, global vs. local optimality, existence of optimal solutions and the Weierstrass extreme value theorem, convex sets, convex and concave functions, global and local maxima for convex functions, uniqueness for strictly convex functions, tests for convex/concave and strongly convex/concave functions using the Hessian matrix, gradient-based characterization of convex and concave functions
· First-order necessary and second-order sufficient conditions for a local minimum
· Gradient descent, the descent lemma, convergence analysis for a fixed step size for functions with Lipschitz-continuous derivatives, for convex functions, and for strongly convex functions; the role of the condition number and a quadratic objective example (least squares is a special case); variable step sizes, Armijo's rule and its convergence; applications to neural networks and the backpropagation algorithm
· Newton's method and its locally improved convergence rate; problems with implementing Newton's method and some fixes
· Constrained optimization, projected gradient descent, and its convergence for a fixed step size
· Constrained optimization with equality and inequality constraints; Lagrange multipliers, the KKT necessary conditions and the KKT sufficient conditions; application to power control in wireless networks and the water-filling solution; sensitivity and Lagrange multipliers
· Penalty and barrier function methods, and their convergence
· Duality: the primal and dual problems; strong and weak duality and the Slater conditions; minimax characterization; linear programming and the economic interpretation of the dual; applications of strong duality to SVM
· Subgradients, subgradient descent and its convergence; application of subgradient descent to the dual problem; decomposition and its application to Internet congestion control
· Conjugate gradient method to avoid inverting the Hessian
· Proximal gradient descent and application to sparse regression (leads to an algorithm called ISTA for the regularized least-squares problem)
· Coordinate descent for a differentiable convex function plus a convex function, and its convergence; application to LASSO
· Augmented Lagrangian methods (aka the method of multipliers), ADMM
· Sequential Quadratic Programming
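Since gradient descent with a fixed step size on a least-squares objective is one of the course's running examples, a minimal sketch may help fix ideas. This is an illustration only, not course code; the names (gradient_descent, A, b) and the specific data are made up for the example.

```python
# Sketch: gradient descent with a fixed step size on the least-squares
# objective f(x) = (1/2)||Ax - b||^2, whose gradient is A^T (Ax - b).
import numpy as np

def gradient_descent(A, b, step, iters):
    """Iterate x_{k+1} = x_k - step * A^T (A x_k - b)."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x - step * A.T @ (A @ x - b)
    return x

# The descent lemma guarantees convergence for step < 2/L, where
# L = ||A^T A||_2 is the Lipschitz constant of the gradient.
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 2.0])
L = np.linalg.norm(A.T @ A, 2)
x_hat = gradient_descent(A, b, step=1.0 / L, iters=500)
```

Because this A has full column rank, the iterates converge to the unique least-squares solution; the condition number of A^T A governs how fast.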
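Projected gradient descent, covered above for constrained problems, only adds a projection step after each gradient step. A toy sketch on the nonnegative orthant, where the projection is just componentwise clipping (all names and data here are illustrative, not from the syllabus):

```python
# Sketch: projected gradient descent for min (1/2)||x - c||^2 s.t. x >= 0.
import numpy as np

def projected_gradient_descent(c, step, iters):
    x = np.zeros_like(c)
    for _ in range(iters):
        x = x - step * (x - c)    # gradient step for (1/2)||x - c||^2
        x = np.maximum(x, 0.0)    # project onto the nonnegative orthant
    return x

x_proj = projected_gradient_descent(np.array([1.0, -2.0, 3.0]),
                                    step=0.5, iters=100)
```

The limit keeps the nonnegative entries of c and zeros out the rest, which is exactly the Euclidean projection of c onto the feasible set.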
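The syllabus names ISTA for l1-regularized least squares, min (1/2)||Ax - b||^2 + lam*||x||_1; a compact sketch of the proximal gradient iteration follows. The helper names (soft_threshold, ista) and the example data are assumptions made for illustration.

```python
# Sketch of ISTA: a gradient step on the smooth least-squares term followed
# by the proximal operator of the l1 term (soft thresholding).
import numpy as np

def soft_threshold(v, t):
    """Prox of t*||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam, step, iters):
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                         # smooth-part gradient
        x = soft_threshold(x - step * grad, step * lam)  # prox step
    return x

# With A = I the problem decouples, so the answer is soft_threshold(b, lam):
# small entries of b are set exactly to zero, which is the sparsity effect.
x_sparse = ista(np.eye(3), np.array([3.0, 0.05, -2.0]),
                lam=0.1, step=1.0, iters=50)
```

The same code with step = 1/||A^T A||_2 applies to a general A; soft thresholding is what makes the output sparse, unlike plain gradient descent.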
