About the Course
Days and Time: Monday & Wednesday 4:00pm - 5:15pm
Location: AD 2.232
Office Hours: Wednesdays 3:00 pm - 3:50 pm (my office: ECSS 3.405); Mondays 3:00 pm - 3:50 pm (by appointment)
In addition to this web page, the lecture slides, assignments, assignment solutions, and student projects will be posted at https://github.com/rishabhk108/OptimizationML
Continuous and Discrete Optimization are two important pillars of machine learning. Continuous optimization typically arises when learning model parameters, while discrete optimization appears in inference and in auxiliary tasks such as feature selection, data subset selection, and model compression.
The first part of this course covers continuous optimization with applications in machine learning. Topics include Convexity, Gradient Methods, Proximal Algorithms, Stochastic and Online Variants of these methods, Coordinate Descent Methods, Subgradient Methods, Non-Convex Optimization, Frank-Wolfe, Accelerated Methods, Lagrange and Fenchel Duality, Second-Order Methods, and Quasi-Newton Methods. We will ground all these algorithms with applications and loss functions from machine learning, starting from logistic regression, linear regression, SVMs, and matrix factorization, right up to deep neural networks.

The second part of the course covers the fundamentals of discrete optimization. We will start with basic forms of combinatorial optimization (knapsack, s-t cuts/paths, matchings, and matroids) and then discuss submodular functions and their applications. We will cover algorithms for submodular minimization and maximization under various combinatorial constraints, as well as discrete probabilistic models such as Determinantal Point Processes (DPPs). We will ground these algorithmic and theoretical results with real-world applications in feature selection, summarization and diversified search, structured prediction, data subset selection, and model compression.
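To give a flavor of how the continuous-optimization material gets grounded in machine-learning losses, here is a minimal sketch (not course material) of plain gradient descent on an L2-regularized logistic regression loss; the synthetic data, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

def logistic_loss_grad(w, X, y, lam=0.1):
    """L2-regularized logistic loss and its gradient.

    X: (n, d) features, y: (n,) labels in {-1, +1}, w: (d,) weights.
    """
    margins = y * (X @ w)
    loss = np.mean(np.log1p(np.exp(-margins))) + 0.5 * lam * w @ w
    # d/dw of log(1 + exp(-y x^T w)) is -y x / (1 + exp(y x^T w))
    grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y) + lam * w
    return loss, grad

def gradient_descent(X, y, step=0.5, iters=200):
    """Fixed-step gradient descent; returns the final iterate and its loss."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        _, grad = logistic_loss_grad(w, X, y)
        w -= step * grad
    return w, logistic_loss_grad(w, X, y)[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)          # synthetic data, for illustration only
    X = rng.normal(size=(200, 5))
    y = np.sign(X @ rng.normal(size=5) + 0.1 * rng.normal(size=200))
    w_hat, final_loss = gradient_descent(X, y)
    print("final regularized loss:", final_loss)
```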
Course Instructor: Rishabh Iyer
TA: Omeed Ashtiani
Course Syllabus (What will be covered)
Continuous Optimization
- Lecture 1: Introduction and Course Overview
- Lecture 2: Basics of Continuous Optimization
- Lecture 3 & 4: Convexity and Convex Optimization
- Lecture 5: Convexity and Convex Optimization (Continued)
- Lecture 6 & 7: Gradient Descent and Family
- Lecture 8: Accelerated Gradient Descent and Practical Aspects of Gradient Descent
- Lecture 9: Proximal and Projected Gradient Descent
- Lecture 10: Project Proposals by Students (No Lecture)
- Lecture 11 and 12: Newton, Quasi-Newton and Second-Order Methods
- Lecture 13: Conditional Gradient Descent
- Lecture Extra: Practical Implementation Aspects of Gradient Descent and Family (for Assignment 2)
- Lecture 14: Coordinate Gradient Descent
- Lecture 15: Stochastic Gradient Descent and Family
- Lecture 16: SGD and Adaptive Gradient Methods
- Lecture 17: Non-Convex Optimization
- Lecture 18: Duality
Discrete Optimization
- Lecture 19 & 20: Introduction to Submodular Optimization: Definitions and Examples
- Lecture 21 & 22: Convex and Concave Aspects of Submodular Functions, Submodular Optimization: Minimization and Maximization, Applications
Evaluation
- Assignments (50%)
- Project (50%)
Assignments
Assignments will comprise a mix of practical and theoretical problems. Theoretical problems will be simple, e.g., proving that certain functions are convex or non-convex, computing gradients, etc. Practical problems will involve implementing optimization algorithms and loss functions in Python. Note that the practical assignments will follow a sequence, i.e., future assignments will build on implementations from earlier ones.
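As a rough illustration of this practical/theoretical mix (the actual assignment interfaces are not specified here), a standard sanity check when implementing a loss and its analytic gradient in Python is a central finite-difference comparison:

```python
import numpy as np

def numerical_gradient(f, w, eps=1e-6):
    """Central finite-difference approximation of the gradient of f at w."""
    g = np.zeros_like(w)
    for i in range(w.size):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (f(w + e) - f(w - e)) / (2 * eps)
    return g

# Check the analytic gradient of a simple quadratic f(w) = 0.5 w^T A w - b^T w,
# whose gradient is A w - b when A is symmetric.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
A = A @ A.T                                   # make A symmetric (and PSD)
b = rng.normal(size=4)
f = lambda w: 0.5 * w @ A @ w - b @ w
analytic_grad = lambda w: A @ w - b

w0 = rng.normal(size=4)
print(np.allclose(analytic_grad(w0), numerical_gradient(f, w0), atol=1e-4))  # expect True
```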
Project
This course will also have a mini-project. The mini-project could involve a practical component, i.e., picking a real-world problem, modeling it, optimizing the loss function with different algorithms, and drawing insights about their performance. Alternatively, the mini-project could be a survey of a particular class of optimization algorithms and their theoretical results (for example, convergence of SGD or GD style algorithms for certain non-convex problems).
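As one hypothetical instance of the practical route (the synthetic problem, step sizes, and epoch counts below are assumptions for illustration), a project might compare full-batch gradient descent against stochastic gradient descent on the same objective and study how the loss decreases per pass over the data:

```python
import numpy as np

def loss(w, X, y):
    """Mean-squared-error objective 0.5/n * ||Xw - y||^2."""
    return 0.5 * np.mean((X @ w - y) ** 2)

def gd(X, y, step=0.1, epochs=50):
    """Full-batch gradient descent; records the loss after each epoch."""
    n, d = X.shape
    w, history = np.zeros(d), []
    for _ in range(epochs):
        w -= step * (X.T @ (X @ w - y) / n)
        history.append(loss(w, X, y))
    return w, history

def sgd(X, y, step=0.01, epochs=50, seed=0):
    """Single-sample SGD on the same objective, one shuffled pass per epoch."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, history = np.zeros(d), []
    for _ in range(epochs):
        for i in rng.permutation(n):
            w -= step * (X[i] @ w - y[i]) * X[i]
        history.append(loss(w, X, y))
    return w, history

if __name__ == "__main__":
    rng = np.random.default_rng(1)            # synthetic least-squares problem
    X = rng.normal(size=(500, 10))
    y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=500)
    print("GD final loss: ", gd(X, y)[1][-1])
    print("SGD final loss:", sgd(X, y)[1][-1])
```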
Lecture Slides and Assignments
Lecture Slides and Assignments will be posted on the GitHub page: https://github.com/rishabhk108/OptimizationML
Links to related courses
Courses on Continuous Optimization
- CMU 10-725
- Berkeley EE-227C
- EPFL: Optimization in ML (Prof. Martin Jaggi)
- Basic and Advanced Machine Learning at UBC (Prof. Mark Schmidt)
Courses on Combinatorial/Discrete Optimization
Material (Textbooks)
- Dive into Deep Learning, by Zhang, Lipton, Li and Smola
- Convex Optimization: Algorithms and Complexity, by SĂ©bastien Bubeck
- Convex Optimization, Stephen Boyd and Lieven Vandenberghe
- Introductory Lectures on Convex Optimization, Yurii Nesterov
- A Course in Combinatorial Optimization, Alexander Schrijver
- Learning with Submodular Functions: A Convex Optimization Perspective, Francis Bach
- Combinatorial Optimization: Polyhedra and Efficiency, Alexander Schrijver (Springer, 2003)
- Submodular Functions and Optimization, Satoru Fujishige (Elsevier, 2005)