CS 6301: Optimization in Machine Learning

Prof. Rishabh Iyer

CS Department UT Dallas

About the Course

Days and Time: Monday & Wednesday 4:00pm - 5:15pm

Location: AD 2.232

Office Hours: Wednesdays 3 pm - 3:50 pm (my office: ECSS 3.405), Monday 3 pm - 3:50 pm (by appointment)

In addition to this webpage, lecture slides, assignments, assignment solutions, and student projects will be posted at: https://github.com/rishabhk108/OptimizationML

Link to Coursebook

Continuous and discrete optimization are two important pillars of machine learning. Continuous optimization typically arises when learning model parameters, while discrete optimization appears in inference and in auxiliary tasks such as feature selection, data subset selection, and model compression.

In the first part of this course, we will cover continuous optimization with applications in machine learning. Topics include convexity, gradient methods, proximal algorithms, stochastic and online variants of these methods, coordinate descent methods, subgradient methods, non-convex optimization, Frank-Wolfe, accelerated methods, Lagrange and Fenchel duality, second-order methods, and quasi-Newton methods. We will ground all of these algorithms with applications and loss functions in machine learning, from logistic regression, linear regression, SVMs, and matrix factorization through deep neural networks.

The second part of this course will cover the fundamentals of discrete optimization. We will start with basic forms of combinatorial optimization (knapsack, s-t cuts/paths, matchings, and matroids) and then discuss submodular functions and their applications. We will cover algorithms for submodular minimization and maximization under various combinatorial constraints, and also discuss discrete probabilistic models such as Determinantal Point Processes (DPPs). We will ground these algorithmic and theoretical results in real-world applications such as feature selection, summarization and diversified search, structured prediction, data subset selection, and model compression.
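To give a flavor of the continuous-optimization half of the course, the sketch below runs plain gradient descent on an L2-regularized logistic-regression loss. It is an illustrative example rather than course material: the synthetic data, step size, and regularization strength are arbitrary choices made for the demo.

```python
# Minimal sketch: fixed-step gradient descent on regularized logistic regression.
# All data and hyperparameters below are illustrative, not from the course.
import numpy as np

def logistic_loss_and_grad(w, X, y, lam=0.1):
    """Average logistic loss with L2 regularization, and its gradient.
    X: (n, d) features, y: (n,) labels in {-1, +1}, w: (d,) weights."""
    z = y * (X @ w)
    loss = np.mean(np.log1p(np.exp(-z))) + 0.5 * lam * (w @ w)
    grad = -(X.T @ (y / (1.0 + np.exp(z)))) / len(y) + lam * w
    return loss, grad

def gradient_descent(X, y, step=0.1, iters=200):
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        _, g = logistic_loss_and_grad(w, X, y)
        w -= step * g          # fixed-step gradient update
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    y = np.sign(X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100))
    w = gradient_descent(X, y)
    print("final loss:", logistic_loss_and_grad(w, X, y)[0])
```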

Course Instructor: Rishabh Iyer

TA: Omeed Ashtiani

Course Syllabus (What will be covered)

Continuous Optimization

  • Lecture 1: Introduction and Course Overview
  • Lecture 2: Basics of Continuous Optimization
  • Lecture 3 & 4: Convexity and Convex Optimization
  • Lecture 5: Convexity and Convex Optimization (Continued)
  • Lecture 6 & 7: Gradient Descent and Family
  • Lecture 8: Accelerated Gradient Descent and Practical Aspects of Gradient Descent
  • Lecture 9: Proximal and Projected Gradient Descent
  • Lecture 10: Project Proposals by Students (No Lecture)
  • Lecture 11 & 12: Newton, Quasi-Newton and Second Order Methods
  • Lecture 13: Conditional Gradient Descent
  • Lecture Extra: Practical Implementational Aspects of Gradient Descent and Family (For Assignment 2)
  • Lecture 14: Coordinate Gradient Descent
  • Lecture 15: Stochastic Gradient Descent and Family
  • Lecture 16: SGD and Adaptive Gradient Methods
  • Lecture 17: Non-Convex Optimization
  • Lecture 18: Duality

Discrete Optimization

  • Lecture 19 & 20: Introduction to Submodular Optimization: Definitions and Examples
  • Lecture 21 & 22: Convex and Concave Aspects of Submodular Functions, Submodular Optimization: Minimization and Maximization, Applications

Evaluation

  • Assignments (50%)
  • Project (50%)

Assignments

Assignments will comprise a mix of practical and theoretical problems. Theoretical problems will be simple, e.g., proving that certain functions are convex or non-convex, computing gradients, etc. Practical problems will involve implementing optimization algorithms and loss functions in Python. Note that the practical assignments will follow a sequence (i.e., later assignments will build on implementations from earlier ones). A small example of the kind of practical task involved appears below.
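As a hedged illustration (not an actual assignment) of the kind of practical task described above, the snippet below checks a hand-derived gradient against a finite-difference approximation. The function f and its gradient are illustrative placeholders.

```python
# Illustrative gradient check: compare an analytic gradient with central differences.
import numpy as np

def f(w):
    return 0.5 * (w @ w) + np.sum(np.sin(w))

def grad_f(w):
    # hand-derived gradient of f
    return w + np.cos(w)

def numerical_grad(fun, w, eps=1e-6):
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (fun(w + e) - fun(w - e)) / (2 * eps)
    return g

w = np.random.default_rng(1).normal(size=4)
print(np.allclose(grad_f(w), numerical_grad(f, w), atol=1e-4))  # expect True
```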

Project

This course will also have a mini-project. The mini-project could involve a practical component, i.e., picking a real-world problem, modeling it, optimizing the loss function with different algorithms, and drawing insights from their performance. Alternatively, the mini-project could be a survey of a particular class of optimization algorithms and theoretical results (for example, convergence of SGD- or GD-style algorithms for certain non-convex problems).

Lecture Slides and Assignments

Lecture Slides and Assignments will be posted on the GitHub page: https://github.com/rishabhk108/OptimizationML

Links to related courses

Courses on Continuous Optimization

Courses on Combinatorial/Discrete Optimization

Material (Textbooks)