Master Class

Bridging the Gap between

Machine Learning and Optimization

June 20, 2022

Chairs: Adam Elmachtoub and Elias Khalil

List of confirmed speakers:

Facebook AI Research

Talk title: Differentiable optimization-based modeling for machine learning

Talk abstract: This talk tours the foundations and applications of optimization-based models for machine learning. Optimization is a widely used modeling paradigm for solving non-trivial reasoning operations, and it brings precise domain-specific modeling priors into end-to-end machine learning pipelines that are otherwise typically large parameterized black-box functions. We will discuss how to integrate optimization as a differentiable layer, starting simple with constrained, continuous, convex problems in Euclidean spaces. We will then move on to active research topics that expand beyond these core components into non-convex, non-Euclidean, discrete, and combinatorial spaces. Throughout, we will consider applications in control, reinforcement learning, and vision.
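As a rough illustration of what "optimization as a differentiable layer" can mean in the simplest constrained convex case, the sketch below is an assumption for illustration and not material from the talk. It uses a toy layer that projects an input vector onto the hyperplane of vectors summing to one, for which the KKT conditions give a closed-form solution and Jacobian. The function names are hypothetical, and the Jacobian is checked against finite differences.

```python
def project_onto_sum_one(theta):
    """Toy layer: solve min_x 0.5 * ||x - theta||^2 subject to sum(x) == 1."""
    # From the KKT conditions, x = theta - nu * 1 with nu = (sum(theta) - 1) / n.
    shift = (sum(theta) - 1.0) / len(theta)
    return [t - shift for t in theta]


def analytic_jacobian(n):
    """Jacobian dx/dtheta of the layer above: identity minus (1/n) * ones."""
    return [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
            for i in range(n)]


def finite_difference_jacobian(theta, eps=1e-6):
    """Numerical Jacobian, used only to sanity-check the analytic one."""
    n = len(theta)
    base = project_onto_sum_one(theta)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        bumped = list(theta)
        bumped[j] += eps
        out = project_onto_sum_one(bumped)
        for i in range(n):
            jac[i][j] = (out[i] - base[i]) / eps
    return jac


theta = [2.0, 1.0, 3.0]
x_star = project_onto_sum_one(theta)
jac_exact = analytic_jacobian(len(theta))
jac_numeric = finite_difference_jacobian(theta)
print(x_star)
print(jac_exact)
```

In a learning pipeline, a Jacobian like this is what lets gradients of a downstream loss flow back through the solution of the optimization problem to the parameters that produced it.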

Bio: Brandon Amos is a research scientist at Facebook AI Research in New York City with research interests in machine learning and optimization, recently with a focus on reinforcement learning, control, optimal transport, and geometry.

Carnegie Mellon University

Talk title: Optimization-in-the-loop AI for energy and climate

Talk abstract: Addressing climate change will require concerted action across society, including the development of innovative technologies. While methods from artificial intelligence (AI) and machine learning (ML) have the potential to play an important role, these methods often struggle to contend with the physics, hard constraints, and complex decision-making processes that are inherent to many climate and energy problems. To address these limitations, I present the framework of "optimization-in-the-loop AI," and show how it can enable the design of AI models that explicitly capture relevant constraints and decision-making processes. For instance, this framework can be used to design learning-based controllers that provably enforce the stability criteria or operational constraints associated with the systems in which they operate. It can also enable the design of task-based learning procedures that are cognizant of the downstream decision-making processes for which a model’s outputs will be used. By significantly improving performance and preventing critical failures, such techniques can unlock the potential of AI and ML for operating low-carbon power grids, improving energy efficiency in buildings, and addressing other high-impact problems of relevance to climate action.

Bio: Priya Donti is a Ph.D. student in Computer Science and Public Policy at Carnegie Mellon University. She is also a co-founder and chair of Climate Change AI, an initiative to catalyze impactful work in climate change and machine learning. Her work focuses on machine learning for forecasting, optimization, and control in high-renewables power grids. Specifically, her research explores methods to incorporate the physics and hard constraints associated with electric power systems into deep learning models. Priya is a recipient of the MIT Technology Review Innovators Under 35 award, the Siebel Scholarship, the U.S. Department of Energy Computational Science Graduate Fellowship, and best paper awards at ICML (honorable mention), ACM e-Energy (runner-up), PECI, the Duke Energy Data Analytics Symposium, and the NeurIPS workshop on AI for Social Good.

University of California, Berkeley

Talk title: Learning, Optimization, and Generalization in the Predict-then-Optimize Setting

Talk abstract: In the predict-then-optimize setting, the parameters of an optimization task are predicted based on contextual features, and it is desirable to leverage the structure of the underlying optimization task when training a machine learning model. A natural loss function in this setting is based on the cost of the decisions induced by the predicted parameters, in contrast to standard measures of prediction error. While directly optimizing this loss function is computationally challenging, we propose the use of a novel convex surrogate loss function, which we prove is statistically consistent under mild conditions. We also provide an assortment of novel generalization bounds, including bounds based on a combinatorial complexity measure and substantially improved bounds under an additional strong convexity assumption. Finally, we discuss extensions and opportunities for further developing new results and methodologies in the predict-then-optimize setting. This talk is based on joint work with Othman El Balghiti, Adam Elmachtoub, Ambuj Tewari, and Heyuan Liu.
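To make the contrast between decision cost and prediction error concrete, here is a minimal sketch under simplifying assumptions that are not taken from the talk: the feasible set is a small enumerated list of decisions, the objective is linear in the cost vector, and all function names and numbers are hypothetical. It does not implement the convex surrogate loss mentioned in the abstract.

```python
def total_cost(costs, decision):
    """Linear objective: cost vector dotted with a 0/1 decision vector."""
    return sum(c * x for c, x in zip(costs, decision))


def best_decision(costs, feasible):
    """Pick the feasible decision that minimizes the given costs."""
    return min(feasible, key=lambda w: total_cost(costs, w))


def decision_loss(true_costs, predicted_costs, feasible):
    """Excess true cost of acting on the prediction instead of the truth."""
    chosen = best_decision(predicted_costs, feasible)
    optimal = best_decision(true_costs, feasible)
    return total_cost(true_costs, chosen) - total_cost(true_costs, optimal)


def squared_error(true_costs, predicted_costs):
    """Standard prediction error, blind to the downstream decision."""
    return sum((t - p) ** 2 for t, p in zip(true_costs, predicted_costs))


# Toy problem: choose exactly one of three options.
feasible = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
true_costs = [1.0, 2.0, 5.0]
pred_a = [4.0, 4.5, 5.0]  # far from the truth, but ranks option 0 first
pred_b = [1.6, 1.5, 5.0]  # close to the truth, but ranks option 1 first

for name, pred in [("a", pred_a), ("b", pred_b)]:
    print(name, squared_error(true_costs, pred),
          decision_loss(true_costs, pred, feasible))
```

In this toy instance, prediction b has the smaller squared error yet induces the worse decision, which is the kind of misalignment that motivates training against decision cost rather than prediction error alone.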

Bio: Paul Grigas is an assistant professor of Industrial Engineering and Operations Research at the University of California, Berkeley. Paul’s research interests are in large-scale optimization, statistical machine learning, and data-driven decision making. He is also broadly interested in the applications of data analytics, and he has worked on applications in online advertising. Paul’s research is funded by the National Science Foundation including an NSF CRII Award. Paul was awarded 1st place in the 2020 INFORMS Junior Faculty Interest Group (JFIG) Paper Competition, the 2015 INFORMS Optimization Society Student Paper Prize, and an NSF Graduate Research Fellowship. He received his B.S. in Operations Research and Information Engineering (ORIE) from Cornell University in 2011, and his Ph.D. in Operations Research from MIT in 2016.

KU Leuven

Talk title: Perception- and Preference-based Constraint Solving

Talk abstract: Industry and society are increasingly automating processes, which requires solving constrained optimisation problems. Increasingly, part of the problem specification is only implicitly defined and needs to be inferred from data.

In this talk, we focus on two settings. In the first, part of the input is provided as images of objects, which requires integrating perception and object detection with constraint solving. In the second, the utility function (the preference function) needs to be inferred from user interaction, with applications in vehicle routing and scheduling. What both settings have in common is that the constrained optimisation is turned into a joint inference problem over the predictions. This requires a successful integration of machine learning and constrained optimisation, which raises challenges on the learning side, on the solving side, and in their hybridisation.

We highlight recent solutions from our lab and the wider research field. This hybridisation has the potential to capture both the problem structure and more subjective aspects such as human preferences and changing environments. The ultimate goal is to make constrained optimisation techniques more intelligent and human-aware.

Bio: Tias Guns is Associate Professor at the DTAI lab of KU Leuven, in Belgium. His research is at the intersection of machine learning and combinatorial optimisation. Constraint solving is a key technology for solving scheduling, planning and configuration problems across all industries.

Tias' expertise is in the hybridisation of machine learning systems with constraint solving systems, more specifically in building constraint solving systems that reason on both explicit knowledge and knowledge learned from data. Examples include learning the preferences of planners in vehicle routing and solving new routing problems that take both operational constraints and learned human preferences into account; or building energy price predictors specifically for energy-aware scheduling, and planning maintenance crews based on expected failures.

Carnegie Mellon University and Harvard University

Talk title: Decision-focused Learning: Data-driven Decisions by Melding Optimization and Machine Learning

Talk abstract: Many applications of operations research and artificial intelligence span the pipeline from data, to predictive models, to decisions. These components are typically approached separately: a machine learning model is first trained via a loss function measuring accuracy, and its predictions are then fed into an optimization algorithm that produces a decision. However, for difficult learning problems, all predictive models are imperfect, and the choice of loss implicitly trades off different errors. Standard losses are often misaligned with the end goal: to make the best decisions possible. This talk introduces a framework for decision-focused learning, where a predictive model is directly trained to induce good decisions via the optimization algorithm, removing the intermediate loss function. We introduce techniques to formulate differentiable proxies for combinatorial optimization problems, which can be integrated into the training loop for a machine learning model. These proxies include both known continuous relaxations and methods for automatically learning surrogate relaxations that can be efficiently solved and differentiated through. Training with these differentiable solvers focuses ML models on making predictions that induce good downstream decisions.

Bio: Bryan Wilder is a Schmidt Science Fellow at Carnegie Mellon University and the Harvard School of Public Health. In Fall 2022, he will join CMU as an Assistant Professor in the Machine Learning Department. His research focuses on the intersection of optimization, machine learning, and social networks, motivated by applications to public health. His work has received or been nominated for best paper awards at ICML and AAMAS, and received second place in the INFORMS Doing Good with Good OR competition.

UC Berkeley and University of Southern California

Talk title: Robustness and Causal Inference for Decisions: unobserved confounders, and numerical evaluation of debiasing adjustments

Talk abstract: We discuss two different ways of using robustness in causal inference for data-driven decision-making: first, global sensitivity analysis via optimization to robustify estimators against unobserved confounders, and second, using local perturbations of statistical functionals (influence functions) to "automatically" derive Gateaux derivative adjustments.

In the first part of the talk, we discuss learning personalized decision policies from observational data while accounting for possible unobserved confounding. Because policy value and regret may not be point-identifiable, we study a method that minimizes the worst-case estimated regret of a candidate policy against a baseline policy over an uncertainty set for propensity weights that controls the extent of unobserved confounding. We prove generalization guarantees that ensure safety and develop efficient algorithmic solutions to compute this minimax-optimal policy. We consider a case study on personalizing hormone replacement therapy based on observational data, in which we validate our results on a randomized experiment. We also discuss extensions to more complicated decision-making settings such as infinite-horizon reinforcement learning.

In the second part of the talk, we build on recent work proposing numerical differentiation for evaluating Gateaux derivatives for causal inference, hence connecting interpretations of influence functions as qualitative sensitivity analysis to their role in deriving bias adjustments (i.e., doubly robust/orthogonalized estimators). We focus on the case where probability distributions are not known a priori but must also be estimated from data, leading to empirical Gateaux derivatives, and study relationships between empirical, numerical, and analytical Gateaux derivatives. Starting with a case study of counterfactual mean estimation, we verify the exact relationship between finite differences and the analytical Gateaux derivative. We then derive requirements on the rates of numerical approximation in perturbation and smoothing that preserve statistical benefits of the resulting estimation procedure via one-step adjustment, such as rate-double-robustness. We study more complicated functionals such as dynamic treatment regimes and the linear-programming formulation for policy optimization in infinite-horizon Markov decision processes. A common theme of both parts is robustness: in the first part, obtaining substantive decision-theoretic guarantees, and in the second part, obtaining improved bias-robust estimation via local perturbation.
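For readers unfamiliar with Gateaux derivatives of statistical functionals, the sketch below shows the idea on a deliberately simplified case that is an assumption for illustration and not the counterfactual mean studied in the talk: the plain mean of an empirical distribution, perturbed toward a point mass at x. Because the mean is linear in the mixing weight, the finite-difference quotient matches the analytical derivative, x minus the mean, for any step size. All function names are hypothetical.

```python
def empirical_mean(values):
    """Mean of the empirical distribution P_n over the observed values."""
    return sum(values) / len(values)


def perturbed_mean(values, x, eps):
    """Mean under the mixture (1 - eps) * P_n + eps * (point mass at x)."""
    return (1.0 - eps) * empirical_mean(values) + eps * x


def numerical_gateaux(values, x, eps):
    """Finite-difference approximation of the derivative in direction x."""
    return (perturbed_mean(values, x, eps) - perturbed_mean(values, x, 0.0)) / eps


def analytical_gateaux(values, x):
    """Influence function of the mean evaluated at x."""
    return x - empirical_mean(values)


values = [1.0, 2.0, 3.0, 6.0]
x = 5.0
for eps in (0.5, 0.25, 0.01):
    print(eps, numerical_gateaux(values, x, eps), analytical_gateaux(values, x))
```

For nonlinear functionals, the finite-difference quotient agrees with the analytical derivative only approximately, which is where requirements on the perturbation size become relevant.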

Bio: Angela Zhou is an Assistant Professor of Data Sciences and Operations at the Marshall School of Business, University of Southern California. Previously, she was a Foundations of Data Science postdoc at UC Berkeley and a research fellow at the Simons Institute. She obtained her PhD from Cornell/Cornell Tech in Operations Research and Information Engineering. She works at the intersection of statistical machine learning and operations research in order to inform reliable data-driven decision-making in view of fundamental practical challenges that arise from realistic information environments. Her PhD research developed robust causal inference for decision-making and credible performance evaluation for algorithmic fairness and disparity assessment.