Discrete Optimization and Machine Learning
Ksenia Bestuzheva (SoSe 2024 / Seminar)
Machine learning and discrete optimization methods are both important tools in many real-world applications. In this course we will primarily study the interplay of machine learning with integer programming and, more broadly, discrete optimization methods.
The format of the course is strongly research-oriented and requires active participation from the students. The course will be a mix of student presentations, discussions, and project work. In-class projects will be defined in the first few weeks.
Course organization
Prerequisites: Linear Algebra, Analysis, and Discrete Optimization (ADM I/ADM II)
Registration: Together with paper selection after first meeting (seminar outline)
Participation requirements:
Students are expected to individually
Write a report (at least 5 pages, in LaTeX) about the paper and their own contribution.
Send their reports to Jannis Halbey.
During the semester, present their plans for the final report and presentation in a short 5-minute presentation.
Give a final presentation of their findings (details will be discussed in the first meeting).
Paper/project selection:
Students choose one of the papers below to work on
Up to 2 students can work on the same paper
Assignment is first come, first served
Send paper choice to Jannis Halbey
For fairness reasons, we only accept selections after 19.04.2024, 4:00 pm
Contribution:
Every student is expected to make a contribution to the given topic and not just reproduce the results. The aim is not to obtain groundbreaking new results, but to get an impression of scientific work. Moreover, contributing something yourself requires a very sound understanding of the subject. Contributions can be an extension of the original algorithm, a new theoretical proof, or a comparison with more recent research.
Examples of strong contributions:
COIL: A Deep Architecture for Column Generation
Identify the OptNet layer as a main bottleneck
Examine the Lagrangian dual framework from Ferdinando Fioretto et al., Lagrangian Duality for Constrained Deep Learning, as an alternative solution (see the sketch after this example)
Implement the modification and evaluate it empirically
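As a rough illustration of the Lagrangian dual idea, here is a minimal sketch in which a constraint enters the loss through a multiplier updated by dual ascent. The toy objective, constraint, and step sizes are our own assumptions, not code from the paper:

    import numpy as np

    # Toy constrained problem: minimize f(x) = ||x - target||^2
    # subject to g(x) = sum(x) - 1 <= 0, handled via the Lagrangian
    # term lam * g(x) whose multiplier is updated by dual ascent.
    target = np.array([0.8, 0.6])
    x = np.zeros(2)       # stands in for the model parameters
    lam = 0.0             # Lagrange multiplier for the constraint
    lr_x, lr_lam = 0.1, 0.5

    for step in range(200):
        g = x.sum() - 1.0                  # constraint value g(x)
        grad = 2 * (x - target) + lam      # gradient of f(x) + lam * g(x)
        x -= lr_x * grad                   # primal descent step
        lam = max(0.0, lam + lr_lam * g)   # dual ascent, keeping lam >= 0

    print(x, x.sum())  # approaches the constrained optimum, sum(x) ~ 1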
Sparse Adversarial and Interpretable Attack Framework
Identify the vanilla Frank-Wolfe algorithm as a potential point for improvement (see the sketch after this example)
Examine the Blended Pairwise Conditional Gradient variant
Implement the modification and evaluate it empirically
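For context, a minimal sketch of a vanilla Frank-Wolfe iteration over an l1 ball, a typical feasible region in sparse attacks. The objective, radius, and function names below are illustrative assumptions:

    import numpy as np

    def frank_wolfe_l1(grad_f, x0, radius=1.0, steps=100):
        """Vanilla Frank-Wolfe over the l1 ball of the given radius.

        Each iteration solves the linear minimization oracle (LMO)
        over the l1 ball, which picks a signed coordinate vertex,
        then takes a convex combination step: iterates stay feasible
        and tend to be sparse, which is why FW suits sparse attacks.
        """
        x = x0.copy()
        for t in range(steps):
            g = grad_f(x)
            i = np.argmax(np.abs(g))         # LMO: best vertex of the l1 ball
            s = np.zeros_like(x)
            s[i] = -radius * np.sign(g[i])   # minimizes <g, s> over the ball
            gamma = 2.0 / (t + 2.0)          # standard open-loop step size
            x = (1 - gamma) * x + gamma * s  # convex combination, stays feasible
        return x

    # Toy usage: minimize ||x - b||^2 over the unit l1 ball
    b = np.array([0.5, -2.0, 0.1])
    x_star = frank_wolfe_l1(lambda x: 2 * (x - b), np.zeros(3))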
PEP: Parameter Ensembles by Perturbation
Compare with state-of-the-art Deep Ensembles (see the sketch after this example)
Examine advantages and disadvantages of both methods
Implement a combination of both approaches and evaluate it empirically
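To make the comparison concrete, a minimal sketch of the perturbation idea: a deep ensemble trains several networks from scratch, whereas PEP draws ensemble members by perturbing a single trained parameter vector. The model, noise scale, and helper names are our own assumptions, not the paper's code:

    import numpy as np

    rng = np.random.default_rng(0)

    def predict(weights, X):
        # Stand-in for a trained model's forward pass (here: logistic regression)
        return 1.0 / (1.0 + np.exp(-X @ weights))

    def pep_predict(trained_weights, X, n_members=10, sigma=0.01):
        """Parameter ensemble by perturbation: average the predictions of
        copies of one trained model whose weights are perturbed with
        Gaussian noise, instead of training n_members models from scratch."""
        preds = [
            predict(trained_weights + sigma * rng.standard_normal(trained_weights.shape), X)
            for _ in range(n_members)
        ]
        return np.mean(preds, axis=0)

    # Toy usage with pretend trained weights and inputs
    w = np.array([1.5, -0.7])
    X = rng.standard_normal((4, 2))
    print(pep_predict(w, X))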
Timeline:
All meetings take place in the MAR building at Marchstraße 23.
18.04.2024, 2:00 pm online: Seminar outline
19.04.2024, 4:00 pm: Paper selection opens (mail to Jannis Halbey)
16.05.2024, 2:00 pm in MAR 0.003: 5-minute presentations
17.07.2024, 11:59 pm: Report submission deadline
18.07.2024, 10:00 am in MAR 0.008: Final presentations
Reading material by topic:
Deep Learning / Interpretability
Been Kim, Martin Wattenberg, Justin Gilmer, Carrie Cai, James Wexler, Fernanda Viegas, Rory Sayres
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
https://arxiv.org/abs/1711.11279
Deep Learning / LLMs
Yuxin Zhang et al.
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
https://arxiv.org/abs/2310.08915
Deep Learning / LLMs
Tycho F. A. van der Ouderaa et al.
The LLM Surgeon
https://arxiv.org/abs/2312.17244
Deep Learning / LLMs
Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen
LoRA: Low-Rank Adaptation of Large Language Models
https://arxiv.org/abs/2106.09685
Deep Learning / LLMs
Elias Frantar, Dan Alistarh
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
https://arxiv.org/abs/2301.00774
Frank–Wolfe algorithm, extrapolation
Francis Bach
On the Effectiveness of Richardson Extrapolation in Machine Learning
https://arxiv.org/abs/2002.02835
Game Theory/Minmax Optimization
Ioannis Anagnostides, Fivos Kalogiannis, Ioannis Panageas, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Stephen McAleer
Algorithms and Complexity for Computing Nash Equilibria in Adversarial Team Games
https://arxiv.org/abs/2301.02129
Machine Learning
Tuan-Duy H. Nguyen, Ngoc Bui, Duy Nguyen, Man-Chung Yue, Viet Anh Nguyen
Robust Bayesian Recourse
https://arxiv.org/abs/2206.10833
Machine Learning / PDEs
Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar
Fourier Neural Operator for Parametric Partial Differential Equations
https://arxiv.org/abs/2010.08895
Optimization
Jaeyeon Kim, Chanwoo Park, Asuman Ozdaglar, Jelena Diakonikolas, Ernest K. Ryu
Mirror Duality in Convex Optimization
https://arxiv.org/abs/2311.17296
Optimization
Simon Lacoste-Julien, Martin Jaggi
On the Global Linear Convergence of Frank-Wolfe Optimization Variants
https://arxiv.org/abs/1511.05932
Optimization (Backpropagation)
Jérôme Bolte, Tam Le, Edouard Pauwels
Subgradient sampling for nonsmooth nonconvex minimization
https://arxiv.org/abs/2202.13744
Optimization (MIP)
Gregor Hendel
Adaptive large neighborhood search for mixed integer programming
https://link.springer.com/article/10.1007/s12532-021-00209-7
Optimization (MIP)
Timo Berthold
RENS - the optimal rounding
https://opus4.kobv.de/opus4-zib/frontdoor/index/index/docId/1520
Optimization
Michele Conforti, Gérard Cornuéjols, Aris Daniilidis, Claude Lemaréchal, Jérôme Malick
Cut-generating functions and S-free sets
https://www.andrew.cmu.edu/user/gc0v/webpub/2013-056.pdf
Optimization (MIP)
Tobias Achterberg, Thorsten Koch, Alexander Martin
Branching rules revisited
https://opus4.kobv.de/opus4-zib/files/788/ZR-04-13.pdf
Optimization (MIP)
Tobias Achterberg, Robert E. Bixby, Zonghao Gu, Edward Rothberg, Dieter Weninger
Presolve Reductions in Mixed Integer Programming
https://opus4.kobv.de/opus4-zib/files/6037/Presolve.pdf