Discrete Optimization and Machine Learning
Sebastian Pokutta (SoSe 2025 / Seminar)
Machine Learning and Discrete Optimization methods are both important tools in many real-world applications. In this course we will primarily study the interplay of machine learning with integer programming and, more broadly, discrete optimization methods.
The format of the course is research-oriented and requires active participation from the students. The course will be a mix of student presentations, discussions, and project work. In-class projects will be defined in the first few weeks.
Prerequisites: Linear Algebra, Analysis, and Discrete Optimization (ADM I/ADM II)
Registration: together with the paper selection, after the first meeting (seminar outline)
Participation requirements:
Students are expected to individually
Write a report (minimum 5 pages, in LaTeX) about the paper, including their own contribution.
Send their report to Jannis Halbey
During the semester, present their plans for the final report and presentation in a short 5-minute talk.
Give a final presentation of their findings (details will be discussed in the first meeting).
Paper/project selection:
Students choose one of the papers below to work on
Up to 2 students can work on the same paper
Assignment is first come, first served
Send paper choice to Jannis Halbey
For fairness, we only accept selections after 23.04.2025, 12:00
Contribution:
Every student is expected to make a contribution to the given topic rather than just reproduce the results. The aim is not to obtain groundbreaking new results, but to get an impression of scientific work; moreover, contributing something yourself requires a very sound understanding of the subject. A contribution can be an extension of the original algorithm, a new theoretical proof, or a comparison with recent research.
Examples of strong contributions:
COIL: A Deep Architecture for Column Generation
Identify the OptNet layer as one main bottleneck
Examine the Lagrangian dual framework of Ferdinando Fioretto et al. (Lagrangian Duality for Constrained Deep Learning) as an alternative solution
Implement the modification and evaluate it empirically (see the sketch below)
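As a rough illustration of the last two steps, here is a minimal sketch of a Lagrangian-relaxation training step in the spirit of Fioretto et al.; the model, optimizer, constraint function g, and the single scalar multiplier are illustrative assumptions, not the paper's exact method.

```python
import torch

# Sketch of one Lagrangian-relaxation training step (after Fioretto et al.).
# Assumptions: a generic constraint g(x, y_pred) <= 0, a single scalar
# multiplier lam, and a squared loss; all names here are illustrative.
def lagrangian_step(model, optimizer, x, y, g, lam, lam_lr=0.01):
    y_pred = model(x)
    loss = torch.nn.functional.mse_loss(y_pred, y)
    # Hinge on the constraint: only the positive part of g violates g <= 0.
    violation = torch.relu(g(x, y_pred)).mean()
    optimizer.zero_grad()
    (loss + lam * violation).backward()  # primal step on the Lagrangian
    optimizer.step()
    # Dual ascent: increase the multiplier proportionally to the violation.
    return lam + lam_lr * violation.item()
```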
Sparse Adversarial and Interpretable Attack Framework
Identify the vanilla Frank-Wolfe algorithm as a potential point for improvement
Examine the Blended Pairwise Conditional Gradient variant
Implement the modification and evaluate it empirically (see the sketch below)
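To make the starting point concrete, here is a minimal sketch of the vanilla Frank-Wolfe loop over an ℓ1-ball; the objective, radius, and step-size rule are illustrative assumptions, and the proposed contribution would replace this plain update with the Blended Pairwise Conditional Gradient step.

```python
import numpy as np

def frank_wolfe_l1(grad_f, x0, radius=1.0, max_iter=100):
    # Vanilla Frank-Wolfe over the l1-ball of the given radius.
    x = x0.copy()
    for t in range(max_iter):
        g = grad_f(x)
        # Linear minimization oracle for the l1-ball: the minimizer of
        # <g, v> is a signed, scaled coordinate vector.
        i = np.argmax(np.abs(g))
        v = np.zeros_like(x)
        v[i] = -radius * np.sign(g[i])
        gamma = 2.0 / (t + 2)  # standard open-loop step size
        x = (1 - gamma) * x + gamma * v
    return x

# Toy usage: minimize ||x - b||^2 over the unit l1-ball.
b = np.array([0.3, -0.8, 0.1])
x_star = frank_wolfe_l1(lambda x: 2.0 * (x - b), np.zeros(3))
```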
PEP: Parameter Ensembles by Perturbation
Compare with state-of-the-art Deep Ensembles
Examine the advantages and disadvantages of both methods
Implement a combination of both approaches and evaluate it empirically (see the sketch below)
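For intuition, here is a minimal sketch of ensembling by parameter perturbation: predictions are averaged over Gaussian perturbations of a trained parameter vector. The function predict_fn and the parameters k and sigma are illustrative assumptions; the paper's actual perturbation scheme may differ.

```python
import numpy as np

def pep_predict(predict_fn, theta, x, k=10, sigma=0.01, seed=None):
    # Average predictions over k Gaussian perturbations of the trained
    # parameters theta; isotropic noise is an illustrative choice.
    rng = np.random.default_rng(seed)
    preds = [predict_fn(theta + sigma * rng.standard_normal(theta.shape), x)
             for _ in range(k)]
    return np.mean(preds, axis=0)

# Toy usage with a linear model whose prediction is theta @ x.
theta = np.array([1.0, -2.0])
x = np.array([0.5, 0.25])
y_hat = pep_predict(lambda th, inp: th @ inp, theta, x, k=100, sigma=0.1)
```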
Timeline:
Seminar outline 22.04.2025 9:00 s.t. (online)
Open paper selection 23.04.2025 12:00 (mail to Jannis Halbey)
5-min presentations 12.06.2025 12:00 s.t. in room H 0106
Report submission deadline 06.07.2025 23:59
Final presentations 07.07.2025 9:00 s.t. at Zuse Institute Berlin
Deep Learning / LLMs
Sun et al.
A Simple and Effective Pruning Approach for Large Language Models
https://arxiv.org/abs/2306.11695
Deep Learning / LLMs
Lu et al.
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
https://arxiv.org/abs/2408.06292
Deep Learning / LLMs
Zimmer et al.
PERP: Rethinking the Prune-Retrain Paradigm in the Era of LLMs
https://arxiv.org/abs/2312.15230
Deep Learning / Optimization
Aaron Defazio, Xingyu (Alice) Yang, Harsh Mehta, Konstantin Mishchenko, Ahmed Khaled, Ashok Cutkosky
The Road Less Scheduled
https://arxiv.org/abs/2405.15682
LLMs / Federated Learning
Sami Jaghouar, Jack Min Ong, Johannes Hagemann
OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training
https://arxiv.org/abs/2407.07852
LLMs / Type theory
Guoxiong Gao, Haocheng Ju, Jiedong Jiang, Zihan Qin, Bin Dong
A Semantic Search Engine for Mathlib4
https://arxiv.org/abs/2403.13310
Machine Learning
Sadiku, S., Wagner, M., Nagarajan, S. G., and Pokutta, S.
S-CFE: Simple Counterfactual Explanations
https://arxiv.org/abs/2410.15723
Machine Learning
Taco S. Cohen, Max Welling
Group Equivariant Convolutional Networks
https://arxiv.org/abs/1602.07576
Machine Learning
Mundinger, K., Zimmer, M., Kiem, A., Spiegel, C., Pokutta, S.
Neural Discovery in Mathematics: Do Machines Dream of Colored Planes?
https://arxiv.org/abs/2501.18527
Machine Learning
Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
https://arxiv.org/abs/2210.17323
Machine Learning
Larsen, K., Ritzert, M.
Optimal Weak to Strong Learning
https://arxiv.org/abs/2206.01563
Machine Learning
Yi-Jun Luo, Jin-Ming Liu, Chengjie Zhang
Detecting genuine multipartite entanglement via machine learning
https://arxiv.org/abs/2311.17548
Machine Learning
Marcel Kollovieh, Marten Lienen, David Lüdke, Leo Schwinn, Stephan Günnemann
Flow Matching with Gaussian Process Priors for Probabilistic Time Series Forecasting
https://arxiv.org/abs/2410.03024
Optimization
Yongzheng Dai, Chen Chen
Serial and Parallel Two-Column Probing for Mixed-Integer Programming
https://arxiv.org/abs/2408.16927
Optimization
David Applegate, Mateo Díaz, Oliver Hinder, Haihao Lu, Miles Lubin, Brendan O'Donoghue, Warren Schudy
PDLP: A Practical First-Order Method for Large-Scale Linear Programming
https://arxiv.org/abs/2501.07018
Optimization
Rui Chen, Oktay Gunluk, Andrea Lodi
Recovering Dantzig-Wolfe Bounds by Cutting Planes
https://arxiv.org/abs/2301.13149
Optimization
Fabian Pedregosa, Geoffrey Négiar, Armin Askari, Martin Jaggi
Linearly Convergent Frank-Wolfe with Backtracking Line-Search
https://arxiv.org/abs/1806.05123
Optimization
Le Phuc Thinh, Michele Dall'Arno, Valerio Scarani
Worst-case Quantum Hypothesis Testing with Separable Measurements
https://arxiv.org/abs/1910.10954