Discrete Optimization and Machine Learning
Sebastian Pokutta (WiSe 2025 / Seminar)
Machine learning methods and discrete optimization methods are both important tools in many real-world applications. In this course we will primarily study the interplay of integer programming and, more broadly, discrete optimization methods with machine learning.
The format of the course is research-oriented and requires active participation from the students. The course will be a mix of student presentations, discussions, and project work. In the first few weeks, in-class projects will be defined.
Prerequisites: Linear Algebra, Analysis, and Discrete Optimization (ADM I/ADM II)
Registration: Together with paper selection after first meeting (seminar outline)
Participation requirements:
Students are expected to individually
Write a report (at least 5 pages, in LaTeX) about the paper and their own contribution.
Send the report to Jannis Halbey
During the semester, present their plans for the final report and presentation in a short 5-minute presentation.
Give a final presentation of their findings (details will be discussed in the first meeting).
Paper/project selection:
Students choose one of the papers below to work on
Up to 2 students can work on the same paper
Assignment is first come, first served
Send paper choice to Jannis Halbey
For fairness reasons, selections are only accepted after 14.10.2025 12:00
Contribution:
Every student is expected to make a contribution to the given topic and not just reproduce the results. The aim is not to obtain groundbreaking new results, but to get an impression of scientific work. Moreover, contributing something yourself requires a very sound understanding of the subject. Contributions can be an extension of the original algorithm, a new theoretical proof, or a comparison with more recent research.
Examples of strong contributions:
COIL: A Deep Architecture for Column Generation
Identify the OptNet layer as a main bottleneck
Examine the Lagrangian dual framework from Ferdinando Fioretto et al. (Lagrangian Duality for Constrained Deep Learning) as an alternative solution
Implement the modification and evaluate it empirically
Sparse Adversarial and Interpretable Attack Framework
Identify the vanilla Frank-Wolfe algorithm as a potential point for improvement (see the first sketch after this list)
Examine the Blended Pairwise Conditional Gradient variant
Implement the modification and evaluate it empirically
PEP: Parameter Ensembles by Perturbation
Compare with state-of-the-art Deep Ensembles (see the second sketch after this list)
Examine advantages and disadvantages of both methods
Implement a combination of both approaches and evaluate it empirically
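
To give a flavor of what such a modification starts from, here is a minimal sketch of a vanilla Frank-Wolfe iteration, the building block that the Blended Pairwise Conditional Gradient variant improves upon. The l1-ball feasible region, the quadratic objective, and all parameter values are illustrative assumptions and are not taken from the listed paper.

import numpy as np

def frank_wolfe_l1_ball(grad, x0, radius=1.0, steps=100):
    """Vanilla Frank-Wolfe over an l1-ball of the given radius (illustrative sketch)."""
    x = x0.copy()
    for t in range(steps):
        g = grad(x)
        # Linear minimization oracle for the l1-ball: the best vertex is a
        # signed coordinate vertex aligned against the largest gradient entry.
        i = int(np.argmax(np.abs(g)))
        v = np.zeros_like(x)
        v[i] = -radius * np.sign(g[i])
        gamma = 2.0 / (t + 2.0)            # standard open-loop step size
        x = (1.0 - gamma) * x + gamma * v  # convex combination keeps x feasible
    return x

# Toy usage: minimize ||x - b||^2 over the unit l1-ball.
b = np.array([0.6, -0.2, 0.1])
x_opt = frank_wolfe_l1_ball(lambda x: 2.0 * (x - b), np.zeros(3))
print(x_opt)

Pairwise and blended variants replace the plain convex combination above with steps that also move weight away from previously selected "away" vertices, which is exactly the kind of change a contribution could implement and benchmark.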
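For the PEP example, a second minimal sketch shows a parameter-perturbation ensemble in PyTorch. The Gaussian noise scale, the number of ensemble members, and the softmax output are assumptions for illustration, not the paper's exact recipe.

import copy
import torch

def perturbation_ensemble_predict(model, x, num_members=10, sigma=0.01):
    """Average the predictions of copies of one trained model whose weights are
    perturbed with small Gaussian noise (illustrative PEP-style sketch; sigma and
    num_members are assumed values, not taken from the paper)."""
    preds = []
    for _ in range(num_members):
        member = copy.deepcopy(model)
        with torch.no_grad():
            for p in member.parameters():
                p.add_(sigma * torch.randn_like(p))  # perturb weights in place
        member.eval()
        with torch.no_grad():
            preds.append(torch.softmax(member(x), dim=-1))
    return torch.stack(preds).mean(dim=0)  # ensemble-averaged class probabilities

A deep ensemble, by contrast, trains several models independently from different initializations and averages their predictions in the same way, which is the baseline such a contribution would compare against.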
Timeline:
Seminar outline 13.10.2025 10:00 s.t. (online)
Open paper selection 14.10.2025 12:00 (mail to Jannis Halbey)
5-min presentations 18.11.2025 11:00 s.t. in room Mar 0.017
Report submission deadline 08.02.2026 23:59
Final presentations 09.02.2026 10:00 s.t. at Zuse Institute Berlin
Deep Learning / LLMs
Sun et al.
A Simple and Effective Pruning Approach for Large Language Models
https://arxiv.org/abs/2306.11695
Deep Learning / LLMs
Zimmer et al.
Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging
https://arxiv.org/abs/2306.16788
Deep Learning / LLMs
Zimmer et al.
PERP: Rethinking the Prune-Retrain Paradigm in the Era of LLMs
https://arxiv.org/abs/2312.15230
Deep Learning / LLMs
Jan Hendrik Kirchner, Yining Chen, Harri Edwards, Jan Leike, Nat McAleese, Yuri Burda
Prover-Verifier Games improve legibility of LLM outputs
https://arxiv.org/abs/2407.13692
Deep Learning / LLMs
Frantar et al.
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
https://arxiv.org/abs/2301.00774
Deep Learning / Neural Fields
Dupont, E., Kim, H., Eslami, S. M. A., Rezende, D., & Rosenbaum, D.
From data to functa: Your data point is a function and you can treat it like one
https://arxiv.org/abs/2201.12204
Diffusion Models / Sampling
Richter and Berner
Improved sampling via learned diffusions
https://arxiv.org/abs/2307.01198
Optimization
Okanovic, P., Kwasniewski, G., Labini, P. S., Besta, M., Vella, F., & Hoefler, T.
High Performance Unstructured SpMM Computation Using Tensor Cores
https://arxiv.org/abs/2408.11551
Optimization
David Applegate, Mateo Díaz, Oliver Hinder, Haihao Lu, Miles Lubin, Brendan O'Donoghue, Warren Schudy
PDLP: A Practical First-Order Method for Large-Scale Linear Programming
https://arxiv.org/abs/2501.07018
Optimization
Ye-Chao Liu, Jannis Halbey, Sebastian Pokutta, Sébastien Designolle
A Unified Toolbox for Multipartite Entanglement Certification
https://arxiv.org/abs/2507.17435
Optimization
Matteo Vandelli, Francesco Ferrari, Daniele Dragoni
Parallel splitting method for large-scale quadratic programs
https://arxiv.org/abs/2503.16977
Optimization
A. A. Vyguzov, F. S. Stonyakin
Adaptive Variant of Frank-Wolfe method for Relative Smooth Convex Optimization Problems
https://arxiv.org/abs/2405.12948
Boosting
Kasper Green Larsen, Martin Ritzert
Optimal Weak to Strong Learning