Discrete Optimization and Machine Learning
Sebastian Pokutta (WiSe 2024 / Seminar)
Machine Learning and Discrete Optimization methods are both important tools in many real-world applications. In this course we will primarily study the interplay of integer programming and, more broadly, discrete optimization methods with machine learning.
The format of the course is research-oriented and requires active participation from the students. The course will be a mix of student presentations, discussions, and project work. In-class projects will be defined in the first few weeks.
Course organization
Prerequisites: Linear Algebra, Analysis, and Discrete Optimization (ADM I/ADM II)
Registration: Together with paper selection after the first meeting (seminar outline)
Participation requirements:
Students are expected to individually
Write a report (minimum 5 pages, in LaTeX) about the paper and their own contribution.
Send their report to Jannis Halbey.
During the semester, present their plans for the final report and presentation in a short 5-minute presentation.
Give a final presentation of their findings (details discussed in the first meeting).
Paper/project selection:
Students choose one of the papers below to work on
Up to 2 students can work on the same paper
Assignment is first come, first served
Send your paper choice to Jannis Halbey
For fairness reasons, we only accept selections after 18.10.2024, 04:00 pm
Contribution:
Every student is expected to make a contribution to their topic and not just reproduce the results. The aim is not to obtain groundbreaking new results, but to get an impression of scientific work. Moreover, contributing something yourself requires a very sound understanding of the subject. Contributions can be an extension of the original algorithm, a new theoretical proof, or a comparison with more recent research.
Examples of strong contributions:
COIL: A Deep Architecture for Column Generation
Identify the OptNet layer as a main bottleneck
Examine the Lagrangian Dual Framework from Ferdinando Fioretto et al. (Lagrangian Duality for Constrained Deep Learning) as an alternative solution
Implement the modification and evaluate it empirically
Sparse Adversarial and Interpretable Attack Framework
Identify the vanilla Frank-Wolfe algorithm as a potential point for improvement (a minimal Frank-Wolfe sketch follows this list)
Examine the Blended Pairwise Conditional Gradients variant
Implement the modification and evaluate it empirically
PEP: Parameter Ensembles by Perturbation
Compare with state-of-the-art Deep Ensembles
Examine the advantages and disadvantages of both methods
Implement a combination of both approaches and evaluate it empirically (see the second sketch below)
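For orientation, the second example's starting point can be made concrete. Below is a minimal sketch of the vanilla Frank-Wolfe algorithm, here minimizing a smooth function over the probability simplex; the function names, step-size rule, and toy objective are illustrative assumptions and not taken from the attack framework paper. Variants such as Blended Pairwise Conditional Gradients replace the plain step below with pairwise steps between vertices of the active set.

import numpy as np

def frank_wolfe(grad_f, x0, max_iter=100):
    # Vanilla Frank-Wolfe over the probability simplex (illustrative sketch).
    x = x0.copy()
    for t in range(max_iter):
        g = grad_f(x)
        # Linear minimization oracle (LMO): over the simplex, <g, v> is
        # minimized at the unit basis vector of the smallest gradient entry.
        v = np.zeros_like(x)
        v[np.argmin(g)] = 1.0
        # Standard open-loop step size 2 / (t + 2).
        gamma = 2.0 / (t + 2.0)
        x = (1.0 - gamma) * x + gamma * v
    return x

# Toy usage: minimize f(x) = ||x - b||^2 over the simplex (b lies inside it).
b = np.array([0.2, 0.5, 0.3])
x_star = frank_wolfe(lambda x: 2.0 * (x - b), np.array([1.0, 0.0, 0.0]))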
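Likewise, a hedged sketch of the perturbation idea behind the third example: start from one trained parameter vector, add small Gaussian noise to create ensemble members, and average their predictions. The toy logistic model, the noise scale sigma, and all names here are our own assumptions for illustration, not the exact construction from the PEP paper.

import numpy as np

def predict(theta, X):
    # Toy stand-in for a trained model: logistic scores.
    return 1.0 / (1.0 + np.exp(-X @ theta))

def pep_predict(theta_hat, X, n_members=10, sigma=0.01, seed=0):
    # Average predictions over Gaussian perturbations of one trained model.
    rng = np.random.default_rng(seed)
    preds = [predict(theta_hat + sigma * rng.standard_normal(theta_hat.shape), X)
             for _ in range(n_members)]
    return np.mean(preds, axis=0)

# Toy usage with a pretend trained parameter vector.
X = np.random.default_rng(1).standard_normal((5, 3))
theta_hat = np.array([0.5, -0.2, 0.1])
print(pep_predict(theta_hat, X))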
Timeline:
Seminar outline: 14.10.2024, 14:15 (online)
Open paper selection: 15.10.2024, 15:00 (mail to Jannis Halbey)
5-min presentations: 28.11.2024, 10:15 (room tbd)
Report submission deadline: 09.02.2025, 23:59
Final presentations: 10.02.2025, 10:15 (room tbd)
Reading material by topic (papers marked in red are no longer available):
Machine Learning
Asterios Tsiourvas, Wei Sun, Georgia Perakis
Manifold-Aligned Counterfactual Explanations for Neural Networks
https://proceedings.mlr.press/v238/tsiourvas24a.html
Deep Learning / LLMs
Yuxin Zhang et al.
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
https://arxiv.org/abs/2310.08915
Deep Learning / LLMs
Van der Ouderaa et al.
The LLM Surgeon
https://arxiv.org/abs/2312.17244
Deep Learning / LLMs
Frantar & Alistarh
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
https://arxiv.org/abs/2301.00774
Deep Learning / LLMs
Zou et al.
Universal and Transferable Adversarial Attacks on Aligned Language Models
https://arxiv.org/pdf/2307.15043
Discrete Optimization
Yongzheng Dai, Chen Chen
Serial and Parallel Two-Column Probing for Mixed-Integer Programming
https://arxiv.org/abs/2408.16927
Graph Neural Networks
Ralph Abboud, Radoslav Dimitrov, Ismail Ilkan Ceylan
Shortest Path Networks for Graph Property Prediction
https://arxiv.org/pdf/2206.01003
Scientific Machine Learning
Shaowu Pan, Steven L. Brunton, J. Nathan Kutz
Neural Implicit Flow: a mesh-agnostic dimensionality reduction paradigm of spatio-temporal data
https://arxiv.org/abs/2204.03216
Scientific Machine Learning
Sifan Wang, Jacob H Seidman, Shyam Sankaran, Hanwen Wang, George J Pappas, Paris Perdikaris
Bridging Operator Learning and Conditioned Neural Fields: A Unifying Perspective
https://arxiv.org/pdf/2405.13998
Deep Learning / Byzantine Federated Learning
Zeyuan Allen-Zhu, Faeze Ebrahimianghazani, Jerry Li, Dan Alistarh
Byzantine-Resilient Non-Convex Stochastic Gradient Descent
https://arxiv.org/abs/2012.14368
Deep Learning / Byzantine Federated Learning
Sai Praneeth Karimireddy, Lie He, Martin Jaggi
Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing
https://arxiv.org/abs/2006.09365
Deep Learning / Federated Learning
Tao Lin, Lingjing Kong, Sebastian U. Stich, Martin Jaggi
Ensemble Distillation for Robust Model Fusion in Federated Learning
https://proceedings.neurips.cc/paper/2020/hash/18df51b97ccd68128e994804f3eccc87-Abstract.html
Machine Learning
Xiaoqian Zhang, Maolin Luo, Zhaodi Wen, Qin Feng, Shengshi Pang, Weiqi Luo, Xiaoqi Zhou
Direct Fidelity Estimation of Quantum States Using Machine Learning
https://arxiv.org/abs/2102.02369
Machine Learning / Algorithms
Siddhartha Banerjee, Vincent Cohen-Addad, Anupam Gupta, Zhouzi Li
Graph Searching with Predictions
https://arxiv.org/abs/2212.14220
Deep Learning / LLMs
Jan Hendrik Kirchner, Yining Chen, Harri Edwards, Jan Leike, Nat McAleese, Yuri Burda
Prover-Verifier Games Improve Legibility of LLM Outputs
https://arxiv.org/pdf/2407.13692
Statistical Learning Theory / Bagging
Kaspar Green Larsen
Bagging is an Optimal PAC Learner
https://arxiv.org/pdf/2212.02264
Statistical Learning Theory / Boosting
Larsen and Ritzert
Optimal Weak to Strong Learning
https://arxiv.org/pdf/2206.01563
Frank–Wolfe algorithms in machine learning
Francis Bach
On the Effectiveness of Richardson Extrapolation in Machine Learning
https://arxiv.org/abs/2002.02835
Mixed-integer linear programming
Bjørnar Luteberget and Giorgio Sartor
Feasibility Jump: an LP-free Lagrangian MIP heuristic
https://link.springer.com/article/10.1007/s12532-023-00234-8