Discrete Optimization and Machine Learning
Sebastian Pokutta (WiSe 2024 / Seminar)
Machine Learning and Discrete Optimization methods are both important tools in many real-world applications. In this course we will primarily study the interplay of integer programming and, more broadly, discrete optimization methods with machine learning.
The format of the course is research-oriented and requires active participation from the students. The course will be a mix of student presentations, discussions, and project work. In-class projects will be defined in the first few weeks.
Course organization
Prerequisites: Linear Algebra, Analysis, and Discrete Optimization (ADM I/ADM II)
Registration: Together with paper selection after the first meeting (seminar outline)
Participation requirements:
Students are expected to individually
Write a report (minimum 5 pages, in LaTeX) about the paper and their own contribution.
Send their report to Jannis Halbey.
During the semester, present their plans for the final report and presentation in a short 5-minute presentation.
Give a final presentation of their findings (details discussed in the first meeting).
Paper/project selection:
Students choose one of the papers below to work on
Up to 2 students can work on the same paper
Assignment is first come, first served
Send your paper choice to Jannis Halbey
For fairness reasons, we only accept selections after 18.10.2024, 04:00 pm
Contribution:
Every student is expected to make a contribution to their topic and not just reproduce the results. The aim is not to obtain groundbreaking new results, but to get an impression of scientific work. Moreover, contributing something yourself requires a very sound understanding of the subject. Contributions can be an extension of the original algorithm, a new theoretical proof, or a comparison with more recent research.
Examples of strong contributions:
COIL: A Deep Architecture for Column Generation
Identify the OptNet layer as a main bottleneck
Examine the Lagrangian Dual Framework from Ferdinando Fioretto et al. (Lagrangian Duality for Constrained Deep Learning) as an alternative solution
Implement the modification and evaluate it empirically
Sparse Adversarial and Interpretable Attack Framework
Identify the vanilla Frank-Wolfe algorithm as a potential point for improvement (a minimal Frank-Wolfe sketch follows this list)
Examine the Blended Pairwise Conditional Gradients variant
Implement the modification and evaluate it empirically
PEP: Parameter Ensembles by Perturbation
Compare with state-of-the-art Deep Ensembles
Examine the advantages and disadvantages of both methods
Implement a combination of both approaches and evaluate it empirically (see the second sketch below)
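For orientation, the second example's starting point can be made concrete. Below is a minimal sketch of the vanilla Frank-Wolfe algorithm, here minimizing a smooth function over the probability simplex; the function names, step-size rule, and toy objective are illustrative assumptions and not taken from the attack framework paper. Variants such as Blended Pairwise Conditional Gradients replace the plain step below with pairwise steps between vertices of the active set.

import numpy as np

def frank_wolfe(grad_f, x0, max_iter=100):
    # Vanilla Frank-Wolfe over the probability simplex (illustrative sketch).
    x = x0.copy()
    for t in range(max_iter):
        g = grad_f(x)
        # Linear minimization oracle (LMO): over the simplex, <g, v> is
        # minimized at the unit basis vector of the smallest gradient entry.
        v = np.zeros_like(x)
        v[np.argmin(g)] = 1.0
        # Standard open-loop step size 2 / (t + 2).
        gamma = 2.0 / (t + 2.0)
        x = (1.0 - gamma) * x + gamma * v
    return x

# Toy usage: minimize f(x) = ||x - b||^2 over the simplex (b lies inside it).
b = np.array([0.2, 0.5, 0.3])
x_star = frank_wolfe(lambda x: 2.0 * (x - b), np.array([1.0, 0.0, 0.0]))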
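Likewise, a hedged sketch of the perturbation idea behind the third example: start from one trained parameter vector, add small Gaussian noise to create ensemble members, and average their predictions. The toy logistic model, the noise scale sigma, and all names here are our own assumptions for illustration, not the exact construction from the PEP paper.

import numpy as np

def predict(theta, X):
    # Toy stand-in for a trained model: logistic scores.
    return 1.0 / (1.0 + np.exp(-X @ theta))

def pep_predict(theta_hat, X, n_members=10, sigma=0.01, seed=0):
    # Average predictions over Gaussian perturbations of one trained model.
    rng = np.random.default_rng(seed)
    preds = [predict(theta_hat + sigma * rng.standard_normal(theta_hat.shape), X)
             for _ in range(n_members)]
    return np.mean(preds, axis=0)

# Toy usage with a pretend trained parameter vector.
X = np.random.default_rng(1).standard_normal((5, 3))
theta_hat = np.array([0.5, -0.2, 0.1])
print(pep_predict(theta_hat, X))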
Timeline:
Seminar outline: 14.10.2024, 14:15 (online)
Open paper selection: 15.10.2024, 15:00 (mail to Jannis Halbey)
5-min presentations: 28.11.2024, 10:15 (room tbd)
Report submission deadline: 09.02.2025, 23:59
Final presentations: 10.02.2025, 10:15 (room tbd)
Reading material by topic (papers marked in red are no longer available):
Machine Learning
Asterios Tsiourvas, Wei Sun, Georgia Perakis
Manifold-Aligned Counterfactual Explanations for Neural Networks
https://proceedings.mlr.press/v238/tsiourvas24a.html
Deep Learning / LLMs
Yuxin Zhang et al.
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
https://arxiv.org/abs/2310.08915
Deep Learning / LLMs
Van der Ouderaa et al.
The LLM Surgeon
https://arxiv.org/abs/2312.17244
Deep Learning / LLMs
Frantar & Alistarh
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
https://arxiv.org/abs/2301.00774
Deep Learning / LLMs
Zou et al.
Universal and Transferable Adversarial Attacks on Aligned Language Models
https://arxiv.org/pdf/2307.15043
Discrete Optimization
Yongzheng Dai, Chen Chen
Serial and Parallel Two-Column Probing for Mixed-Integer Programming
https://arxiv.org/abs/2408.16927
Graph Neural Networks
Ralph Abboud, Radoslav Dimitrov, Ismail Ilkan Ceylan
Shortest Path Networks for Graph Property Prediction
https://arxiv.org/pdf/2206.01003
Scientific Machine Learning
Shaowu Pan, Steven L. Brunton, J. Nathan Kutz
Neural Implicit Flow: a mesh-agnostic dimensionality reduction paradigm of spatio-temporal data
https://arxiv.org/abs/2204.03216
Scientific Machine Learning
Sifan Wang, Jacob H Seidman, Shyam Sankaran, Hanwen Wang, George J Pappas, Paris Perdikaris
Bridging Operator Learning and Conditioned Neural Fields: A Unifying Perspective
https://arxiv.org/pdf/2405.13998
Deep Learning / Byzantine Federated Learning
Zeyuan Allen-Zhu, Faeze Ebrahimianghazani, Jerry Li, Dan Alistarh
Byzantine-Resilient Non-Convex Stochastic Gradient Descent
https://arxiv.org/abs/2012.14368
Deep Learning / Byzantine Federated Learning
Sai Praneeth Karimireddy, Lie He, Martin Jaggi
Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing
https://arxiv.org/abs/2006.09365
Deep Learning / Federated Learning
Tao Lin, Lingjing Kong, Sebastian U. Stich, Martin Jaggi
Ensemble Distillation for Robust Model Fusion in Federated Learning
https://proceedings.neurips.cc/paper/2020/hash/18df51b97ccd68128e994804f3eccc87-Abstract.html
Machine Learning
Xiaoqian Zhang, Maolin Luo, Zhaodi Wen, Qin Feng, Shengshi Pang, Weiqi Luo, Xiaoqi Zhou
Direct Fidelity Estimation of Quantum States Using Machine Learning
https://arxiv.org/abs/2102.02369
Machine Learning / Algorithms
Siddhartha Banerjee, Vincent Cohen-Addad, Anupam Gupta, Zhouzi Li
Graph Searching with Predictions
https://arxiv.org/abs/2212.14220
Deep Learning / LLMs
Jan Hendrik Kirchner, Yining Chen, Harri Edwards, Jan Leike, Nat McAleese, Yuri Burda
Prover-Verifier Games Improve Legibility of LLM Outputs
https://arxiv.org/pdf/2407.13692
Statistical Learning Theory / Bagging
Kaspar Green Larsen
Bagging is an Optimal PAC Learner
https://arxiv.org/pdf/2212.02264
Statistical Learning Theory / Boosting
Larsen and Ritzert
Optimal Weak to Strong Learning
https://arxiv.org/pdf/2206.01563
Frank–Wolfe algorithms in machine learning
Francis Bach
On the Effectiveness of Richardson Extrapolation in Machine Learning
https://arxiv.org/abs/2002.02835
Mixed-integer linear programming
Bjørnar Luteberget and Giorgio Sartor
Feasibility Jump: an LP-free Lagrangian MIP heuristic
https://link.springer.com/article/10.1007/s12532-023-00234-8