Discrete Optimization and Machine Learning

Sebastian Pokutta (WiSe 2023 / Seminar)

Both Machine Learning and Discrete Optimization methods are important tools in many real-world applications. In this course we will primarily study the interplay of integer programming and, more broadly, discrete optimization methods with machine learning.
The format of the course is strongly research-oriented and requires active participation of the students. The course will be a mix of student presentations, discussions, and project work. In the first few weeks, in-class projects will be defined.

Course organization

Prerequisites: Linear Algebra, Analysis, and Discrete Optimization (ADM I/ADM II) 

Registration: Together with paper selection after first meeting (seminar outline)


Participation requirements:
Students are expected to individually work on their chosen paper/project, present it, and actively take part in the discussions.


Paper/project selection:
Students choose one of the papers below to work on


Contribution:
Every student is expected to make a contribution to the given topic and not just reproduce the results. The aim is not to obtain groundbreaking new results, but to get an impression of scientific work. Moreover, contributing something yourself requires a very sound understanding of the subject. Contributions can be an extension of the original algorithm, a new theoretical proof, or a comparison with more recent research.

Examples of strong contributions:

COIL: A Deep Architecture for Column Generation

Sparse Adversarial and Interpretable Attack Framework

PEP: Parameter Ensembles by Perturbation


Timeline:
All meetings take place in the MAR building at Marchstraße 23.

Reading material by topic:

Simge Küçükyavuz, Ali Shojaie, Hasan Manzour, Linchuan Wei, Hao-Hsiang Wu : Consistent Second-Order Conic Integer Programming for Learning Bayesian Networks

https://arxiv.org/abs/2005.14346

Dan Garber, Atara Kaplan : Efficiency of First-Order Methods for Low-Rank Tensor Recovery with the Tensor Nuclear Norm Under Strict Complementarity

https://arxiv.org/pdf/2308.01677.pdf

Francesco Locatello, Rajiv Khanna, Michael Tschannen, Martin Jaggi : A Unified Optimization View on Generalized Matching Pursuit and Frank-Wolfe. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:860-868, 2017.

http://proceedings.mlr.press/v54/locatello17a.html

Goerigk, M., Hartisch, M. : A Framework for Inherently Interpretable Optimization Models

https://arxiv.org/pdf/2208.12570.pdf

https://arxiv.org/pdf/2206.09628.pdf

Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer : QLoRA: Efficient Finetuning of Quantized LLMs

https://arxiv.org/pdf/2305.14314.pdf

Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba, Alexei A. Efros : Dataset Distillation

https://arxiv.org/abs/1811.10959

Wu et al. : Autoformalization with Large Language Models

https://proceedings.neurips.cc/paper_files/paper/2022/file/d0c6bc641a56bebee9d985b937307367-Paper-Conference.pdf

Raissi, M., Perdikaris, P. and Karniadakis, G. E. : Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations

https://www.sciencedirect.com/science/article/pii/S0021999118307125

Zeyuan Allen-Zhu, Faeze Ebrahimian, Jerry Li, Dan Alistarh : Byzantine-Resilient Non-Convex Stochastic Gradient Descent

https://arxiv.org/abs/2012.14368

Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin : "Why should I trust you?": Explaining the predictions of any classifier

https://arxiv.org/abs/1602.04938

Stengl, M., Gelß, P., Klus, S., Pokutta, S. : Existence and Uniqueness of Solutions of the Koopman-von Neumann Equation on Bounded Domains

https://arxiv.org/pdf/2306.13504