Discrete Optimization and Machine Learning

Sebastian Pokutta (WiSe 2023 / Seminar)

Both Machine Learning and Discrete Optimization methods are important tools in many real-world applications. In this course we will primarily study the interplay of integer programming and, more broadly, discrete optimization methods with machine learning.
The format of the course is strongly research-oriented and requires active participation of the students. The course will be a mix of student presentations, discussions, and project work. In the first few weeks, in-class projects will be defined.

Course organization

Prerequisites: Linear Algebra, Analysis, and Discrete Optimization (ADM I/ADM II) 

Registration: Together with paper selection after first meeting (seminar outline)


Participation requirements:
Students are expected to individually work on their chosen paper/project, present it, and actively take part in the discussions.


Paper/project selection:
Students choose one of the papers below to work on


Contribution:
Every student is expected to make a contribution to the given topic and not just reproduce the results. The aim is not to obtain groundbreaking new results, but to get an impression of scientific work. Moreover, contributing something yourself requires a very sound understanding of the subject. Contributions can be an extension of the original algorithm, a new theoretical proof, or a comparison with more recent research.

Examples of strong contributions:

COIL: A Deep Architecture for Column Generation

Sparse Adversarial and Interpretable Attack Framework

PEP: Parameter Ensembles by Perturbation


Timeline:
All meetings take place in the MAR building at Marchstraße 23.

Reading material by topic:

Simge Küçükyavuz, Ali Shojaie, Hasan Manzour, Linchuan Wei, Hao-Hsiang Wu : Consistent Second-Order Conic Integer Programming for Learning Bayesian Networks

https://arxiv.org/abs/2005.14346

Dan Garber, Atara Kaplan : Efficiency of First-Order Methods for Low-Rank Tensor Recovery with the Tensor Nuclear Norm Under Strict Complementarity

https://arxiv.org/pdf/2308.01677.pdf

Francesco Locatello, Rajiv Khanna, Michael Tschannen, Martin Jaggi : A Unified Optimization View on Generalized Matching Pursuit and Frank-Wolfe. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:860-868, 2017.

http://proceedings.mlr.press/v54/locatello17a.html

Goerigk, M., Hartisch, M. : A Framework for Inherently Interpretable Optimization Models

https://arxiv.org/pdf/2208.12570.pdf

https://arxiv.org/pdf/2206.09628.pdf

Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer : QLoRA: Efficient Finetuning of Quantized LLMs

https://arxiv.org/pdf/2305.14314.pdf

Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba, Alexei A. Efros : Dataset Distillation

https://arxiv.org/abs/1811.10959

Wu et al. : Autoformalization with Large Language Models

https://proceedings.neurips.cc/paper_files/paper/2022/file/d0c6bc641a56bebee9d985b937307367-Paper-Conference.pdf

Raissi, M., Perdikaris, P. and Karniadakis, G. E. : Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations

https://www.sciencedirect.com/science/article/pii/S0021999118307125

Zeyuan Allen-Zhu, Faeze Ebrahimian, Jerry Li, Dan Alistarh : Byzantine-Resilient Non-Convex Stochastic Gradient Descent

https://arxiv.org/abs/2012.14368

Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin : "Why should I trust you?": Explaining the predictions of any classifier

https://arxiv.org/abs/1602.04938

Stengl, M., Gelß, P., Klus, S., Pokutta, S. : Existence and Uniqueness of Solutions of the Koopman-von Neumann Equation on Bounded Domains

https://arxiv.org/pdf/2306.13504