Discrete Optimization and Machine Learning

Sebastian Pokutta (SoSe 2021 / Seminar)

Both Machine Learning methods as well as Discrete Optimization methods are important tools in many real-world applications. In this course we will primarily study the interplay of integer programming and, more broadly, discrete optimization methods and machine learning.The format of the course is strongly research-oriented requiring active participation of the students. The course will be a mix of traditional lectures, student presentation, discussions, and project work. In the first few weeks in-class projects will be defined.

Prerequisites: Linear Algebra, Analysis, and Discrete Optimization (ADM I/ADM II)

Registration: via email to Antje Schulz

Office hours: by appointment

Organization & Requirements:

Students choose one of the papers below to work on. Up to 2 students can work on the same paper. Students are expected to individually

  • Write a report (5 page minimum in Latex) about the paper and the contribution from the students themselves.

  • During the semester, present their plans for the final report and presentation in a shorter 5 minute presentation.

  • Give a final presentation to present their findings (details discussed in the first meeting).


  • 21.04.2021, 10:00 a.m. (Berlin time): Seminar outline

  • 30.04.2021, 11:59 p.m. (Berlin time): Paper selection deadline (send your desired paper to Elias Wirth. Assignment is first come, first served. Papers that are no longer available are crossed out.)

  • 31.05.2021, 10:00 a.m. (Berlin time): 5 minute presentations

  • 04.07.2021, 11:59 p.m. (Berlin time): Report submission deadline (send your reports to Elias Wirth)

  • 05.07.2021, 08:00 a.m. (Berlin time): Final presentations

  • 06.07.2021, 08:00 a.m. (Berlin time): Final presentations

Reading material by topic (crossed out papers are no longer available):

Artificial Neural Networks

  1. Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P. and Bengio, Y., 2017. Graph attention networks. arXiv preprint arXiv:1710.10903. (arXiv version) (2/2)

  2. Arjovsky, M., Chintala, S. and Bottou, L., 2017, July. Wasserstein generative adversarial networks. In International conference on machine learning (pp. 214-223). PMLR. (arXiv version) (2/2)


  1. Peter Bartlett.Yoav Freund.Wee Sun Lee.Robert E. Schapire."Boosting the margin: a new explanation for the effectiveness of voting methods."Ann. Statist.26(5) 1651 - 1686, 1998. (link) (1/2)

Computer Vision

  1. R. Alaifari, G. S. Alberti, T. Gauksson. “ADef: an Iterative Algorithm to Construct Adversarial Deformations.” International Conference on Learning Representations. 2019. (arXiv version) (2/2)

Convex Optimization

  1. Allen-Zhu, Zeyuan and Orecchia, Lorenzo. "Nearly linear-time packing and covering LP solvers" Mathematical Programming Volume 175,307–353(2019) (arXiv version)

  2. Ho-Nguyen, Nam and Kilinc-Karzan, Fatma. "Primal-Dual Algorithms for Convex Optimization via Regret Minimization.” IEEE Control System Letters. 2018. (link) (2/2)

  3. Ding, L., Fan, J. and Udell, M., 2020. $ k $ FW: A Frank-Wolfe style algorithm with stronger subproblem oracles. arXiv preprint arXiv:2006.16142. (arXiv version)

  4. D. Garber and N. Wolf, "Frank–Wolfe with a nearest extreme point oracle" arXiv preprint (arXiv version) (1/2)


  1. Such et al. "Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data" Proceedings of the37thInternational Conference on MachineLearning, Vienna, Austria, PMLR 119, 2020. (link) (2/2)

Online Learning

  1. Rakhlin, A. and Sridharan, K., 2013. Optimization, learning, and games with predictable sequences. arXiv preprint arXiv:1311.1869. (arXiv version) (2/2)

Parallel Numerical Linear Algebra

  1. J. L. Greathouse and M. Daga, "Efficient Sparse Matrix-Vector Multiplication on GPUs Using the CSR Storage Format," SC '14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, New Orleans, LA, USA, 2014, pp. 769-780, doi: 10.1109/SC.2014.68. (link) (1/2)

Spare Regression

  1. “Least Angle Regression”. Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani. The Annals of Statistics (2004). (arXiv version) (2/2)