Machine Learning Theory
IMPA - August to November 2023
News
Second homework assignment (due September 15th by email): All problems in section 4.9, and also problems 3.8 and 3.9, of the book by Mohri et al.
Here are some notes on concentration inequalities, including McDiarmid's inequality.
First homework assignment (due August 25th by email): Problems 3.4, 3.23, 3.16, 3.26 (a), 3.27, 3.31.
News and reminders will be posted here.
General information
Professor: Roberto Imbuzeiro Oliveira
Classes: Mondays and Wednesdays from 13:30 to 15:00 in room 347.
Course dynamics: This is an in-person class taught in English. Lectures will not be recorded or livestreamed. Students are encouraged to attend all lectures throughout the term.
What is this? Who is the class for?
This is a PhD-level course on the theoretical aspects of Machine Learning methods. Students are expected to be comfortable with measure-theoretic Probability at the level of the Probabilidade 1 class at IMPA.
The following topics will be covered in more or less detail.
PAC learning.
VC dimension and Rademacher complexity.
Model selection and regularization.
Support Vector Machines.
Kernel methods and Mercer's Theorem.
Boosting.
Regression.
Neural networks: fundamentals and contemporary theoretical challenges.
Other topics to be determined by the instructor.
Important: this is a theory class. There will be no coding assignments or any related material. Indeed, my plan is to keep the computer in the lecture room off at all times.
Bibliography
Textbook
Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar. Foundations of Machine Learning (available online). MIT Press, Second Edition, 2018.
Other references
Francis Bach. Learning Theory from First Principles. (To be published by Cambridge University Press.)
Moritz Hardt, Benjamin Recht. Patterns, predictions and actions: a story about machine learning. Princeton University Press (2022).
Martin Anthony, Peter Bartlett. Neural Network Learning: theoretical foundations. Cambridge University Press (1999).
Papers (to be posted)
Evaluation
Homework assignments every two weeks, plus a final presentation.
Possible papers to present
Rules
If you have an IMPA email account, you can add your name and choice of paper to the following spreadsheet.
Each student will make a final presentation lasting ~60 minutes.
The presentation must be about the contents of one of the papers below.
The presenter should make an effort to clarify the main mathematical statements being presented. (There is one exception, indicated below, where the presentation should focus on experiments.)
Random Features for Kernel Approximation: A Survey on Algorithms, Theory, and Beyond: arXiv:2004.11154
On the Multiple Descent of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels: arXiv:1908.10292
Just Interpolate: Kernel "Ridgeless" Regression Can Generalize: arXiv:1808.00387
On Equivalence of Martingale Tail Bounds and Deterministic Regret Inequalities
A unified framework for information-theoretic generalization bounds: arXiv:2305.11042
A Theory of Universal Learning: arXiv:2011.04483
Conformal PID Control for Time Series Prediction: arXiv:2307.16895
(the presentation must cover experiments) Reconciling modern machine-learning practice and the classical bias–variance trade-off: https://www.pnas.org/doi/10.1073/pnas.1903070116?doi=10.1073/pnas.1903070116
Benign Overfitting in Linear Regression: arXiv:1906.11300
Explore no more: Improved high-probability regret bounds for non-stochastic bandits: arXiv:1506.03271
Convergence rates for shallow neural networks learned by gradient descent: arXiv:2107.09550
A statistical analysis of an image classification problem: arXiv:2206.02151
Generalization error of random feature and kernel methods: Hypercontractivity and kernel matrix concentration: https://www.sciencedirect.com/science/article/pii/S1063520321001044
A Precise High-Dimensional Asymptotic Theory for Boosting and Minimum-... : arXiv:2002.01586