Lecturers
Lénaïc Chizat
Institute of Mathematics, Dynamics of Learning Algorithms Chair, EPFL
lenaic.chizat@epfl.ch
Dynamics of Neural Networks in the Large Width and Depth Asymptotics
Recent progress in AI has been fueled by training neural networks of ever-increasing size, a difficult engineering task still largely guided by heuristics. A promising approach to bring theoretical principles into this discipline is to analyze neural networks in their large size asymptotics. While the large-width limit is now relatively well understood and has already yielded practical tools adopted in industry (such as μP), the extension of the theory to large-depth models has remained elusive.
In these lectures, I will present recent analyses that rigorously capture the joint infinite-width and infinite-depth limit for architectures used in practice, such as the Transformer. Our approach combines techniques from stochastic approximation, propagation of chaos, and dynamical mean-field theory, and it leads to phase diagrams, quantitative error bounds, and new conceptual insights. For instance, we will see that infinite-depth models, regardless of their actual width, behave throughout training as though they were also infinitely wide. We will also see that Transformers with P parameters and optimal shape converge to their limit at a rate of P^{-1/6}.
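For readers who want a concrete anchor, a common way to make such a joint limit well defined is to scale residual branches with both width and depth. The display below is only an illustrative convention (the width n, depth L, exponent α, and activation φ are generic placeholders, not necessarily the parameterization analyzed in the lectures):

\[
h^{\ell+1} \;=\; h^{\ell} \;+\; \frac{1}{L^{\alpha}\sqrt{n}}\, W^{\ell}\,\phi\big(h^{\ell}\big), \qquad \ell = 0,\dots,L-1,
\]

where the entries of W^{\ell} are of order one at initialization; the choice α = 1/2 typically yields a stochastic large-depth limit, while α = 1 yields a deterministic, ODE-like one.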
Nicolás García-Trillos
Department of Statistics, University of Wisconsin-Madison
garciatrillo@wisc.edu
On Learning, Robustness, and the Geometry of Noise
Suppose you are given a set of data points sampled from an unknown distribution belonging to a family of plausible probabilistic models. How do you construct an estimator, i.e. a function of your observations, for a given quantity of interest in an “optimal” way? Statisticians have asked this question for a long time, and in contemporary data science it has become even more pressing as notions of optimality beyond accuracy, such as robustness, fairness, and privacy, have gained prominence. In this lecture series, I will discuss this question, revisiting some classical notions of optimality for estimators and then focusing on the robustness of estimators to data perturbations. In contrast to a more standard statistical course on the topic, the emphasis here will be on the geometry of statistical inference. This perspective will lead us to study different geometries on the space of probability measures and to use their structures to link them to different noise models and to the corresponding notions of robustness of estimators under those noise models.
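One standard way this link between geometries on probability measures and robustness is formalized, included here only as an illustrative reference (the loss ℓ, parameter set Θ, and budget ε are generic placeholders, not material from the lectures), is the distributionally robust risk

\[
\min_{\theta \in \Theta} \;\; \sup_{\tilde P \,:\; d(\tilde P, P_n) \le \varepsilon} \; \mathbb{E}_{Z \sim \tilde P}\big[\ell(\theta; Z)\big],
\]

where P_n is the empirical distribution of the data and d is a chosen metric on probability measures, for instance a Wasserstein distance; different choices of d encode different noise models and hence different notions of robustness of the resulting estimator.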
Guido Montúfar
UCLA Mathematics and Statistics & Data Science, Los Angeles
Mathematical Machine Learning Group, Max Planck Institute for Mathematics in the Sciences, Leipzig
montufar@math.ucla.edu
Geometric and Structural Perspectives on Learning and Verification in Neural Networks
This lecture series explores modern theoretical foundations of deep learning through the lenses of geometry, optimization dynamics, representation learning, and verification. We focus on geometric constraints induced by network architecture and parameterization, and highlight how data and learning algorithms jointly shape learning outcomes. Beginning with the polyhedral geometry induced by piecewise-linear activations, we analyze how neural networks partition input and parameter spaces, and how depth and architecture determine the resulting function classes. We then study the algorithmic bias of gradient-based optimization methods, emphasizing parameterization effects and non-asymptotic dynamical phenomena. Building on these perspectives, we examine mechanisms of feature learning in neural networks. The series concludes with algebraic-geometric approaches to neural network verification, connecting implicit representations and polynomial optimization to formal guarantees of robustness and extrapolation arising from structural and algebraic constraints.
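As a small illustration of the polyhedral picture referred to above, the following self-contained sketch (the architecture, random initialization, and grid resolution are arbitrary choices of ours, not material from the lectures) estimates how many affine regions a ReLU network induces on a two-dimensional input domain by counting distinct activation patterns:

    # Illustrative sketch: counting the linear regions a small, randomly
    # initialized ReLU network induces on a 2D input domain. Each distinct
    # activation pattern corresponds to a polyhedral region on which the
    # network is affine; we estimate the count by probing a dense grid.

    import numpy as np

    rng = np.random.default_rng(0)

    def init_mlp(widths):
        """Random weights and biases for a fully connected ReLU network."""
        params = []
        for n_in, n_out in zip(widths[:-1], widths[1:]):
            W = rng.normal(scale=1.0 / np.sqrt(n_in), size=(n_out, n_in))
            b = rng.normal(scale=0.1, size=n_out)
            params.append((W, b))
        return params

    def activation_pattern(params, x):
        """Binary pattern of active ReLUs across all hidden layers at input x."""
        pattern = []
        h = x
        for W, b in params[:-1]:          # hidden layers only
            pre = W @ h + b
            pattern.append(pre > 0)
            h = np.maximum(pre, 0.0)
        return tuple(np.concatenate(pattern).tolist())

    params = init_mlp([2, 8, 8, 1])       # 2D input, two hidden layers, scalar output

    grid = np.linspace(-2.0, 2.0, 200)
    patterns = {activation_pattern(params, np.array([x1, x2]))
                for x1 in grid for x2 in grid}

    print(f"distinct activation regions hit by the grid: {len(patterns)}")

Each activation pattern fixes which ReLUs are active, so the network restricted to that region is a single affine map; the grid count is a lower bound on the true number of regions.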
Enrique Zuazua
Department of Mathematics & FAU Center for Mathematics of Data | MoD
Friedrich-Alexander-Universität Erlangen-Nürnberg -- Alexander von Humboldt Professorship
enrique.zuazua@fau.de
Machine Learning from an Applied Mathematician’s Perspective
Machine Learning has emerged as one of the most transformative forces in contemporary science and technology. In this three-lecture series, I will discuss Machine Learning through the lens of applied mathematics, emphasizing its connections with control theory, partial differential equations, and numerical analysis.
In the first lecture, we will revisit the historical and conceptual links between Machine Learning and Systems Control (Cybernetics). This point of view allows us to reinterpret representation and expressivity properties of deep neural networks in terms of ensemble or simultaneous controllability of neural differential equations.
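To make the control-theoretic reading concrete, a standard way to write a residual network as a control system (the notation below is illustrative and not necessarily the exact model used in the lecture) is

\[
\dot{x}_i(t) \;=\; W(t)\,\sigma\big(A(t)\,x_i(t) + b(t)\big), \qquad t \in (0,T), \quad x_i(0) = x_i^0, \quad i = 1,\dots,N,
\]

where the time-dependent parameters (W(t), A(t), b(t)) act as a single control shared by all trajectories; expressivity then becomes the question of whether such a control can simultaneously steer the ensemble x_1, …, x_N to, or arbitrarily close to, prescribed targets at time T.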
The second lecture will focus on the use of neural network architectures as numerical approximation tools. We will consider, as a guiding example, the classical Dirichlet problem for the Laplace equation, formulated via energy minimization under neural-network constraints. Particular attention will be paid to the lack of convexity and coercivity in the resulting optimization problems. We will show how relaxation techniques may restore convexity at the price of losing coercivity, and we will discuss the mathematical implications of this trade-off for analysis and computation.
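For orientation, here is a minimal computational sketch of this energy-minimization formulation, in the spirit of Deep Ritz-type methods; the domain (the unit square), right-hand side, architecture, penalty weight, and sampling sizes are all illustrative choices of ours and not the setting of the lecture:

    # Illustrative sketch: the Dirichlet problem -Δu = f on the unit square with
    # u = 0 on the boundary, approached by minimizing the energy
    # E(u) = ∫ ( |∇u|²/2 - f·u ) over a small neural network, with the boundary
    # condition enforced by a penalty term.

    import torch

    torch.manual_seed(0)

    model = torch.nn.Sequential(
        torch.nn.Linear(2, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 1),
    )

    def f(x):                       # right-hand side, here simply f ≡ 1
        return torch.ones(x.shape[0], 1)

    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    penalty = 100.0                 # weight of the boundary penalty

    for step in range(2000):
        # Monte Carlo points in the interior of (0,1)²
        x_in = torch.rand(256, 2, requires_grad=True)
        u = model(x_in)
        grad_u = torch.autograd.grad(u.sum(), x_in, create_graph=True)[0]
        energy = (0.5 * (grad_u ** 2).sum(dim=1, keepdim=True) - f(x_in) * u).mean()

        # Points on the boundary of the square, where u should vanish
        t = torch.rand(256, 1)
        sides = [torch.cat([t, torch.zeros_like(t)], 1),
                 torch.cat([t, torch.ones_like(t)], 1),
                 torch.cat([torch.zeros_like(t), t], 1),
                 torch.cat([torch.ones_like(t), t], 1)]
        x_bd = torch.cat(sides, dim=0)
        bc = (model(x_bd) ** 2).mean()

        loss = energy + penalty * bc
        opt.zero_grad()
        loss.backward()
        opt.step()

    print("final loss:", float(loss))

The penalty is where the issues mentioned above surface: the energy is convex in u but not in the network parameters, and the boundary condition is only enforced approximately.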
In the third lecture, we will present a PDE-based perspective on generative diffusion models. Their convergence can be reinterpreted in terms of the asymptotic behavior of Fokker–Planck equations driven by the so-called score vector field. We will explain how classical tools, such as Li-Yau-type differential inequalities for positive solutions of the heat equation, provide insight into the regularization and convergence properties of these models.
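As a reference point, and only as a standard formulation rather than the precise setting of the lecture, consider an Ornstein–Uhlenbeck forward noising process, its Fokker–Planck equation, and the associated score-driven reverse dynamics:

\[
dX_t = -X_t\,dt + \sqrt{2}\,dW_t, \qquad \partial_t p_t = \nabla\cdot(x\,p_t) + \Delta p_t,
\]
\[
dY_s = \big[\,Y_s + 2\,\nabla \log p_{T-s}(Y_s)\,\big]\,ds + \sqrt{2}\,dB_s,
\]

so that Y_s is distributed according to p_{T-s}. The score \nabla \log p_t is precisely the vector field whose regularity and long-time behavior are probed by heat-kernel tools such as Li-Yau-type inequalities.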
The series will conclude with a discussion of open problems and promising directions for future research at the interface of control theory, PDEs, numerical analysis, and modern Machine Learning.