The One World Seminar Series on the Mathematics of Machine Learning is an online platform for research seminars, workshops and seasonal schools in theoretical machine learning. The focus of the series lies on theoretical advances in machine learning and deep learning. The series was started during the Covid-19 pandemic in 2020 to bring together researchers from all over the world for presentations and discussions in a virtual environment. It follows in the footsteps of other community projects under the One World umbrella, which originated around the same time.
We welcome suggestions for speakers on new and exciting developments, and we are committed to providing a platform for junior researchers as well. We recognize the flexibility that online seminars provide. Feedback on any of our events is welcome.
Zoom talks are held on Wednesdays at 12:00 pm New York time (9:00 am Pacific).
A list of past seminars can be found here, and recordings can be viewed on our YouTube channel. Invitations to future seminars will be shared on this site before each talk and distributed via email.
Wed 25 March
Giulio Biroli
Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training
Abstract: Diffusion models now underpin many of the most powerful generative systems, yet understanding the mechanisms that prevent their memorization of training data and allow generalization remains a key challenge. In this talk, I will focus on the role of the training dynamics in the transition from generalization to memorization. I will identify two sharply separated timescales in training. The first, τ_gen, marks the onset of high-quality sample generation and is largely insensitive to dataset size. The second, τ_mem, signals the emergence of memorization and grows linearly with the number of training examples. As a result, increasing the dataset size creates an expanding window of training times during which models generalize effectively—even though the same models will exhibit strong memorization if optimized for longer. Only when the dataset exceeds a model-dependent threshold does memorization vanish altogether in the infinite-time limit. These findings reveal a form of implicit dynamical regularization: the training dynamics themselves protect against memorization over a substantial and controllable regime, despite extreme overparameterization. I will support these conclusions with numerical experiments using standard U-Net architectures on both synthetic and realistic datasets, and with a complementary theoretical analysis of a tractable random-features model in the high-dimensional limit. Together, the results offer a unifying framework for understanding how dataset size and training time jointly govern generalization in modern diffusion models. This talk is based on joint work presented at NeurIPS 2025 with T. Bonnaire, R. Urfin and M. Mézard.
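As a purely illustrative reading of the scaling described in the abstract (the constants below are hypothetical and not taken from the paper): if τ_gen is roughly independent of the dataset size n while τ_mem grows linearly with n, then the window of training times over which a model generalizes without memorizing widens as n grows.

```python
# Toy illustration of the claimed timescale separation (all constants hypothetical).
# tau_gen: onset of high-quality generation, assumed roughly independent of dataset size n.
# tau_mem: onset of memorization, assumed to grow linearly with n.

TAU_GEN = 1e3   # hypothetical training time at which generation becomes good
C_MEM = 5.0     # hypothetical slope: tau_mem ≈ C_MEM * n

for n in [10**3, 10**4, 10**5, 10**6]:
    tau_mem = C_MEM * n
    window = tau_mem - TAU_GEN  # training times that generalize without memorizing
    print(f"n = {n:>9,d}   tau_gen = {TAU_GEN:.0e}   tau_mem = {tau_mem:.0e}   "
          f"generalization window ≈ {window:.1e} steps")
```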
Wed 22 April
Tizian Wenzel
On the optimal shape parameter for kernel methods and beyond
Abstract: The search for the optimal shape parameter for Radial Basis Function (RBF) kernel methods has been an outstanding research problem for decades. In this work, we establish a theoretical framework for this problem by leveraging a recently established theory of sharp direct, inverse and saturation statements for kernel-based approximation. In particular, we link the search for the optimal shape parameter to superconvergence phenomena.
The analysis is carried out for finitely smooth Sobolev kernels, thereby covering large classes of radial kernels used in practice, including those emerging from current machine-learning methodologies. The results elucidate how approximation regimes, kernel regularity, and parameter choices interact, clarifying a question that has remained unresolved for decades.
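As a quick illustration of what the shape parameter controls (not taken from the talk), the sketch below interpolates a toy 1-D function with a finitely smooth Matérn kernel and reports the error for a few shape parameters ε; the kernel, nodes, target function, and ε values are all illustrative assumptions.

```python
import numpy as np

def matern_kernel(x, y, eps):
    # Finitely smooth Matérn kernel k(r) = (1 + eps*r) * exp(-eps*r),
    # whose native space is a Sobolev space; eps is the shape parameter.
    r = np.abs(x[:, None] - y[None, :])
    return (1.0 + eps * r) * np.exp(-eps * r)

# Toy 1-D interpolation problem (nodes, target, and eps values are illustrative).
x_train = np.linspace(0.0, 1.0, 20)     # interpolation nodes
x_test = np.linspace(0.0, 1.0, 500)     # evaluation grid
f = lambda x: np.sin(2.0 * np.pi * x)   # target function
y_train = f(x_train)

for eps in [0.5, 2.0, 8.0, 32.0]:
    K = matern_kernel(x_train, x_train, eps)   # kernel matrix on the nodes
    coeffs = np.linalg.solve(K, y_train)       # interpolant coefficients
    y_pred = matern_kernel(x_test, x_train, eps) @ coeffs
    print(f"eps = {eps:5.1f}   max interpolation error = "
          f"{np.max(np.abs(y_pred - f(x_test))):.2e}")
```

Even in this toy setting the error typically varies by orders of magnitude across ε, which is exactly why a principled theory for choosing the shape parameter is of interest.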
Sign up here to join our mailing list and receive announcements. If your browser automatically signs you into a Google account, it may be easiest to join with a university account through an incognito window. For other concerns, please reach out to one of the organizers.
Ricardo Baptista (University of Toronto)
Wuyang Chen (Simon Fraser University)
Bin Dong (Peking University)
Lyudmila Grigoryeva (University of St. Gallen)
Boumediene Hamzi (Caltech)
Yuka Hashimoto (NTT)
Qianxiao Li (National University of Singapore)
Lizao Li (Google)
George Stepaniants (Caltech)
Zhiqin Xu (Shanghai Jiao Tong University)
Simon Shaolei Du (University of Washington)
Franca Hoffmann (Caltech)
Surbhi Goel (Microsoft Research NY)
Issa Karambal (Quantum Leap Africa)
Tiffany Vlaar (University of Glasgow)
Chao Ma (Stanford University)
Song Mei (UC Berkeley)
Philipp Petersen (University of Vienna)
Matthew Thorpe (University of Warwick)
Stephan Wojtowytsch (University of Pittsburgh)