Spring 2026
Tuesday, January 13, 2026
Title: Stability, instability, and extension of variational inference
Abstract: Variational inference (VI) is a popular alternative to Markov chain Monte Carlo (MCMC) for approximating high-dimensional target distributions. At its core, VI approximates a high-dimensional target distribution, typically specified via an unnormalized density, by a simpler variational family. Despite its empirical successes, the theoretical properties of variational inference have only recently begun to be understood. In this talk, I will discuss recent developments in the theory of variational inference from an optimal transport perspective. In the first part, I will present our recent results on the stability and instability of mean-field variational inference (MFVI). Our main insight is simple: when the target distribution is strongly log-concave, MFVI is quantitatively stable under perturbations of the target, whereas even for simple non–log-concave targets such as a mixture of two Gaussians, MFVI provably suffers from mode collapse. The consequences of our results are discussed, including guarantees for robust Bayesian inference and a quantitative Bernstein–von Mises theorem. In the second part of the talk, I will present our work on the statistical and computational theory for a class of structured variational inference in which the variational family consists of all star-shaped distributions. We establish quantitative approximation guarantees and provide a polynomial-time algorithm for solving the VI problem when the target distribution is strongly log-concave. We also discuss concrete examples, including generalized linear models with Gaussian likelihoods. This talk is based on joint work with Shunan Sheng, Alberto González-Sanz, Marcel Nutz, Sinho Chewi, Binghe Zhu, and Aram Pooladian.
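For readers less familiar with the setup, here is a minimal schematic of the mean-field VI problem; the notation is illustrative and not necessarily the speaker's.

```latex
% Schematic of the (mean-field) variational inference problem.
% Notation is illustrative and may differ from the talk.
\[
  \hat{q} \;=\; \operatorname*{arg\,min}_{q \in \mathcal{Q}} \; \mathrm{KL}(q \,\|\, \pi),
  \qquad
  \pi(x) \propto e^{-V(x)} \ \text{on } \mathbb{R}^d .
\]
% For mean-field VI, $\mathcal{Q}$ is the family of product measures
% $q = q_1 \otimes \cdots \otimes q_d$. Mode collapse refers to the minimizer
% $\hat{q}$ concentrating near a single mode of a multimodal target, e.g.\ a
% mixture of two well-separated Gaussians.
```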
Tuesday, January 20, 2026
Speaker: Jiequn Han (Flatiron Institute) [Zoom Link]
Title: DriftLite: Lightweight drift control for inference-time scaling of diffusion models
Abstract: We study inference-time scaling for diffusion models, where a pre-trained model is adapted to new target distributions without retraining. Guidance-based methods are simple but biased, while particle-based approaches such as Sequential Monte Carlo often suffer from weight degeneracy and high computational cost. We introduce DriftLite, a lightweight, training-free particle-based method that steers inference dynamics on the fly with provably optimal stability control. DriftLite exploits a previously unexplored degree of freedom in the Fokker–Planck equation between the drift and the particle potential, leading to two practical schemes, Variance- and Energy-Controlling Guidance (VCG/ECG), which approximate the optimal drift with minimal overhead. Across Gaussian mixture models, interacting particle systems, and large-scale protein–ligand co-folding problems, DriftLite consistently reduces variance and improves sample quality compared to pure guidance and Sequential Monte Carlo baselines.
If time permits, I will also briefly introduce self-consistent stochastic interpolants, which enable generative modeling from indirect, noisy observations and substantially extend applicability to many scientific and engineering problems where clean data are unavailable.
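As a rough illustration of the drift/weight trade-off the DriftLite abstract alludes to, here is one standard way to write the evolution of a weighted particle system; this is a schematic under my own reading of the abstract, not necessarily the paper's formulation.

```latex
% Schematic only: a weighted-particle Fokker--Planck identity, not necessarily
% DriftLite's exact formulation. Particles follow drift u_t plus Brownian noise
% and carry weights with log-weight rate g_t; rho_t is the weighted density.
\[
  \partial_t \rho_t
  \;=\; -\nabla\!\cdot(\rho_t u_t) \;+\; \Delta \rho_t
        \;+\; \bigl(g_t - \mathbb{E}_{\rho_t}[g_t]\bigr)\rho_t .
\]
% Degree of freedom: for any vector field v_t, replacing (u_t, g_t) by
% (u_t + v_t, g_t + \nabla\cdot(\rho_t v_t)/\rho_t) leaves the equation, and hence
% the density path rho_t, unchanged, because the added weight term integrates to
% zero against rho_t. Choosing this extra drift well can keep the weights nearly
% uniform, which is the kind of stability control described above.
```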
Tuesday, January 27, 2026
Speaker: Manon Michel (CNRS, Université Clermont-Auvergne) [Zoom Link]
Title: Harnessing Newtonian dynamics in generative models
Abstract: Generative modeling seeks to learn and sample from complex, high-dimensional data distributions and plays a key role in machine learning, Bayesian inference, and computational physics. One powerful class of methods, called Normalizing Flows, works by gradually transforming simple random noise into complex data using reversible steps. While these models are reliable and mathematically well-understood, they can become slow and expensive to use when dealing with high dimensions. In this talk, I will explain how ideas from classical physics, in particular, the laws that govern motion, can be used to build more efficient and intuitive generative models. By designing these models to follow classical dynamics and using neural networks only where they are most helpful, we can further reduce computational costs while making the models more stable and easier to interpret.
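To make the cost issue concrete, recall the standard change-of-variables identity behind normalizing flows; the link to volume-preserving dynamics sketched in the comments is my own reading, and the speaker's construction may differ.

```latex
% Standard change-of-variables identity for an invertible map f taking data x
% to base noise z = f(x) with simple density p_Z (notation mine).
\[
  \log p_X(x) \;=\; \log p_Z\bigl(f(x)\bigr) \;+\; \log\bigl|\det Df(x)\bigr| .
\]
% The d x d Jacobian determinant is the term that becomes expensive in high
% dimension unless f is structured. Hamiltonian (Newtonian) dynamics are
% volume-preserving by Liouville's theorem, so for such maps |det Df| = 1,
% which is one natural way physics-based dynamics can reduce this cost.
```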
Tuesday, February 3, 2026
Speaker: Ethan N. Epperly (UC Berkeley) [Zoom Link]
Title: What is the role of Monte Carlo in randomized linear algebra?
Abstract: The Monte Carlo and computational mathematics research communities have traditionally not had a lot of overlap. "Our experience suggests that many practitioners of scientific computing view randomized algorithms as a desperate and final resort," writes one influential paper. Yet the advent of the field of randomized linear algebra has created new opportunities for dialog between Monte Carlo researchers and computational mathematicians. This talk will provide an overview of ways Monte Carlo methodologies have been used to improve randomized linear algebraic algorithms, with a focus on the speaker's research. The talk will identify three distinct roles for Monte Carlo in randomized linear algebra: solving problems of intractable scale, reducing variance, and solving challenging sampling problems.
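As one concrete and standard instance of Monte Carlo inside linear algebra (purely illustrative, not the speaker's algorithms), here is a minimal sketch of Hutchinson's randomized trace estimator, which accesses a matrix only through matrix–vector products.

```python
# Minimal illustrative sketch (not the speaker's methods): Hutchinson's
# Monte Carlo trace estimator, a standard example of randomness in
# numerical linear algebra.
import numpy as np

def hutchinson_trace(matvec, dim, num_samples=200, seed=None):
    """Estimate tr(A) using only products z -> A z.

    Rademacher (+/-1) probes satisfy E[z^T A z] = tr(A), so averaging
    independent probes gives an unbiased Monte Carlo estimate.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe vector
        total += z @ matvec(z)
    return total / num_samples

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((500, 500))
    A = A @ A.T                                # symmetric test matrix
    est = hutchinson_trace(lambda z: A @ z, dim=500)
    print(f"estimate: {est:.1f}, exact: {np.trace(A):.1f}")
```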
Tuesday, February 10, 2026
Speaker: Aaron Smith (University of Ottawa) [Zoom Link]
Title: Some Speedups From Fast Convergence of Important Test Functions
Abstract: From classical spectral analysis of Markov chains on finite state spaces, we know that some non-Markovian functions f(X_t) can converge much more quickly (or in much stronger norm) than the full Markov chain X_t. Several authors have used this to speed up MCMC, obtaining large improvements in the "simple" case of multimodality induced by label-switching or more subtle improvements for statistically relevant test functions (see Rabinovitch et al. (2016)). Unfortunately, it is often difficult to get bounds that are both interpretable and offer large speedups. In this talk, I'll discuss some recent work in which we obtain quadratic advantages for two very different reasons in two very different settings: approximate MCMC for graphical models, and exact MCMC for all "relevant" functions of a random rotation matrix. While this work is largely mathematical, the talk will focus on examples and open questions rather than detailed proofs, with a collection of simple worked examples from the literature on function-specific mixing times and applications.
Based on joint work with Vishesh Jain, Na Lin, Yuanyuan Liu, Natesh Pillai, Ashwin Sah, Mehtaab Sawhney, and Vinod Vaikuntanathan.
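To make the opening claim of the abstract concrete, here is the standard spectral identity for a reversible chain; the notation is mine and is only loosely connected to the specific results above.

```latex
% Standard spectral decomposition for an irreducible, aperiodic, reversible
% Markov chain on a finite state space (notation mine).
\[
  \mathbb{E}_x\!\bigl[f(X_t)\bigr] - \pi(f)
  \;=\; \sum_{j \ge 2} \lambda_j^{\,t}\, \langle f, \varphi_j \rangle_\pi\, \varphi_j(x),
\]
% where 1 = \lambda_1 > \lambda_2 \ge \cdots \ge \lambda_n > -1 are the
% eigenvalues and \varphi_j the pi-orthonormal eigenfunctions of the transition
% kernel. If f is orthogonal to the slow eigenfunctions (e.g. those created by
% label switching), then f(X_t) equilibrates at the faster rate
% max{ |\lambda_j| : \langle f, \varphi_j \rangle_\pi \neq 0 }, which can be far
% better than the chain's worst-case mixing rate.
```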