Fall 2025
Tuesday, September 23, 2025
Speaker: Alexandre Bouchard-Côté (University of British Columbia) [Zoom Link]
Title: Computational Lebesgue integration
Abstract: In many modern applications in science and engineering, we seek to reconstruct a complicated object x from noisy data y; for example, one may seek to reconstruct an evolutionary tree from sequencing data. In principle, Bayesian statistics provides a broad framework to approach such problems, by modelling the unknowns and the knowns as random variables X and Y. Since the notion of a posterior distribution, X|Y, is defined under very general conditions, Bayesian inference is in a sense universal for the purpose of data analysis. In contrast, other inferential setups often require, among other things, that x be real-valued in order to use approximations such as those based on the central limit theorem.
However, this generality hinges on being able to approximate expectations with respect to an arbitrary measure. Can we develop generic sampling methods in such an unstructured context? Surprisingly, practical methodologies are indeed possible. I will describe some of our work in the area, with a focus on recent developments based on regenerative MCMC, particle methods, and non-reversibility. My group is also working on making these complex Monte Carlo methods easy to use: check out https://pigeons.run/dev/, a package that allows users to leverage clusters of thousands of nodes to speed up difficult Monte Carlo problems without requiring knowledge of distributed algorithms.
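As a concrete illustration of what "approximating expectations with respect to an arbitrary measure" means in practice, here is a minimal sketch (not the speaker's method, and far simpler than the regenerative and non-reversible samplers the talk covers): a random-walk Metropolis-Hastings estimator of E[f(X)] that needs only a computable unnormalized log-density. The target and f below are illustrative placeholders.

```python
# Minimal sketch: estimate E[f(X)] under an unnormalized target with
# random-walk Metropolis-Hastings. Any computable log-density works.
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Illustrative unnormalized log-density (placeholder).
    return -0.5 * np.sum(x ** 2) - 0.1 * np.sum(x ** 4)

def f(x):
    return np.sum(x ** 2)  # quantity whose expectation we want

x = np.zeros(2)
lp = log_target(x)
total, n_samples, step = 0.0, 50_000, 0.5
for _ in range(n_samples):
    prop = x + step * rng.standard_normal(x.shape)
    lp_prop = log_target(prop)
    if np.log(rng.uniform()) < lp_prop - lp:  # MH accept/reject
        x, lp = prop, lp_prop
    total += f(x)

print("estimate of E[f(X)]:", total / n_samples)
```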
Tuesday, September 30, 2025
Speaker: Yian Ma (University of California San Diego) [Zoom Link]
Title: MCMC, variational inference, and reverse diffusion Monte Carlo
Abstract: I will introduce some recent progress towards understanding the scalability of Markov chain Monte Carlo (MCMC) methods and their comparative advantage over variational inference. I will fact-check the folklore that "variational inference is fast but biased, MCMC is unbiased but slow". I will then discuss a combination of the two via reverse diffusion, which holds promise for solving some multi-modal problems. This talk is motivated by the need for Bayesian computation in reinforcement learning problems, as well as by the differential privacy requirements that we face.
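A toy check of the folklore the abstract fact-checks (illustrative only, not from the talk): against a correlated Gaussian target, the KL(q||p)-optimal mean-field Gaussian has component variances 1/Λ_ii (a standard closed-form result, where Λ is the target precision), which underestimates the true marginal variances; MCMC is consistent but needs many iterations when the correlation is strong.

```python
# VI is fast but biased, MCMC unbiased but slow: a 2D Gaussian example.
import numpy as np

rng = np.random.default_rng(1)
rho = 0.9
Sigma = np.array([[1.0, rho], [rho, 1.0]])
Lam = np.linalg.inv(Sigma)  # precision matrix

# Mean-field Gaussian VI: optimal factorized variances are 1 / Lam_ii.
vi_var = 1.0 / np.diag(Lam)
print("true marginal variances:", np.diag(Sigma))  # [1.0, 1.0]
print("mean-field VI variances:", vi_var)          # 1 - rho^2 = 0.19

# Random-walk Metropolis: consistent, but mixes slowly for high rho.
def log_p(z):
    return -0.5 * z @ Lam @ z

x, lp = np.zeros(2), 0.0
lp = log_p(x)
samples = []
for _ in range(200_000):
    prop = x + 0.3 * rng.standard_normal(2)
    lpp = log_p(prop)
    if np.log(rng.uniform()) < lpp - lp:
        x, lp = prop, lpp
    samples.append(x.copy())
print("MCMC marginal variances:", np.var(np.array(samples), axis=0))
```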
Tuesday, October 7, 2025
Speaker: Michael Albergo (Harvard) [Zoom Link]
Title: Non-equilibrium transport and tilt matching for sampling
Abstract: We propose a simple, scalable algorithm for using stochastic interpolants to perform sampling from unnormalized densities and to fine-tune generative models. The approach, Tilt Matching, arises from a dynamical equation relating the velocity field of a flow matching method to the velocity field that would target the same distribution tilted by a reward. As such, the new velocity inherits the regularity of stochastic interpolant transport plans while also being the minimizer of an objective function with strictly lower variance than flow matching itself. The update to the velocity field that emerges from this simple regression problem can be interpreted as the sum of all joint cumulants of the stochastic interpolant and copies of the reward, and to first order is their covariance. We define two versions of the method, Explicit and Implicit Tilt Matching. The algorithms do not require any access to gradients of the reward or backpropagation through trajectories of the flow or diffusion. We empirically verify that the approach is efficient, unbiased, and highly scalable, providing state-of-the-art results on sampling under Lennard-Jones potentials and competitive performance on fine-tuning Stable Diffusion, without requiring reward multipliers. It can also be straightforwardly applied to tilting few-step flow map models.
Links: YouTube
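For readers unfamiliar with reward tilting, one schematic way to realize it (not claimed to be the Explicit or Implicit Tilt Matching update from the talk) is to reweight the standard flow-matching regression by self-normalized exp(reward) weights, which indeed requires no reward gradients. All names, targets, and the reward below are illustrative placeholders.

```python
# Schematic sketch: reward-weighted flow matching on a linear interpolant.
import torch
import torch.nn as nn

def reward(x):  # hypothetical reward on end samples x1
    return -((x - 2.0) ** 2).sum(dim=1)

net = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    x0 = torch.randn(256, 2)                # base samples
    x1 = torch.randn(256, 2) + 1.0          # stand-in "data" samples
    t = torch.rand(256, 1)
    xt = (1 - t) * x0 + t * x1              # linear stochastic interpolant
    v_target = x1 - x0                      # interpolant velocity
    w = torch.softmax(reward(x1), dim=0)    # self-normalized tilt weights
    v = net(torch.cat([xt, t], dim=1))
    loss = (w * ((v - v_target) ** 2).sum(dim=1)).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```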
Tuesday, October 14, 2025
Speaker: Ruqi Zhang (Purdue University) [Zoom Link]
Title: Gradient-Based Discrete Sampling: Algorithms and Applications
Abstract: Sampling from discrete distributions is a core challenge in machine learning and statistics. In this talk, I will present a line of work on gradient-based discrete sampling that extends the idea of Langevin dynamics to discrete spaces. I will begin with the first discrete Langevin algorithm, which provides a principled way to leverage gradient information for sampling in discrete domains. Building on this, I will discuss a series of extensions: cyclical step-size schedules for multimodal distributions, reheating mechanisms for combinatorial optimization, and Newton’s series approximation for non-differentiable functions. In the final part, I will showcase applications of these methods, from classic models like Ising models and energy-based models, to modern challenges like large language models and drug molecular design. Together, these advances demonstrate how gradient-based methods can make discrete sampling more scalable, efficient, and impactful for real-world problems.
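To make the idea of gradient-informed proposals concrete, here is a minimal sketch of a single-flip sampler on an Ising-type model, in the spirit of this line of work but not the talk's exact discrete Langevin algorithm. For the quadratic energy log p(x) ∝ xᵀWx + bᵀx with x in {-1,+1}^d, the gradient 2Wx + b of the relaxed energy gives the exact log-probability change of each flip, which the proposal exploits.

```python
# Gradient-informed single-flip MH sampler on an Ising-type model (sketch).
import numpy as np

rng = np.random.default_rng(2)
d = 20
W = rng.standard_normal((d, d)) * 0.1
W = np.triu(W, 1)
W = W + W.T                  # symmetric, zero diagonal
b = rng.standard_normal(d) * 0.1

def flip_gains(x):
    # Delta log p from flipping coordinate i: -2 x_i (2 W x + b)_i
    return -2.0 * x * (2.0 * W @ x + b)

x = rng.choice([-1.0, 1.0], size=d)
for _ in range(5000):
    gains = flip_gains(x)
    q = np.exp(0.5 * gains)
    q /= q.sum()                             # informed flip proposal
    i = rng.choice(d, p=q)
    x_new = x.copy()
    x_new[i] = -x_new[i]
    q_new = np.exp(0.5 * flip_gains(x_new))
    q_new /= q_new.sum()
    # MH correction: target ratio exp(gains[i]) times proposal ratio
    log_acc = gains[i] + np.log(q_new[i]) - np.log(q[i])
    if np.log(rng.uniform()) < log_acc:
        x = x_new
```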
Tuesday, October 21, 2025
Speaker: Alex Shestopaloff (Queen Mary University of London) [Zoom Link]
Title: A unifying framework for generalised Bayesian online learning in non-stationary environments
Abstract: I will talk about some of our work on methods for probabilistic online learning in non-stationary environments. In particular, I will talk about the framework BONE, which stands for generalised (B)ayesian (O)nline learning in (N)on-stationary (E)nvironments. BONE provides a common structure to tackle a variety of problems, including online continual learning, prequential forecasting, and contextual bandits. The framework requires specifying three modelling choices: (i) a model for measurements (e.g., a neural network), (ii) an auxiliary process to model non-stationarity (e.g., the time since the last changepoint), and (iii) a conditional prior over model parameters (e.g., a multivariate Gaussian). The framework also requires two algorithmic choices used to carry out approximate inference: (i) an algorithm to estimate beliefs (the posterior distribution) about the model parameters given the auxiliary variable, and (ii) an algorithm to estimate beliefs about the auxiliary variable. We show how the modularity of the framework allows many existing methods to be reinterpreted as instances of BONE, and allows us to propose new methods. This is work with Gerardo Duran-Martin, Leandro Sanchez-Betancourt and Kevin P. Murphy.
Links: YouTube
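A schematic skeleton of how the three modelling choices and two algorithmic choices might slot together (hypothetical names, not the BONE reference implementation): each component is a pluggable callable, and one online step updates beliefs about the auxiliary variable and then about the parameters.

```python
# Hypothetical skeleton of a modular online-learning agent (sketch only).
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class OnlineAgent:
    measurement_model: Callable[..., Any]  # (i) e.g. a neural network
    auxiliary_process: Any                 # (ii) e.g. runlength since changepoint
    conditional_prior: Callable[..., Any]  # (iii) prior over params given aux
    update_params: Callable[..., Any]      # (a) beliefs about params given aux
    update_aux: Callable[..., Any]         # (b) beliefs about the aux variable

    def step(self, param_belief, aux_belief, x, y):
        # Update beliefs about the auxiliary variable, then the parameters.
        aux_belief = self.update_aux(aux_belief, param_belief, x, y,
                                     self.auxiliary_process)
        param_belief = self.update_params(param_belief, aux_belief, x, y,
                                          self.measurement_model,
                                          self.conditional_prior)
        return param_belief, aux_belief
```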
Tuesday, October 28, 2025
Speaker: Giorgos Vasdekis (Newcastle University) [Zoom Link]
Title: Sampling with time-changed Markov processes
Different Time for US/Asia: [ 9:30 am PT ] = [ 12:30 pm ET ] = [ 4:30 pm London ] = [ 5:30 pm Paris ] = [ 12:30 am Beijing, +1 day ]
Abstract: We introduce a framework of time-changed Markov processes to speed up the convergence of Markov chain Monte Carlo (MCMC) algorithms in the context of multimodal distributions and rare event simulation. The time-changed process is defined by adjusting the speed of time of a base process via a user-chosen, state-dependent function. We apply this framework to several Markov processes from the MCMC literature, such as Langevin diffusions and piecewise deterministic Markov processes, obtaining novel modifications of classical algorithms and also re-discovering known MCMC algorithms. We prove theoretical properties of the time-changed process under suitable conditions on the base process, focusing on the relationship between the stationary distributions and on qualitative convergence properties such as geometric and uniform ergodicity, as well as a functional central limit theorem. Time permitting, we will compare our approach with the framework of space transformations, clarifying the similarities between the two approaches. This is joint work with Andrea Bertazzi. The talk is based on the following preprint: https://arxiv.org/abs/2501.15155.
Links: Paper
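A discrete-time caricature of the time-change idea (an assumption-laden sketch, not the paper's algorithm): speeding up a process by a factor s(x) rescales its stationary density by 1/s(x), so one can run a base chain targeting π·s and undo the tilt by averaging with holding-time weights 1/s(x). The target and speed function below are illustrative.

```python
# Time-change caricature: MH chain on pi * s, reweighted by 1/s to recover pi.
import numpy as np

rng = np.random.default_rng(3)

def log_pi(x):
    # Bimodal 1D target: equal mixture of N(-3, 1) and N(3, 1) (unnormalized),
    # so E_pi[x^2] = 1 + 9 = 10.
    return np.logaddexp(-0.5 * (x - 3) ** 2, -0.5 * (x + 3) ** 2)

def s(x):
    # User-chosen speed: faster in low-density regions between the modes.
    return 1.0 + x ** 2

x = 0.0
lp = log_pi(x) + np.log(s(x))
num, den = 0.0, 0.0
for _ in range(100_000):
    prop = x + rng.standard_normal()
    lpp = log_pi(prop) + np.log(s(prop))  # base chain targets pi * s
    if np.log(rng.uniform()) < lpp - lp:
        x, lp = prop, lpp
    w = 1.0 / s(x)                        # holding time of time-changed process
    num += w * x ** 2                     # estimate E_pi[x^2] (should be ~10)
    den += w
print("E_pi[x^2] estimate:", num / den)
```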