All About that Series is a periodic seminar focusing on computational statistics and machine learning methods, mostly within the Bayesian paradigm. It delves into solutions to challenging learning problems arising from a wide range of applications, such as ecology, population genetics, signal processing and energy optimisation, to name a few. It is part of the activities of the specialised group "Statistique Bayésienne" of the SFdS.
Current organiser: Kaniav Kamary
Previous organisers: from 2019 to 2025, the seminar was organised by Sylvain Le Corff (LPSM, Sorbonne Université) and Julien Stoehr (Université Paris-Dauphine).
If you want to receive details on upcoming talks, sign up for the newsletter here!
Access to SCAI: https://iscd.sorbonne-universite.fr/about/contact/
The afternoon session is open to everyone, but please confirm your participation by registering via the following link: https://forms.office.com/e/APAHDfyYfQ
Clément Bonet (ENSAE) - Mirror and Preconditioned Gradient Descent in Wasserstein Space
Abstract: As the problem of minimizing functionals on the Wasserstein space encompasses many applications in machine learning, different optimization algorithms on $\mathbb{R}^d$ have received their counterparts on the Wasserstein space. We focus here on lifting two explicit algorithms: mirror descent and preconditioned gradient descent. These algorithms have been introduced to better capture the geometry of the function to minimize and are provably convergent under appropriate (namely relative) smoothness and convexity conditions. Adapting these notions to the Wasserstein space, we prove guarantees of convergence of some Wasserstein-gradient-based discrete-time schemes for new pairings of objective functionals and regularizers. The difficulty here is to carefully select along which curves the functionals should be smooth and convex. We illustrate the advantages of adapting the geometry induced by the regularizer on ill-conditioned optimization tasks, and showcase the improvement of choosing different discrepancies and geometries in a computational biology task of aligning single cells.
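The Euclidean ancestors of these schemes are easy to demonstrate. Below is a minimal sketch of mirror descent with the entropic mirror map on the probability simplex; the objective (a KL divergence), the step size and the target `p` are illustrative choices, not taken from the talk.

```python
import numpy as np

def mirror_descent_simplex(grad, x0, steps=200, lr=0.5):
    """Mirror descent on the probability simplex with the entropic
    mirror map, i.e. the exponentiated-gradient update."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x * np.exp(-lr * grad(x))  # multiplicative (mirror) step
        x /= x.sum()                   # Bregman projection onto the simplex
    return x

# Illustrative objective: KL(x || p), whose gradient is log(x/p) + 1.
p = np.array([0.5, 0.3, 0.2])
grad_kl = lambda x: np.log(x / p) + 1.0
x_star = mirror_descent_simplex(grad_kl, np.ones(3) / 3)  # converges to p
```

With the entropic mirror map the update keeps the iterates on the simplex by construction; matching the geometry to the objective in this way is the idea the talk lifts to the Wasserstein space.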
Antoine Godichon Baggioni (Sorbonne Université) - Stochastic Newton algorithms with O(Nd) operations
Abstract: The majority of machine learning methods can be regarded as the minimization of an unavailable risk function. To optimize this function using samples provided in an online fashion, stochastic gradient descent is a common tool. However, it can be highly sensitive to ill-conditioned problems. To address this issue, we focus on Stochastic Newton methods. We first examine a version based on the Riccati (or Sherman-Morrison) formula, which allows recursive estimation of the inverse Hessian with reduced computational time. Specifically, we show that this method leads to asymptotically efficient estimates and requires $O(Nd^2)$ operations (where N is the sample size and d is the dimension). Finally, we explore how to adapt the Stochastic Newton algorithm to a streaming context, where data arrives in blocks, and demonstrate that this approach can reduce the computational requirement to $O(Nd)$ operations.
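The Riccati (Sherman-Morrison) formula mentioned in the abstract lets one maintain an inverse-Hessian estimate recursively, each rank-one update costing $O(d^2)$ instead of a full $O(d^3)$ inversion. A minimal sketch of that recursion (the random updates below are illustrative, not the talk's estimator):

```python
import numpy as np

def sherman_morrison_update(A_inv, u):
    """Return (A + u u^T)^{-1} from A^{-1} via the Sherman-Morrison
    (Riccati) formula, in O(d^2) operations."""
    Au = A_inv @ u
    return A_inv - np.outer(Au, Au) / (1.0 + u @ Au)

# Sanity check of the recursion on random rank-one updates (illustrative).
rng = np.random.default_rng(0)
d = 5
A, A_inv = np.eye(d), np.eye(d)
for _ in range(20):
    u = rng.standard_normal(d)
    A += np.outer(u, u)
    A_inv = sherman_morrison_update(A_inv, u)  # tracks inv(A) recursively
```

In a stochastic Newton scheme, `u` would be built from the current observation with suitable step weights; the sketch only checks the algebraic identity that makes the recursion cheap.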
The afternoon session is open to everyone, but please confirm your participation by registering via the following link: https://forms.office.com/e/me7PJmkQDm
Joshua Bon (Université Paris Dauphine) - Bayesian score calibration for approximate models
Abstract: Scientists continue to develop increasingly complex mechanistic models to reflect their knowledge more realistically. Statistical inference using these models can be challenging since the corresponding likelihood function is often intractable and model simulation may be computationally burdensome. Fortunately, in many of these situations, it is possible to adopt a surrogate model or approximate likelihood function. It may be convenient to conduct Bayesian inference directly with the surrogate, but this can result in bias and poor uncertainty quantification. In this paper (https://arxiv.org/abs/2211.05357) we propose a new method for adjusting approximate posterior samples to reduce bias and produce more accurate uncertainty quantification. We do this by optimizing a transform of the approximate posterior that maximizes a scoring rule. Our approach requires only a (fixed) small number of complex model simulations and is numerically stable. We demonstrate beneficial corrections to several approximate posteriors using our method on several examples of increasing complexity.
Giacomo Zanella (Bocconi University) - Entropy contraction of the Gibbs sampler under log-concavity
Abstract: In this talk I will present recent work (https://arxiv.org/abs/2410.00858) on the non-asymptotic analysis of the Gibbs sampler, a classical and popular MCMC algorithm for sampling. In particular, under the assumption that the probability measure π of interest is strongly log-concave, we show that the random scan Gibbs sampler contracts in relative entropy, and provide a sharp characterization of the associated contraction rate. The result implies that, under appropriate conditions, the number of full evaluations of π required for the Gibbs sampler to converge is independent of the dimension. If time permits, I will also discuss connections and applications of the above results to the problem of zero-order parallel sampling, as well as extensions to Hit-and-Run and Metropolis-within-Gibbs.
Based on joint work with Filippo Ascolani and Hugo Lavenant.
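For intuition, here is a toy random-scan Gibbs sampler on a strongly log-concave target: a bivariate Gaussian with unit variances and correlation `rho` (an illustrative example, not taken from the paper).

```python
import numpy as np

def random_scan_gibbs(rho, n_iter=5000, seed=1):
    """Random-scan Gibbs sampler for a bivariate Gaussian with unit
    variances and correlation rho (a strongly log-concave target)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(2)
    samples = np.empty((n_iter, 2))
    for t in range(n_iter):
        i = rng.integers(2)  # pick a coordinate uniformly at random
        # Full conditional: x_i | x_j ~ N(rho * x_j, 1 - rho^2).
        x[i] = rho * x[1 - i] + np.sqrt(1.0 - rho**2) * rng.standard_normal()
        samples[t] = x
    return samples

samples = random_scan_gibbs(rho=0.5)
```

Each step only requires the full conditional of one coordinate, which is the sense in which the talk counts "full evaluations" of the target.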
Paul Bastide (Université Paris Cité) - Goodness of Fit for Bayesian Generative Models with Applications in Population Genetics
Abstract: In population genetics, inference about intractable likelihood models is common, and simulation methods, including Approximate Bayesian Computation (ABC) and Simulation-Based Inference (SBI), are essential. ABC/SBI methods work by simulating instrumental data sets of the models under study and comparing them with the observed data set $y_{obs}$. Advanced machine learning tools are used for tasks such as model selection and parameter inference. The present work focuses on model criticism. This type of analysis, called goodness of fit (GoF), is important for model validation. It can also be used for model pruning when the number of candidates to be considered is excessive, especially in the context where data simulation is expensive. We introduce two new GoF tests based on the local outlier factor (LOF), an indicator that was initially defined for outlier and novelty detection. We test whether $y_{obs}$ is distributed from the prior predictive distribution (pre-inference GoF) and whether there is a parameter value such that $y_{obs}$ is distributed from the likelihood with that value (post-inference GoF). We evaluate the performance of our two GoF tests on simulated datasets from three different model settings of varying complexity, and on a dataset of single nucleotide polymorphism (SNP) markers for the evaluation of complex evolutionary scenarios of modern human populations.
Joint work with Guillaume Le Mailloux, Jean-Michel Marin and Arnaud Estoup.
The afternoon session is open to everyone, but please confirm your participation by registering via the following link: https://forms.office.com/e/fuVzYurNRY
Stanislas Strasman (Sorbonne Université) - An analysis of the noise schedule for score-based generative models
Abstract: Score-based generative models (SGMs) aim at estimating a target data distribution by learning score functions using only noise-perturbed samples from the target. Recent literature has focused extensively on assessing the error between the target and estimated distributions, gauging the generative quality through the Kullback-Leibler (KL) divergence and Wasserstein distances. Under mild assumptions on the data distribution, we establish an upper bound for the KL divergence between the target and the estimated distributions, explicitly depending on any time-dependent noise schedule. Under additional regularity assumptions, taking advantage of favorable underlying contraction mechanisms, we provide a tighter error bound in Wasserstein distance compared to state-of-the-art results. In addition to being tractable, this upper bound jointly incorporates properties of the target distribution and SGM hyperparameters that need to be tuned during training.
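As a concrete instance of a time-dependent noise schedule, here is the forward marginal of the variance-preserving diffusion with a linear schedule beta(t) — the standard VP-SDE parametrisation; the particular `beta_min`/`beta_max` values are common illustrative defaults, not the talk's choices.

```python
import numpy as np

def vp_forward_marginal(x0, t, beta_min=0.1, beta_max=20.0):
    """Marginal of the variance-preserving forward diffusion at time
    t in [0, 1] with a linear schedule beta(t): the conditional law is
    x_t | x_0 ~ N(alpha_t * x0, 1 - alpha_t^2)."""
    # alpha_t = exp(-0.5 * int_0^t beta(s) ds), with beta(s) linear in s.
    integral = beta_min * t + 0.5 * (beta_max - beta_min) * t**2
    alpha = np.exp(-0.5 * integral)
    return alpha * x0, 1.0 - alpha**2
```

The KL bound discussed in the talk depends explicitly on such a schedule through $\alpha_t$; at $t = 1$ the marginal is close to a standard Gaussian regardless of $x_0$.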
Geneviève Robin (Owkin) - Generative methods for sampling transition paths in molecular dynamics
Abstract: Molecular systems often remain trapped for long times around some local minimum of the potential energy function, before switching to another one -- a behavior known as metastability. Simulating transition paths linking one metastable state to another one is difficult by direct numerical methods. In view of the promises of machine learning techniques, we explore in this work two approaches to more efficiently generate transition paths: sampling methods based on generative models such as variational autoencoders, and importance sampling methods based on reinforcement learning.
Gabriel Victorino Cardoso (École des Mines) - Solving inverse problems with score-based priors
Abstract: Solving ill-posed (Bayesian) inverse problems generally relies on the power of the prior distribution (or data fidelity term). In this talk, we focus on how to use an off-the-shelf score-based generative model as a prior, and how to modify the inner sampling procedure of the generative model to sample (approximately) from the posterior distribution. This is done without retraining the off-the-shelf generative model. We then present how we have used this procedure to solve inverse problems arising in electrocardiogram analysis.
References:
[1] Gabriel Cardoso, Yazid Janati, Sylvain Le Corff, and Eric Moulines. Monte Carlo guided Denoising Diffusion models for Bayesian linear inverse problems. The Twelfth International Conference on Learning Representations. 2023.
[2] Cardoso, G. V., Bedin, L., Duchateau, J., Dubois, R., & Moulines, E. (2023). Bayesian ECG reconstruction using denoising diffusion generative models. To appear in NeurIPS 2024.
Guanyang Wang (Rutgers University) - MCMC when you do not want to evaluate the target distribution
Abstract: In sampling tasks, it is common for target distributions to be known up to a normalizing constant. However, in many situations, evaluating even the unnormalized distribution can be costly or infeasible. This issue arises in scenarios such as sampling from Bayesian posteriors for large datasets and from 'doubly intractable' distributions. We provide a way to unify various MCMC algorithms, including several minibatch MCMC algorithms and the exchange algorithm. This framework not only simplifies the theoretical analysis of existing algorithms but also creates new algorithms. Similar frameworks exist in the literature, but they concentrate on different objectives.
François Caron (University of Oxford) - Deep Neural Networks with Dependent Weights: Gaussian Process Mixture Limit, Heavy Tails, Sparsity and Compressibility
Abstract: This article studies the infinite-width limit of deep feedforward neural networks whose weights are dependent, and modelled via a mixture of Gaussian distributions. Each hidden node of the network is assigned a nonnegative random variable that controls the variance of the outgoing weights of that node. We make minimal assumptions on these per-node random variables: they are iid and their sum, in each layer, converges to some finite random variable in the infinite-width limit. Under this model, we show that each layer of the infinite-width neural network can be characterised by two simple quantities: a non-negative scalar parameter and a Lévy measure on the positive reals. If the scalar parameters are strictly positive and the Lévy measures are trivial at all hidden layers, then one recovers the classical Gaussian process (GP) limit, obtained with iid Gaussian weights. More interestingly, if the Lévy measure of at least one layer is non-trivial, we obtain a mixture of Gaussian processes (MoGP) in the large-width limit. The behaviour of the neural network in this regime is very different from the GP regime. One obtains correlated outputs, with non-Gaussian distributions, possibly with heavy tails. Additionally, we show that, in this regime, the weights are compressible, and some nodes have asymptotically non-negligible contributions, therefore representing important hidden features. Many sparsity-promoting neural network models can be recast as special cases of our approach, and we discuss their infinite-width limits; we also present an asymptotic analysis of the pruning error. We illustrate some of the benefits of the MoGP regime over the GP regime in terms of representation learning and compressibility on simulated, MNIST and Fashion MNIST datasets.
Elisabeth Gassiat (Université Paris-Saclay) - A stroll through hidden Markov models
Abstract: Hidden Markov models are latent variables models producing dependent sequences. I will survey recent results providing guarantees for their use in various fields such as clustering, multiple testing, nonlinear ICA or variational autoencoders.
Ritabrata Dutta (University of Warwick) - Bayesian Model Averaging with exact inference of likelihood-free Scoring Rule Posteriors
Abstract: A novel application of Bayesian Model Averaging to generative models parameterized with neural networks (GNN) characterized by intractable likelihoods is presented. We leverage a likelihood-free generalized Bayesian inference approach with Scoring Rules. To tackle the challenge of model selection in neural networks, we adopt a continuous shrinkage prior, specifically the horseshoe prior. We introduce an innovative blocked sampling scheme, offering compatibility with both the Boomerang Sampler (a type of piecewise deterministic Markov process sampler) for exact but slower inference and with Stochastic Gradient Langevin Dynamics (SGLD) for faster yet biased posterior inference. This approach serves as a versatile tool bridging the gap between intractable likelihoods and robust Bayesian model selection within the generative modelling framework.
Sylvain Le Corff (Sorbonne Université) - Monte Carlo guided Diffusion for Bayesian linear inverse problems
Joint work with G. Cardoso, Y. Janati, E. Moulines.
Abstract: Ill-posed linear inverse problems that combine knowledge of the forward measurement model with prior models arise frequently in various applications, from computational photography to medical imaging. Recent research has focused on solving these problems with score-based generative models (SGMs) that produce perceptually plausible images, especially in inpainting problems. In this study, we exploit the particular structure of the prior defined in the SGM to formulate recovery in a Bayesian framework as a Feynman--Kac model adapted from the forward diffusion model used to construct score-based diffusion. To solve this Feynman--Kac problem, we propose the use of Sequential Monte Carlo methods. The proposed algorithm, MCGdiff, is shown to be theoretically grounded and we provide numerical simulations showing that it outperforms competing baselines when dealing with ill-posed inverse problems.
Kaniav Kamary (CentraleSupélec) - Bayesian principal component analysis
The technique of principal component analysis (PCA) has recently been expressed as the maximum likelihood solution for a generative latent variable model. In this talk, I'll first present the probabilistic reformulation that forms the basis for a Bayesian treatment of PCA. Then, my focus will be on showing that the effective dimensionality of the latent space (equivalent to the number of retained principal components) can be determined automatically as part of the Bayesian inference procedure.
Meïli Baragatti (ENSAE) - A stroll through Bayesian statistics: an elicitation method, a likelihood-free method, and a simple example in an epidemiological disease-transmission modelling setting
Webpage: http://www.meilibaragatti.fr
Francesca Crucinio (ENSAE) - Optimal Scaling Results for a Wide Class of Proximal MALA Algorithms
We consider a recently proposed class of MCMC methods which uses proximity maps instead of gradients to build proposal mechanisms that can be employed for both differentiable and non-differentiable targets. These methods have been shown to be stable for a wide class of targets, making them a valuable alternative to the Metropolis-adjusted Langevin algorithm (MALA), and have found wide application in imaging contexts. The wider stability properties are obtained by building the Moreau-Yosida envelope of the target of interest, which depends on a parameter $\lambda$. In this work, we investigate the optimal scaling problem for this class of algorithms, which encompasses MALA, and provide practical guidelines for the implementation of these methods.
Joint work with Alain Durmus, Pablo Jiménez, Gareth O. Roberts.
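The Moreau-Yosida construction is easy to illustrate in one dimension for the non-differentiable term $|x|$: its proximity map is soft thresholding, and the envelope's gradient is $\lambda^{-1}(x - \mathrm{prox}(x))$, which is Lipschitz. A sketch (illustrative, not the authors' code):

```python
import numpy as np

def prox_l1(x, lam):
    """Proximity map of lam * |x|: soft thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def moreau_grad(x, lam):
    """Gradient of the Moreau-Yosida envelope of |x| with parameter
    lam, namely (x - prox(x)) / lam; it is (1/lam)-Lipschitz even
    though |x| is not differentiable at the origin."""
    return (x - prox_l1(x, lam)) / lam
```

In proximal MALA, this smoothed gradient stands in for the (possibly undefined) gradient of the non-differentiable part when building the Langevin proposal; the talk's optimal scaling results inform how the step size and $\lambda$ should be tuned jointly.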
Adrian Raftery (University of Washington) - Very Long-Term Bayesian Global Population and Migration Projections for Assessing the Social Cost of Carbon
Population forecasts are used by governments and the private sector for planning purposes, with horizons up to about three generations (to around 2100). The traditional methods are deterministic and scenario-based, but probabilistic forecasts are desired to get an idea of accuracy, to assess changes, and to make decisions involving risks. In a major breakthrough, since 2015 the United Nations has issued probabilistic population forecasts for all countries using a Bayesian methodology. Assessment of the social cost of carbon relies on long-term forecasts of carbon emissions, which in turn rely on even longer-range population and economic forecasts, to 2300. We extend the UN method to very-long-range population forecasts by combining the statistical approach with expert review and elicitation. We find that, while world population is projected to grow for most of the rest of this century, it is likely to stabilize in the 22nd century and to decline in the 23rd century.
Daniele Durante (Bocconi University) - Detective Bayes: Bayesian nonparametric stochastic block modeling of criminal networks
Europol recently defined criminal networks as a modern version of the Hydra mythological creature, with covert structure and multifaceted evolutions. Indeed, relationship data among criminals are subject to measurement errors and structured missingness patterns, and exhibit a complex combination of an unknown number of core-periphery, assortative and disassortative structures that may encode key architectures of the criminal organization. The coexistence of these noisy block patterns limits the reliability of community detection algorithms routinely used in criminology, thereby leading to overly simplified and possibly biased reconstructions of organized crime topologies. In this seminar, I will present a number of model-based solutions which aim at covering these gaps via a combination of stochastic block models and priors for random partitions arising from Bayesian nonparametrics. These include Gibbs-type priors, and random partition priors driven by the urn scheme of a hierarchical normalized completely random measure. Product-partition models to incorporate criminals' attributes, and zero-inflated Poisson representations accounting for weighted edges and secrecy strategies, will also be discussed. Collapsed Gibbs samplers for posterior computation will be presented, and refined strategies for estimation, prediction, uncertainty quantification and model selection will be outlined. Results are illustrated in an application to an Italian Mafia network, where the proposed models unveil a structure of the criminal organization mostly hidden to state-of-the-art alternatives routinely used in criminology. I will conclude the seminar with ideas on how to learn the evolutionary history of the criminal organization from the relationship data among its criminals via a novel combination of latent space models for network data and phylogenetic trees.
Marylou Gabrié (Ecole Polytechnique) - Opportunities and Challenges in Enhancing Sampling with Learning
Deep generative models parametrize very flexible families of distributions able to fit complicated datasets of images or text. In effect, these models provide independent samples from complex high-dimensional distributions at negligible cost. On the other hand, sampling exactly from a target distribution, such as a Bayesian posterior, is typically challenging: because of dimensionality, multi-modality, ill-conditioning, or a combination of these. In this talk, I will review recent works that enhance traditional inference and sampling algorithms with learning. In particular, I will present flowMC, an adaptive MCMC method with Normalizing Flows, along with first applications and remaining challenges.
Webpage: https://marylou-gabrie.github.io/
Filippo Ascolani (Bocconi University) - Clustering consistency with Dirichlet process mixtures
Dirichlet process mixtures are flexible non-parametric models, particularly suited to density estimation and probabilistic clustering. In this work we study the posterior distribution induced by Dirichlet process mixtures as the sample size increases, and more specifically focus on consistency for the unknown number of clusters when the observed data are generated from a finite mixture. Crucially, we consider the situation where a prior is placed on the concentration parameter of the underlying Dirichlet process. Previous findings in the literature suggest that Dirichlet process mixtures are typically not consistent for the number of clusters if the concentration parameter is held fixed and data come from a finite mixture. Here we show that consistency for the number of clusters can be achieved if the concentration parameter is adapted in a fully Bayesian way, as commonly done in practice. Our results are derived for data coming from a class of finite mixtures, with mild assumptions on the prior for the concentration parameter and for a variety of choices of likelihood kernels for the mixture.
Joint work with Antonio Lijoi, Giovanni Rebaudo, and Giacomo Zanella.
Reference: https://arxiv.org/abs/2205.12924 (Biometrika, forthcoming)
Webpage: https://filippoascolani.github.io/
Andrew Gelman (Columbia University) - Prior distribution for causal inference
In Bayesian inference, we must specify a model for the data (a likelihood) and a model for parameters (a prior). Consider two questions:
Why is it more complicated to specify the likelihood than the prior?
In order to specify the prior, how can we switch between the theoretical literature (invariance, normality assumptions, ...) and the applied literature (expert elicitation, robustness, ...)?
I will discuss these questions in the domain of causal inference: prior distributions for causal effects, regression coefficients, and the other parameters in causal models.
Alexandre Bouchard-Côté (University of British Columbia) - Approximation of intractable integrals using non-reversibility and non-linear distribution paths
In the first part of the talk, I will present an adaptive, non-reversible Parallel Tempering (PT) scheme allowing MCMC exploration of challenging problems such as single-cell phylogenetic trees. A sharp divide emerges in the behaviour and performance of reversible versus non-reversible PT schemes: the performance of the former eventually collapses as the number of parallel cores increases, whereas the latter benefits from arbitrarily many available parallel cores. These theoretical results are exploited to develop an adaptive scheme to efficiently optimize over annealing schedules.
In the second half, I will talk about the global communication barrier, a fundamental limit shared by both reversible and non-reversible PT methods, and about our recent work that leverages non-linear annealing paths to provably and practically break that barrier.
My group is also interested in making these advanced non-reversible Monte Carlo methods easily available to data scientists. To do so, we have designed a Bayesian modelling language to perform inference over arbitrary data types using non-reversible, highly parallel algorithms.
References:
Non-Reversible Parallel Tempering: a Scalable Highly Parallel MCMC Scheme (2021). S. Syed, A. Bouchard-Côté, G. Deligiannidis, A. Doucet. Journal of the Royal Statistical Society, Series B. https://rss.onlinelibrary.wiley.com/doi/10.1111/rssb.12464
Parallel Tempering on Optimized Paths (2021). S. Syed, V. Romaniello, T. Campbell, A. Bouchard-Côté. International Conference on Machine Learning (ICML). http://proceedings.mlr.press/v139/syed21a/syed21a.pdf
Software: Blang: Probabilistic Programming for Combinatorial Spaces. A. Bouchard-Côté, K. Chern, D. Cubranic, S. Hosseini, J. Hume, M. Lepur, Z. Ouyang, G. Sgarbi. Journal of Statistical Software (Accepted). https://arxiv.org/abs/1912.10396, https://www.stat.ubc.ca/~bouchard/blang/
Webpage: https://www.stat.ubc.ca/~bouchard/index.html
Arnaud Guyader (LPSM, Sorbonne Université) - On the Asymptotic Normality of Adaptive Multilevel Splitting
Adaptive Multilevel Splitting (AMS) is a Sequential Monte Carlo method for Markov processes that simulates rare events and estimates associated probabilities. Despite its practical efficiency, there are almost no theoretical results on the convergence of this algorithm. The purpose of this talk is to prove both consistency and asymptotic normality results in a general setting. This is done by associating to the original Markov process a level-indexed process, also called a stochastic wave, and by showing that AMS can then be seen as a Fleming-Viot type particle system. This is a joint work with Frédéric Cérou, Bernard Delyon, and Mathias Rousset.
Webpage: https://www.lpsm.paris/pageperso/guyader/index.html
Estelle Kuhn (INRAE, Unité MaIAGE) - Properties of the stochastic approximation EM algorithm with mini-batch sampling
To deal with very large datasets, a mini-batch version of the Markov chain Monte Carlo Stochastic Approximation Expectation-Maximization (MCMC-SAEM) algorithm for general latent variable models is proposed. For exponential family models, the algorithm is shown to be convergent under classical conditions as the number of iterations increases. Numerical experiments illustrate the performance of the mini-batch algorithm in various models. In particular, we highlight that mini-batch sampling results in an important speed-up of the convergence of the sequence of estimators generated by the algorithm. Moreover, insights on the effect of the mini-batch size on the limit distribution are presented. Finally, we illustrate how to use mini-batch sampling in practice to improve results when a constraint on the computing time is given.
Reference: Journal version, ArXiv version
Webpage: http://genome.jouy.inra.fr/~ekuhn/
Julyan Arbel (INRIA Grenoble) - Understanding Priors in Bayesian Neural Networks at the Unit Level
We investigate deep Bayesian neural networks with Gaussian weight priors and a class of ReLU-like nonlinearities. Bayesian neural networks with Gaussian priors are well known to induce an L2, “weight decay”, regularization. Our results characterize a more intricate regularization effect at the level of the unit activations. Our main result establishes that the induced prior distribution on the units before and after activation becomes increasingly heavy-tailed with the depth of the layer. We show that first layer units are Gaussian, second layer units are sub-exponential, and units in deeper layers are characterized by sub-Weibull distributions. Our results provide new theoretical insight on deep Bayesian neural networks, which we corroborate with simulation experiments.
Webpage: https://www.julyanarbel.com/
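The heavier-tails-with-depth phenomenon can be eyeballed with a short simulation — an illustrative sketch with a scalar input, hidden width 50 and He-scaled iid Gaussian weights, none of which is the paper's code. First-layer pre-activations are exactly Gaussian, while second-layer pre-activations are a Gaussian scale mixture and hence show visibly larger kurtosis.

```python
import numpy as np

def kurtosis(z):
    z = (z - z.mean()) / z.std()
    return (z**4).mean()  # equals 3 for a Gaussian

rng = np.random.default_rng(0)
M, n = 100_000, 50  # number of sampled networks, hidden width

# Layer-1 pre-activations (scalar input x = 1): exactly Gaussian.
z1 = np.sqrt(2.0) * rng.standard_normal((M, n))
# Layer-2 pre-activations: He-scaled Gaussian weights applied to ReLU
# units, i.e. a Gaussian scale mixture, hence heavier-tailed.
z2 = np.sqrt(2.0 / n) * (rng.standard_normal((M, n)) * np.maximum(z1, 0.0)).sum(axis=1)

k1, k2 = kurtosis(z1[:, 0]), kurtosis(z2)  # k2 exceeds k1, which is near 3
```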
Pierre E. Jacob (Harvard University) - Unbiased MCMC with couplings
MCMC methods yield estimators that converge to integrals of interest in the limit of the number of iterations. This iterative asymptotic justification is not ideal: first, it stands at odds with current trends in computing hardware, with increasingly parallel architectures; second, the choice of "burn-in" or "warm-up" is arduous. This talk will describe recently proposed estimators that are unbiased for the expectations of interest while having a finite computing cost and a finite variance. They can thus be generated independently in parallel and averaged over. The method also provides practical upper bounds on the distance (e.g. total variation) between the marginal distribution of the chain at a finite step and its invariant distribution. The key idea is to generate "faithful" couplings of Markov chains, whereby pairs of chains coalesce after a random number of iterations. This talk will provide an overview of this line of research.
Reference: https://arxiv.org/abs/1708.03625. Code in R available at: https://github.com/pierrejacob/unbiasedmcmc.
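The coupling ingredient can be sketched compactly. Below is the standard rejection-based maximal coupling of two distributions p and q, which returns a pair (X, Y) with the correct marginals and maximal meeting probability; the Gaussian example is illustrative, whereas in the talk's setting p and q would be transition kernels of the two chains.

```python
import numpy as np

def maximal_coupling(rng, p_sample, p_logpdf, q_sample, q_logpdf):
    """Rejection sampler for a maximal coupling of p and q: returns
    (X, Y) with X ~ p, Y ~ q and P(X = Y) = 1 - TV(p, q)."""
    x = p_sample(rng)
    if np.log(rng.random()) <= q_logpdf(x) - p_logpdf(x):
        return x, x  # the two chains would coalesce here
    while True:
        y = q_sample(rng)
        if np.log(rng.random()) > p_logpdf(y) - q_logpdf(y):
            return x, y

# Illustration with p = N(0, 1) and q = N(1, 1); the scales match, so
# normalising constants cancel in the log-density ratios.
rng = np.random.default_rng(0)
lp = lambda z: -0.5 * z**2
lq = lambda z: -0.5 * (z - 1.0) ** 2
draws = [maximal_coupling(rng, lambda r: r.standard_normal(), lp,
                          lambda r: 1.0 + r.standard_normal(), lq)
         for _ in range(5000)]
meet_rate = np.mean([x == y for x, y in draws])  # about 1 - TV, here 0.62
```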
Scott Sisson (UNSW) - Approximate posteriors and data for Bayesian inference
For various reasons, including large datasets and complex models, approximate inference is becoming increasingly common. In this talk I'll provide three vignettes of recent work. These cover
approximate Bayesian computation for Gaussian process density estimation
likelihood-free Gibbs sampling
MCMC for approximate (rounded) data.
François Portier (Télécom Paris) - On adaptive importance sampling: theory and methods
Adaptive importance sampling (AIS) uses past samples to update the sampling policy $q_t$ at each stage $t$. Each stage $t$ consists of two steps:
to explore the space with $n_t$ points drawn according to $q_t$;
to exploit the current amount of information to update the sampling policy.
In this talk, I will present different AIS methods and show that they are optimal in some sense.
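The two steps above can be sketched with a Gaussian sampling policy whose mean is re-estimated from self-normalised importance weights at each stage (an illustrative scheme under simplifying assumptions — identity covariance, fixed stage size — not one of the specific methods of the talk):

```python
import numpy as np

def adaptive_importance_sampling(log_target, mu0, n_stages=20, n_per_stage=500, seed=0):
    """AIS with policy q_t = N(mu_t, I): explore by sampling from q_t,
    then exploit the weighted sample to update mu_t."""
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu0, dtype=float)
    for _ in range(n_stages):
        x = mu + rng.standard_normal((n_per_stage, mu.size))        # explore
        log_w = log_target(x) + 0.5 * ((x - mu) ** 2).sum(axis=1)   # target / q_t
        w = np.exp(log_w - log_w.max())                             # stabilised weights
        mu = (w[:, None] * x).sum(axis=0) / w.sum()                 # exploit
    return mu

# Illustrative target N((3, -2), I): the policy mean tracks the target mean.
log_target = lambda x: -0.5 * ((x - np.array([3.0, -2.0])) ** 2).sum(axis=1)
mu_hat = adaptive_importance_sampling(log_target, np.zeros(2))
```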
Grégoire Clarté - Component-wise approximate Bayesian computation via Gibbs-like steps
Approximate Bayesian computation methods are useful for generative models with intractable likelihoods. These methods are however sensitive to the dimension of the parameter space, requiring exponentially increasing resources as this dimension grows. To tackle this difficulty, we explore a Gibbs version of the ABC approach that runs component-wise approximate Bayesian computation steps aimed at the corresponding conditional posterior distributions, and based on summary statistics of reduced dimensions. While lacking the standard justifications for the Gibbs sampler, the resulting Markov chain is shown to converge in distribution under some partial independence conditions. The associated stationary distribution can further be shown to be close to the true posterior distribution and some hierarchical versions of the proposed mechanism enjoy a closed form limiting distribution. Experiments also demonstrate the gain in efficiency brought by the Gibbs version over the standard solution.
Reference: https://arxiv.org/abs/1905.13599
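A toy version of the component-wise scheme: a two-parameter model in which each Gibbs-like step runs a small rejection-ABC update of one component given the other. This is illustrative only (the paper's setting involves summary statistics of reduced dimension); here the raw observation serves as the summary.

```python
import numpy as np

def abc_gibbs(y_obs, n_iter=2000, eps=0.2, seed=0):
    """Toy ABC within Gibbs for theta1, theta2 ~ N(0, 1) iid and
    y | theta ~ N(theta1 + theta2, 1). Each step refreshes one
    component by rejection ABC against the observed data."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    out = np.empty((n_iter, 2))
    for t in range(n_iter):
        for i in range(2):
            while True:
                prop = rng.standard_normal()  # draw from the prior
                y_sim = prop + theta[1 - i] + rng.standard_normal()
                if abs(y_sim - y_obs) < eps:  # ABC acceptance test
                    theta[i] = prop
                    break
        out[t] = theta
    return out

# With y_obs = 1, the exact posterior mean of each component is 1/3.
samples = abc_gibbs(y_obs=1.0)
```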