Many scientific problems require a complicated probability measure µ on some space X to be summarised by a mode, a single point of "maximum probability under µ": in Bayesian inference, these are maximum a posteriori estimators; in molecular dynamics, they are minimum action paths. Particularly when the space X is high- or infinite-dimensional or has other complex structure, such problems pose many interesting mathematical questions and computational challenges. This webpage supports a loose ongoing series of online seminar talks about the mathematical analysis and computation of modes.
Speaker: T. J. Sullivan (University of Warwick)
Title: An Order-Theoretic Perspective on Modes and Maximum-a-Posteriori Estimation
Date and Time: 6 November 2024 at 15:00 GMT on MS Teams [link]
Abstract: It is often desirable to summarise a probability measure on a space X in terms of a mode, or MAP estimator, i.e. a point of maximum probability. Such points can be rigorously defined using masses of metric balls in the small-radius limit. However, the theory is not entirely straightforward: the literature contains multiple notions of mode and various examples of pathological measures that have no mode in any sense. Since the masses of balls induce natural orderings on the points of X, this talk aims to shed light on some of the problems in non-parametric MAP estimation by taking an order-theoretic perspective, which appears to be a new one in the inverse problems community. This point of view opens up attractive proof strategies based upon the Cantor and Kuratowski intersection theorems; it also reveals that many of the pathologies arise from the distinction between greatest and maximal elements of an order, and from the existence of incomparable elements of X, which we show can be dense in X, even for an absolutely continuous measure on X = R.
Joint work with Hefin Lambley.
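For orientation, the small-ball definition alluded to in the abstract can be sketched as follows (one common formulation, in the spirit of Dashti et al., 2013, cited at the bottom of this page): a point x^\ast \in X is a strong mode of \mu if
\[ \lim_{r \to 0} \frac{\mu(B_r(x^\ast))}{\sup_{x \in X} \mu(B_r(x))} = 1, \]
where B_r(x) denotes the open metric ball of radius r centred at x. Weaker variants, discussed in other talks on this page, relax this ratio condition in various ways.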
Speaker: Zachary Selk (Queen's University)
Title: The Small Noise Limit of the Most Likely Element is the Most Likely Element in the Small Noise Limit
Date and Time: 10 September 2024 at 15:00 BST on MS Teams [link]
Abstract: In this talk, we discuss the Onsager–Machlup function and its relationship to the Freidlin–Wentzell function for measures equivalent to arbitrary infinite-dimensional Gaussian measures. The Onsager–Machlup function can serve as a density on infinite-dimensional spaces, where a uniform measure does not exist, and has been seen as the Lagrangian for the "most likely element". The Freidlin–Wentzell rate function is the large deviations rate function for small-noise limits and has also been identified as a Lagrangian for the "most likely element". This leads to a conundrum: what is the relationship between these two functions?
We show both pointwise and Γ-convergence (which is essentially the convergence of minimizers) of the Onsager–Machlup function in the small-noise limit to the Freidlin–Wentzell function, and we give an expression for both. That is, we show that the small-noise limit of the most likely element is the most likely element in the small-noise limit for infinite-dimensional measures that are equivalent to a Gaussian. Examples of such measures include the law of solutions to path-dependent stochastic differential equations and the law of an infinite system of random algebraic equations.
Joint work with Harsha Honnappa.
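As a point of reference (a sketch, not part of the abstract): for a measure \mu equivalent to a Gaussian measure \mu_0, say d\mu/d\mu_0 \propto \exp(-\Phi), and under suitable regularity assumptions on \Phi, the Onsager–Machlup functional takes the form
\[ I(x) = \Phi(x) + \tfrac{1}{2} \| x \|_{H}^{2}, \qquad x \in H, \]
where H is the Cameron–Martin space of \mu_0 and I is interpreted as +\infty outside H. The talk relates such functionals to the Freidlin–Wentzell rate function in the small-noise limit.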
Speaker: Yury Korolev (University of Bath)
Title: Recent developments in convex variational regularisation
Date and Time: 4 June 2024 at 15:00 BST on MS Teams
Slides: [Link]
Abstract: This seminar series is about connections between modes of a distribution and variational regularisation problems, and I thought it might be useful to give a brief overview of some of the developments in (convex) variational regularisation over the last couple of decades. These include variational source conditions, the use of generalised Bregman distances for analysing convergence of minimisers, sparsity regularisation, and infimal convolution regularisation. I am not aware of probabilistic interpretations for at least some of these concepts, and perhaps this talk will stimulate a discussion about them.
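For reference, the generalised Bregman distance mentioned above is defined, for a convex functional J and a subgradient p \in \partial J(v), by
\[ D_J^{p}(u, v) = J(u) - J(v) - \langle p, u - v \rangle, \]
which is non-negative by convexity but in general neither symmetric nor a metric, and which is widely used as an error measure when analysing the convergence of minimisers of J-regularised problems.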
Speaker: Ilja Klebanov (Freie Universität Berlin)
Title: On definitions of modes and MAP estimators
Date and Time: 2 April 2024 at 15:00 BST on MS Teams
Slides: [Link]
Abstract: While modes of a probability measure with a continuous Lebesgue density (and thereby maximum a posteriori estimators in the context of Bayesian posteriors) are easy to define, their definition in arbitrary metric spaces, in particular infinite-dimensional Banach and Hilbert spaces, is far from unambiguous. Several definitions based on "small ball probabilities" have been suggested in recent years (strong, weak, and generalized modes) and many other meaningful alternatives are possible. In fact, even for (discontinuous) Lebesgue densities in one dimension, many connections between these notions are open problems, and there are a lot of interesting questions, as well as some answers, which I will present using several insightful examples.
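To complement the strong-mode sketch given earlier on this page, one formulation of a weaker notion (a rough sketch in the spirit of the weak MAP estimators of Helin and Burger, rather than the talk's exact definitions) calls x^\ast \in X a weak mode of \mu if
\[ \limsup_{r \to 0} \frac{\mu(B_r(x))}{\mu(B_r(x^\ast))} \le 1 \quad \text{for all } x \in X. \]
Every strong mode is a weak mode in this sense, but the converse can fail, which is one source of the open problems and counterexamples mentioned in the abstract.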
Speaker: Zachary Selk (Queen's University)
Title: Information Projections on Gaussian Banach Spaces — How I discovered infinite dimensional modes
Date and Time: 6 February 2024 at 15:00 GMT on MS Teams
Abstract: My introduction to modes came from an information projection problem. Given a separable Banach space B, a centered Gaussian measure \mu_0 on B, and another measure \mu^\ast equivalent to \mu_0, I was interested in approximating \mu^\ast by a measure \mu^z, where \mu^z is \mu_0 shifted to have mean z. The approximation is in the sense of KL divergence, or relative entropy. The KL divergence is a divergence and not a metric, so it poses an interesting optimization problem. One crucial difference between the KL divergence and a metric is its lack of symmetry: D_{KL}(\mu_1||\mu_2) is not in general equal to D_{KL}(\mu_2||\mu_1). Minimizing in the first argument is called an information projection, while minimizing in the second is called a moment projection. Moment projections are typically said to be "moment seeking" and information projections are said to be "mode seeking", and this is the case here.
We show that this information projection problem is equivalent to an "open loop" or state-independent control problem, which is in turn equivalent to finding the mode of a related measure \tilde \mu. There are several open questions, such as: 1. What is the relationship between the original \mu^\ast and \tilde \mu? 2. How do you numerically find modes on path spaces? 3. The mean-shift measures serve as the "extreme points" of the measures equivalent to \mu_0: can we prove an approximation result in the spirit of Krein–Milman type results? This paper led to a few other papers and an appreciation for modes.
Joint work with William Haskell and Harsha Honnappa. https://link.springer.com/article/10.1007/s00245-021-09786-4
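As background for the mean-shift measures \mu^z (a sketch of standard Gaussian measure theory, not a statement from the paper): if z lies in the Cameron–Martin space H of \mu_0, then the Cameron–Martin theorem gives
\[ \frac{d\mu^z}{d\mu_0}(x) = \exp\Big( \langle z, x \rangle_{H} - \tfrac{1}{2} \| z \|_{H}^{2} \Big), \]
with the pairing interpreted as a Paley–Wiener integral, and consequently D_{KL}(\mu^z || \mu_0) = \tfrac{1}{2} \| z \|_{H}^{2}. This quadratic Cameron–Martin penalty is one way to see why the information projection over z leads to a Tikhonov-type variational problem.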
Speaker: Hefin Lambley (University of Warwick)
Title: "Strong maximum a posteriori estimation in Bayesian inverse problems"
Date and Time: 5 December 2023 at 15:00 GMT on MS Teams
Slides: [Link]
Abstract: In the Bayesian approach to inverse problems (Stuart, 2010), the task of inferring the unknown quantity of interest (e.g. the initial condition to a PDE) from observations is formulated directly in function space. Designing methods that are well-posed at the continuum level leads to algorithms which do not degrade as the resolution is refined.
In contrast to classical approaches such as Tikhonov regularisation, the solution under the Bayesian approach is the posterior distribution on function space. The two approaches are connected by the fact that maximum a posteriori (MAP) estimators of the Bayesian inverse problem correspond to minimisers of a Tikhonov functional, but making this connection rigorous is challenging as even the definition of a MAP estimator is unclear in this setting.
The first work in this area by Dashti et al. (2013) defined a notion of MAP estimator in the continuum setting and proved that, when the prior is Gaussian and the parameter space is Hilbert, MAP estimators coincide with minimisers of a Tikhonov functional. Since then, much work has been done to extend this result to more general forward operators and parameter spaces.
In this talk, I will give an introduction to the theory of nonparametric MAP estimation, motivated by the application to Bayesian inverse problems. I will also discuss my recent results, published in Inverse Problems (Lambley, 2023), which prove the connection between MAP estimators and classical solutions for a large class of Bayesian inverse problems with Gaussian priors.
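Concretely, in the Gaussian setting of Dashti et al. (2013), cited below: if the posterior \mu^y has density d\mu^y/d\mu_0(u) \propto \exp(-\Phi(u; y)) with respect to the Gaussian prior \mu_0, then, under suitable conditions on the potential \Phi, MAP estimators of \mu^y are precisely the minimisers of the Onsager–Machlup (Tikhonov-type) functional
\[ I(u) = \Phi(u; y) + \tfrac{1}{2} \| u \|_{E}^{2}, \]
where E is the Cameron–Martin space of \mu_0; the precise hypotheses can be found in the references below.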
A. M. Stuart (2010). Inverse problems: a Bayesian perspective. Acta Numer. 19:451–559. doi:10.1017/S0962492910000061
M. Dashti, K. J. H. Law, A. M. Stuart, J. Voss (2013). MAP estimators and their consistency in Bayesian nonparametric inverse problems. Inverse Probl. 29(9):095017. doi:10.1088/0266-5611/29/9/095017
H. Lambley (2023). Strong maximum a posteriori estimation in Banach spaces with Gaussian priors. Inverse Probl. 39(12):125010. doi:10.1088/1361-6420/ad07a4