Day 1: Aug 7, 2024
2:30 - 3:15 pm
Title: Recent Results on Mean Field Games, Optimal Transport, and In-Context Learning
Abstract: We have recently been developing algorithms related to mean field games, optimal transport, in-context learning, score-based generative models, and links between Laplace's method, the Moreau envelope, and Hamilton-Jacobi equations. I will try to give a coherent talk including many of these.
3:15 - 4:00 pm
Title: Computational mean-field games: from conventional methods to deep generative models
Abstract: Mean-field games study the behavior of a large number of rational agents in a non-cooperative game. They have wide applications in various fields, but they are not easy to solve numerically because of their complicated structure.
In the first part of my talk, I will present an efficient and flexible algorithm for dynamic mean-field games. The algorithm is based on an accelerated proximal gradient method. It consists of an easy-to-implement gradient descent step and a projection step equivalent to solving an elliptic equation. We also extend the setting of mean-field games and the algorithm to manifolds. In the second part of my talk, I will bridge mean-field games with a class of deep generative models called normalizing flows. The connection gives a computational approach for high-dimensional mean-field games and improves the training of the generative model.
The first part is based on joint work with Rongjie Lai (Purdue), Wuchen Li (UofSC) and Stanley Osher (UCLA). The second part is based on joint work with Han Huang (RPI), Rongjie Lai (Purdue) and Jie Chen (IBM).
4:00 - 4:30 pm Coffee Break
4:30 - 5:15 pm
Title: Drift Optimization and Denoising Score Matching for Score-based Diffusion Models
Abstract: Machine learning generative models are designed to learn data-generating distributions by transforming noise distributions. Score-based diffusion models, a significant approach in this domain, utilize the time-reversal of Markov diffusion processes to model this transformation. Anderson (1982) established a seminal result demonstrating the existence of such time-reversals, relying on the score function (the gradient of the log-density) derived from the forward Markov diffusion process. Recent research has also explored Skorokhod reflected Markov diffusions and their time-reversals (studied by Williams (1989)), particularly for generating distributions that are compactly supported. The central objective in training score-based diffusion models, whether employing reflected diffusions or other variants, remains the accurate estimation of the score function. In this presentation, I will draw analogies between the denoising score-matching (DSM) training objectives for score-based diffusion models and the 'drift rate control' problems encountered in stochastic or queuing networks. Furthermore, I will demonstrate that drift rate control can be viewed as a specific instance within a broader class of 'drift optimization' problems, offering rigorous statistical and optimization guarantees. These findings will establish the theoretical foundations for assessing the performance of score-based diffusion models.
Day 2: Aug 8, 2024
9:00 - 9:45 am
Title: How to make rate estimates more accurate?
Abstract: Many processes in nature, such as conformational changes in biomolecules and clusters of interacting particles, genetic switches, and transitions in mechanical or electromechanical oscillators with added noise, are modeled using stochastic differential equations with small noise. The study of rare transitions between metastable states in such systems is of great interest and importance. The direct simulation of rare transitions is difficult due to long waiting times and high dimensionality. In this talk, I will discuss how one can learn coarse-grained models and use optimal stochastic control to sample transition trajectories and obtain estimates for transition rates between the designated metastable states.
9:45 - 10:30 am
Abstract: Nonparametric density models are of great interest in various scientific and engineering disciplines. Classical density kernel methods, while numerically robust and statistically sound in low-dimensional settings, become inadequate even in moderately high-dimensional settings due to the curse of dimensionality. In this talk, we will introduce a new framework called Variance-Reduced Sketching (VRS), specifically designed to estimate multivariable density functions with a reduced curse of dimensionality. Our framework conceptualizes multivariable functions as infinite-size matrices, and facilitates a new sketching technique motivated by the numerical linear algebra literature to reduce the variance in density estimation problems. We demonstrate the robust numerical performance of VRS through a series of simulated experiments and real-world data applications. Notably, VRS shows remarkable improvement over existing neural network estimators and classical kernel methods in numerous density models. Additionally, we offer theoretical justifications for VRS to support its ability to deliver nonparametric density estimation with a reduced curse of dimensionality.
10:30 - 11:00 am Coffee Break
11:00 - 11:45 am
Title: Learning-enhanced structure preserving particle methods for nonlinear PDEs
Abstract: In the current stage of numerical methods for PDEs, the primary challenge lies in addressing the complexities of high dimensionality while maintaining physical fidelity in our solvers. In this presentation, I will introduce deep learning assisted particle methods aimed at mitigating some of these challenges. Two scenarios will be considered: one for general nonlinear Wasserstein-type gradient flows, and the other for the Landau equation in plasma physics.
11:45 - 12:30 pm
Title: Concentration and limit of large random matrices with given margins
Abstract: We study large random matrices with i.i.d. entries conditioned to have prescribed row and column sums (margins). This problem has rich connections to relative entropy minimization, Schrödinger bridge, the enumeration of contingency tables, and random graphs with given degree sequences. We show that such a margin-constrained random matrix is sharply concentrated around a certain deterministic matrix, which we call the "typical table". Typical tables have dual characterizations: (1) the expectation of the random matrix ensemble with minimum relative entropy from the base model constrained to have the expected target margin, and (2) the expectation of the maximum likelihood model obtained by rank-one exponential tilting of the base model. The structure of the typical table is dictated by two dual variables, which give the maximum likelihood estimates of the tilting parameters. Based on these results, for a sequence of "tame" margins that converges in L^1 to a limiting continuum margin as the size of the matrix diverges, we show that the sequence of margin-constrained random matrices converges in cut norm to a limiting kernel, which is the L^2-limit of the corresponding rescaled typical tables. The rate of convergence is controlled by how fast the margins converge in L^1. We derive several new results for random contingency tables from our general framework. Based on joint work with Sumit Mukherjee.
12:30 - 2:00 pm Lunch Break
2:00 - 2:45 pm
Title: Parameterized Wasserstein Geometric Flow
Abstract: We introduce a new parameterization strategy that can be used to design algorithms simulating geometric flows on the Wasserstein manifold, the probability density space equipped with the optimal transport metric. The framework leverages the theory of optimal transport and techniques such as push-forward operators and neural networks, leading to a system of ODEs for the parameters of neural networks. The resulting methods are mesh-less, basis-less, sample-based schemes that scale well to higher dimensional problems. The strategy works for Wasserstein gradient flows such as the Fokker-Planck equation, and Wasserstein Hamiltonian flows such as the Schrödinger equation.
Theoretical error bounds measured in the Wasserstein metric are established. This presentation is based on joint work with Yijie Jin (Math, GT), Wuchen Li (South Carolina), Shu Liu (UCLA), Has Wu (Wells Fargo), Xiaojing Ye (Georgia State), and Hongyuan Zha (CUHK-SZ).
2:45 - 3:30 pm
Title: A tractable algorithm, based on optimal transport, for computing adversarial training lower bounds
Abstract: Despite the success of deep learning-based algorithms, it is widely known that neural networks may fail to be robust to adversarial perturbations of data. In response to this, a popular paradigm that has been developed to enforce robustness of learning models is adversarial training (AT), but this paradigm introduces many computational and theoretical difficulties. Recent works have developed a connection between AT in the multiclass classification setting and multimarginal optimal transport (MOT), unlocking a new set of tools to study this problem. In this talk, I will leverage the MOT connection to discuss new computationally tractable numerical algorithms for computing universal lower bounds on the optimal adversarial risk. The key insight in the AT setting is that one can harmlessly truncate high order interactions between classes, preventing the combinatorial run times typically encountered in MOT problems. I’ll present a rigorous complexity analysis of the proposed algorithm and validate our theoretical results experimentally on the MNIST and CIFAR-10 datasets, demonstrating the tractability of our approach. This is joint work with Matt Jacobs (UCSB), Jakwang Kim (UBC), and Matt Werenski (Tufts).
3:30 - 4:00 pm Coffee Break and group photo
4:00 - 4:45 pm
Title: Diffusion Models: Theory and Applications (in PDEs)
Abstract: Diffusion models, particularly score-based generative models (SGMs), have emerged as powerful tools in diverse machine learning applications, spanning from computer vision to modern language processing. In the first part of this talk, we delve into the generalization theory of SGMs, exploring their capacity for learning high-dimensional distributions. Our analysis shows that SGMs achieve a dimension-free generation error bound when applied to a class of sub-Gaussian distributions characterized by certain low-complexity structures. In the second part of the talk, we consider the application of diffusion models in solving partial differential equations (PDEs). Specifically, we present the development of a physics-guided diffusion model designed for reconstructing high-fidelity solutions from their low-fidelity counterparts. This application showcases the adaptability of diffusion models and their potential for scientific computation.
4:45 - 5:30 pm
Speaker 1: Dixi Wang, Title: Learning operators for identifying weak solutions to the Navier-Stokes equations
Abstract: We investigate learning operators for identifying weak solutions to the Navier-Stokes equations. Our objective is to establish a connection between the initial data as input and the weak solution as output. To achieve this, we employ a combination of deep learning methods and a compactness argument to derive learning operators for weak solutions for any large initial data in 2D, and for low-dimensional initial data in 3D. Additionally, we utilize the universal approximation theorem to derive a lower bound on the number of sensors required to achieve accurate identification of weak solutions to the Navier-Stokes equations. Our results demonstrate the potential of using deep learning techniques to address challenges in the study of fluid mechanics, particularly in identifying weak solutions to the Navier-Stokes equations.
Speaker 2: Frank Cole, Title: On the generalization of diffusion models in high dimensions
Abstract: Diffusion models have achieved remarkable success in generating high-quality samples from complex data distributions. From a mathematical perspective, it has been unclear how diffusion models are able to approximate high-dimensional probability distributions to such accuracy. In this work, we identify a notion of 'low-complexity' for probability distributions and show that diffusion models can learn low-complexity distributions without the curse of dimensionality. Our notion of low-complexity captures several natural families of probability distributions, including Gaussian mixtures and even certain nonparametric distributions.
Speaker 3: Soham Sarkar, Title: Stability in a short pulse laser
Speaker 4: Yuxi Han, Title: Homogenization of state-constraint problems on perforated domains
Abstract: We present the rate of convergence for the periodic homogenization problem of state-constraint Hamilton–Jacobi equations on perforated domains in the convex setting. Specifically, we examine the problem on domains with "holes" and consider three different scenarios. Additionally, the convergence rates are essentially optimal.
Day 3: Aug 9, 2024
9:00 - 9:45 am
Title: Stochastic transport with spatio-temporal marginals
Abstract: In a visionary 1931 contribution, Erwin Schrödinger, the father of Quantum Mechanics, laid out the foundations of large deviations theory and of likelihood estimation in his quest to understand how randomness creeps into the description of the quantum world. In recent years, almost a century later, Schrödinger’s paradigm has served as the blueprint of novel stochastic control methods to regulate uncertainty by enforcing soft-probabilistic constraints on stochastic dynamics, and furthermore, the serendipitous confluence of stochastic control with the theory of Monge-Kantorovich optimal mass transport has renewed interest and provided new impetus to Schrödinger’s original program [1,2].
The talk will provide a bird’s eye overview of recent developments and then focus on theoretical and computational advances on a novel type of control and estimation problems [3-5], in the same vein as that of Schrödinger bridges, aka regularized Monge-Kantorovich transport, where control design allows regulation of spatiotemporal marginals for the given stochastic dynamics. The new formalism addresses practical stochastic control problems where the duration of an experiment is itself random. Such problems are typified by the landing of a module about a specified target, following a spatial distribution that depends on the time of landing. Examples of practical interest also include inverse problems to identify underlying stochastic dynamics for diffusive particles from observed absorption or deposition rates.
The talk is based on joint work with Asmaa Eldesoukey and Olga Movilla Miangolarra.
[1] Tryphon T. Georgiou and Michele Pavon. Positive contraction mappings for classical and quantum Schrödinger systems. Journal of Mathematical Physics 56(3), 2015.
[2] Yongxin Chen, Tryphon T. Georgiou, and Michele Pavon. On the relation between optimal transport and Schrödinger bridges: A stochastic control viewpoint. Journal of Optimization Theory and Applications 169 (2016): 671-691.
[3] Asmaa Eldesoukey, Olga M. Miangolarra, and Tryphon T. Georgiou. An Excursion onto Schrödinger’s Bridges: Stochastic Flows With Spatio-Temporal Marginals. IEEE Control Systems Letters 8 (2024): 1138-1143. doi: 10.1109/LCSYS.2024.3409107.
[4] Asmaa Eldesoukey and Tryphon T. Georgiou. Schrödinger’s control and estimation paradigm with spatio-temporal distributions on graphs. arXiv preprint arXiv:2312.05679 (2023). IEEE Trans. on Aut. Control (accepted, to appear).
[5] Olga M. Miangolarra, Asmaa Eldesoukey, and Tryphon T. Georgiou. Inferring potential landscapes: A Schrödinger bridge approach to Maximum Caliber. arXiv:2403.01357 (2024). Physical Review Research (accepted, to appear).
9:45 - 10:30 am
Title: Macroscopic Dynamics for Nonequilibrium Biochemical Reactions from a Hamiltonian Perspective
Abstract: Most biochemical reactions in living cells are not closed systems; they interact with their surroundings by exchanging energy and materials. At a mesoscopic scale, the quantity of each chemical can be modeled by random time-changed Poisson processes. Understanding macroscopic behaviors is facilitated by a nonlinear reaction rate equation that describes species concentrations. In the thermodynamic limit, the large deviation rate function from the chemical master equation is governed by a Hamilton–Jacobi equation. We decompose the general macroscopic reaction rate equation into an Onsager-type strong gradient flow, supplemented by conservative dynamics. We will also present findings on the large deviation principle, diffusion approximation on the simplex, and the importance sampling of transition paths that connect metastable states in chemical reactions.
10:30 - 11:00 am Coffee Break
11:00 - 11:45 am
Abstract: We propose an actor-critic framework to solve the continuous-time stochastic optimal control problem. A least-squares temporal difference method is applied to compute the value function for the critic. The policy gradient method is implemented as policy improvement for the actor. Our key contribution lies in establishing the global convergence property of our proposed actor-critic flow, demonstrating a linear rate of convergence. Theoretical findings are further validated through numerical examples, showing the efficacy of our approach in practical applications.
11:45 - 12:30 pm
Title: High-order spatial discretization for variational time-implicit schemes: Wasserstein gradient flows and reaction-diffusion systems
Abstract: We develop first-order implicit-in-time variational schemes with high-order spatial discretization for gradient flows in generalized optimal transport metric spaces. Our time discretization adapts the Jordan-Kinderlehrer-Otto (JKO) scheme, which provides energy stability and first-order accuracy. Each time step involves solving an optimization problem to approximate the continuous-time gradient flow. We reformulate this problem using the dynamic (Benamou-Brenier) formulation into a saddle-point problem and apply high-order finite element methods for spatial discretization. To solve the resulting discrete saddle-point problems, we employ scalable first-order optimization solvers like the Augmented Lagrangian Method (ALG). The fully discrete scheme is unconditionally energy-stable and preserves bounds of the physical variable. Numerical examples, including Wasserstein gradient flows and various reaction-diffusion systems, demonstrate the effectiveness of our approach. This work is a collaboration with Stanley Osher (UCLA) and Wuchen Li (U. South Carolina).
12:30 pm Lunch and Departure