September 5th
Title: Expectation Maximization using Generative Plug and Play
Abstract
Plug-and-Play (PnP) methods for inverse problems combine physics-based data-fitting with general image priors, including neural network denoisers. PnP has produced high-quality results in many applications but is limited in that it reconstructs only a single deterministic image and cannot directly handle blind estimation problems, in which the forward model is not completely known. In this talk, we describe Generative PnP (GPnP) and Generative PnP Expectation-Maximization (GPnP-EM), which can be used to sample from the posterior and simultaneously estimate an image along with unknown system parameters. We explain the foundation and implementation of these algorithms and show very good results on blind image deblurring and detector bias estimation for CT reconstruction.
September 12th
Title: Matrix analysis for shallow ReLU neural network least-squares approximations
Abstract
Neural networks provide an effective tool for approximating some challenging functions. However, fast and accurate solvers for the relevant dense linear systems are rarely studied. This work gives a comprehensive characterization of the ill-conditioning of some dense linear systems arising from shallow neural network least-squares approximations. It shows that the systems are typically very ill-conditioned, and that the conditioning gets even worse for challenging functions such as those with jumps. This makes the systems hard to solve with typical iterative solvers. On the other hand, we further show the existence of some intrinsic rank structures within those matrices, which make it feasible to obtain robust direct solutions with nearly linear complexity. Most of our discussion focuses on the 1D case, but extensions to some 2D cases are also given.
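As a rough illustration of the ill-conditioning described in this abstract, the sketch below builds the dense Gram (normal-equations) matrix for a shallow ReLU basis on [0, 1] and prints its condition number as the number of neurons grows. The uniformly spaced breakpoints, the sample count, and the function names are assumptions made for illustration; they are not taken from the talk.

```python
import numpy as np

def relu_design_matrix(x, breakpoints):
    # Column j holds ReLU(x - b_j) evaluated at every sample point.
    return np.maximum(x[:, None] - breakpoints[None, :], 0.0)

def gram_condition_number(n_neurons, n_samples=2000):
    # Assumed setup: samples on a uniform grid, breakpoints uniform in [0, 1).
    x = np.linspace(0.0, 1.0, n_samples)
    breakpoints = np.arange(n_neurons) / n_neurons
    A = relu_design_matrix(x, breakpoints)
    G = A.T @ A / n_samples  # dense Gram matrix of the least-squares problem
    return np.linalg.cond(G)

conds = {n: gram_condition_number(n) for n in (8, 16, 32, 64)}
for n, c in conds.items():
    print(f"n = {n:3d}  cond(G) = {c:.3e}")
```

In this toy setup the condition number increases quickly with the number of neurons, which is the kind of behavior that slows down typical iterative solvers.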
September 19th
Title: Continuous Time Reinforcement Learning in the Rough Setting
Abstract
Reinforcement learning (RL) is one of the three main paradigms of machine learning. Traditionally, it has been studied in discrete time and space via Markov decision processes. In 2020, Wang, Zariphopoulou, and Zhou [WZZ] formulated a continuous version of RL using the machinery of stochastic control theory and proved results under this formulation. With non-Markovian dynamics in mind, Chakraborty, Honnappa, and Tindel [CHT] recast this formulation and introduced rough paths as the "random" driver, instead of Brownian motion. In this talk, we will review the [WZZ] construction and interpretation of continuous-time RL. Moreover, we will mention the results of [CHT] and present our new results in this direction. This is based on joint work with Prakash Chakraborty, Harsha Honnappa, and Samy Tindel.
September 26th
Title: In-Context Operator Learning on the Space of Probability Measures
Abstract
We introduce in-context operator learning on probability measure spaces for optimal transport (OT). The goal is to learn a single solution operator that maps a pair of distributions to the OT map, using only few-shot samples from each distribution as a prompt and without gradient updates at inference. We establish generalization bounds that quantify how in-context accuracy scales with prompt size, intrinsic task dimension, and model capacity. Our numerical experiments on synthetic transports and generative-modeling benchmarks validate the framework.
October 3rd
Title: Provable Nonlinear Regression In-Context
Abstract
Trained transformer models exhibit a powerful phenomenon known as in-context learning (ICL): at inference time, they can learn from examples provided as part of the prompt without any parameter updates. We study ICL in a nonlinear regression setting and, under certain assumptions, derive end-to-end generalization error bounds for regressing a function given by input-output pairs in the prompt. Along the way, we explicitly construct a transformer that performs polynomial regression in-context. We also present numerical results verifying the predicted scaling laws, as well as preliminary investigations towards interpretability of the learned mechanisms.
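To make the target computation concrete, here is a minimal sketch of polynomial regression on prompt pairs followed by evaluation at a query point. It uses an ordinary least-squares fit in NumPy and is not the transformer construction from the talk; the function name, the cubic target, and the prompt size are hypothetical choices for illustration.

```python
import numpy as np

def in_context_poly_predict(prompt_x, prompt_y, query_x, degree):
    # Least-squares polynomial fit to the prompt pairs, evaluated at the query.
    V = np.vander(prompt_x, degree + 1, increasing=True)
    coeffs, *_ = np.linalg.lstsq(V, prompt_y, rcond=None)
    Vq = np.vander(np.atleast_1d(query_x), degree + 1, increasing=True)
    return Vq @ coeffs

def target(x):
    # Hypothetical task: a cubic polynomial observed without noise.
    return 1.0 - 2.0 * x + 0.5 * x**3

rng = np.random.default_rng(0)
prompt_x = rng.uniform(-1.0, 1.0, size=20)  # inputs given in the prompt
prompt_y = target(prompt_x)                 # matching outputs in the prompt
prediction = in_context_poly_predict(prompt_x, prompt_y, 0.3, degree=3)
print(prediction, target(0.3))
```

No parameters are updated here: the prediction depends only on the prompt pairs and the query, which is the behavior an in-context learner is meant to reproduce.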
October 17th
Title: Boiling flow parameter estimation from boundary layer data
Abstract
Atmospheric turbulence and aero-optic effects cause phase aberrations in propagating light waves, thereby reducing effectiveness in transmitting and receiving coherent light from an aircraft. Existing optical sensors can measure the resulting phase aberrations, but the physical experiments required to induce these aberrations are expensive and time-intensive. Simulation methods could provide a less expensive alternative. For example, an existing simulation algorithm called boiling flow, which generalizes the Taylor frozen-flow method, can generate synthetic phase aberration data (i.e., phase screens) induced by atmospheric turbulence. However, boiling flow depends on physical parameters, such as the Fried coherence length $r_0$, which are not well-defined for aero-optic effects. In this paper, we introduce a method to estimate the parameters of boiling flow from measured aero-optic phase aberration data. Our algorithm estimates these parameters to fit the spatial and temporal statistics of the measured data. This method is computationally efficient, and our experiments show that the temporal power spectral density of the slopes of the synthetic phase screens reasonably matches that of the measured phase aberrations from two turbulent boundary layer data sets, with errors between 8% and 9%. However, the Kolmogorov spatial structure function of the phase screens does not match that of the measured phase aberrations, with errors above 28%. This suggests that, while the parameters of boiling flow can reasonably fit the temporal statistics of highly convective data, they cannot fit the complex spatial statistics of aero-optic phase aberrations.
October 24th
Title: Stable and superfast divide-and-conquer singular value decomposition for rank-structured matrices
Abstract
A superfast divide-and-conquer algorithm for the singular value decomposition (SVD) of hierarchical semiseparable (HSS) matrices is introduced. Unlike approaches based on symmetrization, the method works directly with nonsymmetric or rectangular matrices by reducing them to a hierarchical block broken arrow form via stable QR factorizations. This form is further decomposed employing recursive rank-1 SVD updates in the conquering stage. Several stability-preserving mechanisms are incorporated, including deflation, splitting and local shifting, and orthogonality-preserving perturbations, to ensure the robustness of this stage. Meanwhile, the efficiency is improved using fast kernel methods such as the fast multipole method (FMM). A rigorous backward error analysis establishes the numerical stability of the overall process. Numerical experiments demonstrate significant advantages in both accuracy and efficiency.
October 31st
Title: Efficient Neural Network Methods for Numerical PDEs: Singularly Perturbed Problems
Abstract
Approximating the solutions to partial differential equations (PDEs) by continuous piecewise linear functions is the core idea of Finite Element Methods. In one dimension, the interval [0,1] is commonly partitioned uniformly, and an approximation is obtained by using a continuous piecewise linear function with respect to this partition. If more accuracy is needed, then more mesh points are added. Shallow ReLU neural networks, as a class of approximating functions, generate functions that are continuous and piecewise linear, but now the mesh points are parameters to be determined. In other words, when optimizing some parameters in a neural network, we are moving the mesh. This feature is the one we are interested in exploiting, since it can be helpful for problems where solutions are discontinuous and the interface is unknown. However, this process of moving the mesh is a computationally intensive task that brings several difficulties. I will talk about these difficulties and how to overcome them.
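The moving-mesh view in this abstract can be sketched in a few lines. A one-dimensional shallow ReLU network c + sum_i a_i ReLU(w_i x + b_i) is continuous and piecewise linear, with a kink at x = -b_i / w_i for each neuron, so changing a bias moves a mesh point. The specific weights, biases, and function names below are hypothetical values chosen for illustration.

```python
import numpy as np

def shallow_relu(x, w, b, a, c=0.0):
    # u(x) = c + sum_i a_i * ReLU(w_i * x + b_i), evaluated at each point of x.
    return c + np.maximum(np.outer(x, w) + b, 0.0) @ a

def breakpoints(w, b):
    # Each neuron contributes one kink (mesh point) at x = -b_i / w_i.
    return np.sort(-b / w)

w = np.array([1.0, 1.0, 1.0])
b = np.array([-0.2, -0.5, -0.9])
a = np.array([1.0, -2.0, 3.0])
mesh = breakpoints(w, b)

# Between consecutive mesh points the network is linear, so the value at
# a cell midpoint equals the average of the two endpoint values.
nodes = np.concatenate(([0.0], mesh, [1.0]))
max_dev = 0.0
for left, right in zip(nodes[:-1], nodes[1:]):
    pts = np.array([left, 0.5 * (left + right), right])
    u_left, u_mid, u_right = shallow_relu(pts, w, b, a)
    max_dev = max(max_dev, abs(u_mid - 0.5 * (u_left + u_right)))

# Changing one bias moves the corresponding mesh point.
b_moved = b.copy()
b_moved[1] = -0.6
moved_mesh = breakpoints(w, b_moved)
print(mesh, moved_mesh, max_dev)
```

Optimizing the biases and inner weights together with the outer coefficients is therefore a form of mesh adaptation, and it is this nonlinear part of the optimization that makes the problem computationally demanding.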
November 7th
Title: Localizing Cavities in Biharmonic Scattering with Limited Data
Abstract
In this talk, we discuss an extension of the relatively new Extended Sampling Method (ESM) to inverse shape problems in biharmonic wave scattering with limited-aperture far-field data. This computationally efficient method sets up ill-posed integral equations and uses approximate (regularized) solutions to reconstruct the location of the target. Here, the kernels of the associated integral operators are the far-field data of sound-soft disks. The measured data of the biharmonic wave propagation model is moved to the right-hand sides of the equations, which gives the method the ability to process limited-aperture data. Numerical experiments show that the method directly identifies the location of the target with minimal prior information and limited data. This is based on joint work with Peijun Li and Isaac Harris.
November 14th
Title: TBA
Abstract
TBA
November 21st
Title: TBA
Abstract
TBA