Day 1: September 27, 2025
8:50 - 9:00 am
Opening remarks
9:00 - 9:30 am
Title: Data-Efficient Kernel Methods for Learning Differential Equations and Their Solution Operators: Algorithms and Error Analysis
Abstract: We introduce a novel kernel-based framework for learning differential equations and their solution maps, which is efficient in terms of data requirements (both the number of solution examples and the amount of measurements from each example), as well as computational cost and training procedures. Our approach is mathematically interpretable and supported by rigorous theoretical guarantees in the form of quantitative worst-case error bounds for the learned equations and solution operators. Numerical benchmarks demonstrate significant improvements in computational complexity and robustness, achieving one to two orders of magnitude improvement in accuracy compared to state-of-the-art algorithms. This presentation is based on joint work with Yasamin Jalalian, Juan Felipe Osorio Ramirez, Alexander Hsu, and Bamdad Hosseini. A preprint is available at: https://arxiv.org/abs/2503.01036
9:30 - 10:00 am
Abstract: Mean-Field Games (MFGs) study the Nash equilibrium of non-cooperative games involving a continuum of players. They have broad applications and deep connections to areas such as reinforcement learning, sampling, optimal transport, and flow-based generative models. In this talk, I will present our recent work on both forward and inverse problems in MFGs.
I will begin by presenting a convergence analysis of a learning algorithm for MFGs. Our results highlight the central role of the best response in understanding both the game dynamics and the algorithm behavior. Then, I will introduce a simple and efficient iterative strategy for solving a class of inverse MFG problems. This approach shows that measurements of the Nash equilibrium state can be remarkably effective in inferring unknown ambient potentials, such as obstacles.
This talk is based on joint work with Jiajia Yu, Xiuyuan Cheng, and Hongkai Zhao.
10:00 - 10:30 am
Title: Transfer learning on multi-dimensional data
Abstract: The development of efficient surrogates of partial differential equations (PDEs) is a critical step toward scalable modeling of complex, multiscale systems-of-systems. Convolutional neural networks (CNNs) have gained popularity as the basis for such surrogate models due to their success in capturing high-dimensional input–output mappings and the negligible cost of a forward pass. However, the high cost of generating training data—typically via classical numerical solvers—raises the question of whether these models are worth pursuing over more straightforward alternatives with well-established theoretical foundations such as Monte Carlo (MC) methods. To reduce the cost of data generation, we propose training a CNN surrogate model on a mixture of high and low fidelity data. These data are generated as numerical solutions obtained on fine and coarse meshes or as a (d−1)-dimensional approximation of the d-dimensional problem. We demonstrate our approach on a multiphase flow test problem, using transfer learning to train a dense, fully convolutional encoder-decoder CNN on the two classes of data. Numerical results from a sample uncertainty quantification (UQ) task demonstrate that our surrogate model outperforms MC with several times the data generation budget. This presentation is based on two publications: Song and Tartakovsky, J. Mach. Learn. Model. Comput., 3(1), 31-47, 2021; and Propp and Tartakovsky, J. Mach. Learn. Model. Comput., 6(2), 13-27, 2025.
10:30 - 11:00 am Coffee Break
11:00 - 11:30 am
Title: Exploiting low-dimensional data structures and understanding neural scaling laws of transformers
Abstract: When training deep neural networks, a model’s generalization error is often observed to follow a power scaling law dependent on the model size and the data size. Perhaps the best-known example of such scaling laws is for transformer-based large language models (LLMs), where networks with billions of parameters are trained on trillions of tokens of text. A theoretical interest in LLMs is to understand why transformer scaling laws exist. To answer this question, we exploit low-dimensional structures in language datasets by estimating their intrinsic dimension and establish statistical estimation and mathematical approximation theories for transformers to predict the scaling laws. By leveraging low-dimensional data structures, we can explain transformer scaling laws in a way that respects the data geometry. Furthermore, we test our theory with empirical observations by training LLMs on language datasets and find strong agreement between the observed empirical scaling laws and our theoretical predictions.
11:30 - 12:00 pm
Title: Deep Operator Learning Approximation and Distributed Applications
Abstract: Neural operators are deep learning architectures designed to approximate operators, which are mappings between infinite-dimensional function spaces. They have been widely applied to solve problems involving partial differential equations, such as predicting solutions from given initial or boundary conditions. Despite their empirical success, some theoretical questions remain unresolved. In this talk, we will discuss the analysis of the error convergence and generalization; the results are valid for a broad class of widely used neural operators. These theoretical developments further motivate the design of distributed and federated learning algorithms that leverage the underlying structure of neural operator approximations to address two key challenges in practical applications: (1) handling heterogeneous and multiscale input functions, and (2) extending the framework to a multi-operator learning setting to enable generalization to previously unseen tasks. Numerical evidence regarding those applications will be presented.
12:00 - 1:30 pm Lunch Break
1:30 - 2:00 pm
Title: Data Driven Modeling for Scientific Discovery and Digital Twins
Abstract: We present a data-driven modeling framework for scientific discovery, termed Flow Map Learning (FML). This framework enables the construction of accurate predictive models for complex systems that are not amenable to traditional modeling approaches. By leveraging measurement data and the expressiveness of deep neural networks (DNNs), FML facilitates long-term system modeling and prediction even when governing equations are unavailable.
FML is particularly powerful in the context of Digital Twins, an emerging concept in digital transformation. With sufficient offline learning, FML enables the construction of simulation models for key quantities of interest (QoIs) in complex Digital Twins, even when direct mathematical modeling of the QoI is infeasible. During the online execution of a Digital Twin, the learned FML model can simulate and control the QoI without reverting to the computationally intensive Digital Twin itself.
As a result, FML serves as an enabling methodology for real-time control and optimization of the physical twin, significantly enhancing the efficiency and practicality of Digital Twin applications.
2:00 - 2:30 pm
Abstract: Dynamical transport has achieved state-of-the-art success in generative modeling for computer vision data. Here I'll focus on generative modeling of multiscale scientific data that are typically numerically ill-conditioned, such as those arising from Gaussian free fields or invariant distributions of stochastic PDEs. Faithfully generating samples that reproduce fine-scale features is challenging. I'll present optimal Lipschitz energy criteria for designing measure transport in generative modeling, as an alternative to optimal kinetic energy in optimal transport. By analytically and numerically optimizing these Lipschitz criteria over linear stochastic interpolants—particularly regarding noise and interpolation schedule design—we obtain generative flows that integrate with lower computational cost while maintaining robust spectral performance across different resolutions. This is demonstrated theoretically for Gaussian free fields and numerically for invariant distributions of stochastic Allen-Cahn and Navier-Stokes equations.
2:30 - 3:00 pm Coffee Break and group photo
3:00 - 3:30 pm
Title: Adaptive Conditional Diffusion for Time-Varying Inverse Problems
Abstract: Conditionally guided generative diffusion is the state-of-the-art method for creating highly accurate representations of complex high-dimensional objects, including everything from megapixel images to 3D protein structures. A fundamental limitation of generative methods, including diffusion, is an inability to adapt in real-time to time-varying systems with large distribution shifts. This talk presents an approach that incorporates advanced model-independent adaptive feedback control algorithms together with the generative diffusion process for adaptive conditionally guided tracking of time-varying systems. This general approach is demonstrated for solving extreme inverse problems of mapping noisy low-dimensional signals to high-dimensional time-varying distributions and images for complex charged particle beam dynamics in high energy particle accelerators.
3:30 - 4:00 pm
Title: Neural Operators for Learning and Solving Nonlinear PDEs
Abstract: Nonlinear partial differential equations (PDEs) play a central role in science and engineering, yet learning and solving them with neural networks remain challenging due to nonlinear complexity, limited data, and the presence of multiple solutions. I will introduce two complementary neural operator frameworks to address these issues. The Laplacian Eigenfunction-Based Neural Operator (LE-NO) leverages Laplacian eigenfunctions as basis functions to efficiently approximate nonlinear terms, reduce computational complexity, and generalize across boundary conditions. To handle multiple-solution nonlinear PDEs, we propose the Newton-Informed Neural Operator (NINO), which integrates classical Newton methods with operator learning to overcome ill-posedness and efficiently capture multiple solutions with fewer data requirements. Together, these approaches provide powerful tools for modeling, discovery, and prediction in complex dynamical systems.
4:15 - 5:35 pm
Speaker 1: Xiaoou Cheng (NYU) 4:15 - 4:25 pm
Title: The surprising efficiency of temporal difference learning for rare event prediction
Abstract: We quantify the efficiency of temporal difference (TD) learning over the direct, or Monte Carlo (MC), estimator for policy evaluation in reinforcement learning, with an emphasis on estimation of quantities related to rare events. Policy evaluation is complicated in the rare event setting by the long timescale of the event and by the need for relative accuracy in estimates of very small values. Specifically, we focus on least-squares TD (LSTD) prediction for finite state Markov chains, and show that LSTD can achieve relative accuracy far more efficiently than MC. We prove a central limit theorem for the LSTD estimator and upper bound the relative asymptotic variance by simple quantities characterizing the connectivity of states relative to the transition probabilities between them. Using this bound, we show that, even when both the timescale of the rare event and the relative accuracy of the MC estimator are exponentially large in the number of states, LSTD maintains a fixed level of relative accuracy with a total number of observed transitions of the Markov chain that is only polynomially large in the number of states.
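As a rough illustration of the two estimators compared in this abstract, the sketch below applies tabular LSTD and the direct MC estimator to the same simulated episodes of a small birth-death chain. The chain, its parameters (`N`, `P_UP`), and all function names are assumptions chosen for illustration and are not taken from the talk; the example makes no claim about which estimator is more accurate in this toy setting.

```python
import random

N = 5        # states 0..N; states 0 and N are absorbing (hypothetical toy chain)
P_UP = 0.3   # probability of stepping up from an interior state


def run_episode(rng, start=1):
    """Simulate the chain from `start` until it is absorbed at 0 or N."""
    path = [start]
    while 0 < path[-1] < N:
        path.append(path[-1] + (1 if rng.random() < P_UP else -1))
    return path


def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def lstd_tabular(episodes):
    """Undiscounted LSTD(0) with one indicator feature per interior state.

    Accumulates A = sum phi(x) (phi(x) - phi(x'))^T and b = sum phi(x) * reward,
    where the reward is 1 on entering state N and absorbing states have
    zero features. Returns estimates of P(hit N before 0) for states 1..N-1.
    """
    n = N - 1
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for path in episodes:
        for x, y in zip(path, path[1:]):
            i = x - 1
            A[i][i] += 1.0
            if 0 < y < N:
                A[i][y - 1] -= 1.0
            elif y == N:
                b[i] += 1.0
    return solve_linear(A, b)


def exact_hit_probability(x):
    """Closed-form gambler's-ruin probability of reaching N before 0 from x."""
    r = (1.0 - P_UP) / P_UP
    return (r ** x - 1.0) / (r ** N - 1.0)


rng = random.Random(0)
episodes = [run_episode(rng) for _ in range(20000)]

# Direct (MC) estimator: fraction of episodes from state 1 that end at N.
mc_estimate = sum(path[-1] == N for path in episodes) / len(episodes)
# LSTD estimator: uses every observed transition in the same episodes.
lstd_estimate = lstd_tabular(episodes)[0]
exact = exact_hit_probability(1)

print("exact:", exact, "MC:", mc_estimate, "LSTD:", lstd_estimate)
```

Both estimates can be compared with `exact`, the closed-form hitting probability, which is available only because the toy chain is so simple.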
Speaker 2: Xianjin Yang (Caltech) 4:25 - 4:35 pm
Title: Gaussian Processes for Solving Functional PDEs: Applications to Functional Renormalization Group Equations
Abstract: We present a Gaussian Process framework for solving non-perturbative Functional Renormalization Group equations by projecting their infinite-dimensional integro-differential form onto finite-dimensional ODE systems via kernel interpolation. The method incorporates physical priors through kernel design while avoiding the need for problem-specific numerical solvers. Numerical experiments demonstrate that the approach achieves high accuracy and broad applicability, offering a flexible, nonparametric, and physically guided alternative to traditional solvers.
Speaker 3: Yasamin Jalalian (Caltech) 4:35 - 4:45 pm
Title: Data-Efficient Kernel Methods for PDE Discovery
Abstract: For many problems in computational sciences and engineering, observational data exists for which the underlying physical models are not known. PDE discovery methods provide systematic ways to infer these physical models directly from data. We introduce a framework for identifying and solving PDEs using kernel methods. In particular, given observations of PDE solutions and source terms, we employ a kernel-based data-driven approach to learn the functional form of the underlying equation. We prove convergence guarantees and a priori error estimates for our methodology. Through numerical experiments, we demonstrate that our approach is particularly competitive in the data-poor regime where few observations are available.
Speaker 4: Mengze Wang (MIT) 4:45 - 4:55 pm
Title: A non-intrusive machine-learning framework for debiasing climate emulations
Abstract: Associated with the rapidly changing climate, the frequency and severity of extreme weather events have been increasing over the past decades. Since performing fully resolved climate simulations is computationally intractable, stakeholders and policymakers must rely on coarse models or emulators to quantify the risk of extremes. However, coarse models suffer from inherent bias due to the ignored “sub-grid” scales. In this work, we aim to develop data-driven correction operators for coarse-scale emulators. Previous efforts have attempted to train such operators by merely matching the statistics. Nevertheless, this approach falls short for events whose return periods are longer than the training data record, since the reference statistics have not converged. To overcome this limitation, we introduce a dynamical systems approach where the correction operator is trained using reference data and a coarse model simulation nudged toward that reference. This framework is applied to debiasing a linear Gaussian stochastic emulator, which is constructed to accurately capture the second-order statistics of the ERA5 data. With the nonlinear correction, the predicted higher-order moments of the local climate variables, including wind speed, temperature, and humidity, are more accurate than those of the free-running emulator. The non-Gaussian probability distributions are better estimated, particularly for the tails that correspond to extreme events with long return periods. We will also discuss the potential application of this framework to other emulators or coarse climate models.
Speaker 5: Kai Chang (MIT) 4:55 - 5:05 pm
Title: Extreme Event Aware Learning
Abstract: Quantifying and predicting rare and extreme events persists as a crucial yet challenging task in understanding complex dynamical systems, ubiquitous in science and engineering. Many practical challenges arise from the infrequency and severity of these events, including the considerable variance of simple sampling methods and the substantial computational cost of high-fidelity numerical simulations. Numerous data-driven methods have recently been developed to tackle these challenges. However, a typical assumption for the success of these methods is the occurrence of multiple extreme events, either within the training dataset or during the sampling process. This leads to accurate models in regions of quiescent events but with high epistemic uncertainty in regions associated with extremes. To overcome this limitation, we introduce the framework of Extreme Event Aware (e2a or eta) learning, or η-learning, which does not assume the existence of extreme events in the available data. η-learning reduces the uncertainty even in 'uncharted' extreme event regions by enforcing the extreme event statistics of a few observables during training, which can be available or assumed through qualitative arguments or other forms of analysis. This type of statistical regularization yields models that fit the observed data while remaining consistent with the prescribed statistics of some observables, enabling the generation of unprecedented extreme events even when the training data lack extremes. Theoretical results based on optimal transport offer a rigorous justification and highlight the optimality of the introduced method. Additionally, extensive numerical experiments illustrate the favorable properties of the η-learning framework on several prototype problems and real-world precipitation downscaling problems.
Speaker 6: Charlotte Moser (UW-Madison) 5:05 - 5:15 pm
Title: Bridging the Model Hierarchy: A Physics-Guided Machine Learning Framework for High-Resolution Climate Simulation Enhancement
Abstract: Operational models are widely used to understand and predict natural phenomena. They are high-resolution and contain many crucial variables. Despite numerous successes, biases persist in most of these models, especially in identifying extreme events and reproducing the observed statistics in nature. However, due to the complexity of these models, it is challenging to directly modify them to improve accuracy. On the other hand, conceptual and intermediate coupled models accurately characterize certain features of nature. Yet, they contain only a subset of variables within a specific domain and at lower resolutions. By leveraging the strengths of different models, we develop a robust physics-driven machine learning modeling framework that bridges the model hierarchy through effective latent space data assimilation. It integrates models of varying complexities to capture their respective advantages, enabling simpler models to directly support operational models. The resulting model not only inherits the benefits of the operational models, including their high resolution and comprehensive set of variables, but also globally enhances the accuracy through local dynamical and statistical improvements provided by the simpler models. The latent space technique identifies the dominant nonlinear features of the underlying dynamics, which facilitates effective communication between models. Finally, the machine learning representation of the model significantly enhances simulation efficiency, which provides massive high-quality synthetic data that mimics nature and advances the study of extreme events with uncertainty quantification. The framework has been applied to enhance the performance of CMIP6 models in characterizing El Niño complexity by utilizing simpler yet statistically accurate models.
Speaker 7: Yinling Zhang (UW-Madison) 5:15 - 5:25 pm
Title: A Causality-Based Learning Approach for Underlying Dynamics of Complex Dynamical Systems
Abstract: Discovering the underlying dynamics of complex systems from data is an important practical topic. Constrained optimization algorithms are widely used and have led to many successes. Yet such purely data-driven methods may yield incorrect physics in the presence of random noise and cannot easily handle incomplete data. In this paper, a new iterative learning algorithm for complex turbulent systems with partial observations is developed that alternates between identifying model structures, recovering unobserved variables, and estimating parameters. First, a causality-based learning approach is utilized for the sparse identification of model structures, which takes into account certain physics knowledge that is pre-learned from data. It has unique advantages in coping with indirect coupling between features and is robust to stochastic noise. A practical algorithm is designed to facilitate the causal inference for high-dimensional systems. Next, a systematic nonlinear stochastic parameterization is built to characterize the time evolution of the unobserved variables. Closed analytic formulae from an efficient nonlinear data assimilation scheme are exploited to sample the trajectories of the unobserved variables, which are then treated as synthetic observations to advance a rapid parameter estimation. Furthermore, the localization of the state variable dependence and the physics constraints are incorporated into the learning procedure, which mitigate the curse of dimensionality and prevent the finite-time blow-up issue. Numerical experiments show that the new algorithm succeeds in identifying the model structure and providing suitable stochastic parameterizations for many complex nonlinear systems with chaotic dynamics, spatiotemporal multiscale structures, intermittency, and extreme events.
Speaker 8: Zhongrui Wang (UW-Madison) 5:25 - 5:35 pm
Title: Modeling partially observed nonlinear dynamical systems and efficient data assimilation via conditional Gaussian Koopman network
Abstract: A discrete-time conditional Gaussian Koopman network (CGKN) is developed to learn surrogate models for efficient state forecasting and data assimilation (DA) in high-dimensional, complex dynamical systems. Focusing on nonlinear, partially observed systems common in engineering and Earth science, the approach leverages Koopman embedding to construct latent variables representing unobserved states, whose dynamics are conditionally linear given the observed states. This structure yields a conditional Gaussian system, enabling the posterior distribution of latent states to be evaluated analytically. These closed-form DA updates are embedded directly into the learning process, resulting in a unified framework that integrates scientific machine learning (SciML) with data assimilation. The discrete-time CGKN is tested on canonical nonlinear PDE systems with intermittent and turbulent features, demonstrating forecasting performance on par with state-of-the-art SciML methods and delivering efficient, accurate DA with uncertainty quantification. Beyond DA, the CGKN framework exemplifies how SciML models can be designed to seamlessly interface with outer-loop applications such as design optimization, inverse problems, and optimal control.
Day 2: September 28, 2025
9:00 - 9:30 am
Title: Building better representations: physics, multifidelity and kernels
Abstract: Modern machine learning, due to its ability to construct versatile representations, has shown remarkable promise in multiple applications. However, brute force use of neural networks, even when they have huge numbers of trainable parameters, can fail to provide highly accurate predictions for problems in the physical sciences. We present a collection of ideas about how enforcing physics, exploiting multifidelity knowledge and the kernel representation of neural networks can lead to a significant increase in efficiency and/or accuracy. Moreover, these ideas can be viewed as part of a broader concept, compositionality, which permeates mathematics, physics and computer science.
9:30 - 10:00 am
Title: Adaptive information refinement for active media
Abstract: A prevalent paradigm in machine learning is the training of deep neural networks to construct nonlinear predictors capable of tasks such as image recognition, natural language processing, or object detection. The notion of layered descriptions of natural phenomena itself has a long history within scientific computing, encompassing techniques such as multigrid, adaptive mesh refinement, or wavelets. This talk introduces a technique that has a bit of the flavor of both approaches for the modeling of living tissue, a heterogeneous and active medium. Similar to adaptive mesh refinement, the coarsest spatial levels are described by partial differential equations capturing homogenized tissue response. In contrast to adaptive mesh refinement, there is no expectation that finer spatial resolution leads to a convergent numerical scheme in the classical analysis sense. Rather, at small scales the marked heterogeneity of biological tissue comes into play with numerous intertwined filamentary structures interacting within a complex fluid, and with numerous active constituents such as molecular motors or metabolic reactions. In the approach presented here, the finer levels are predicted from an adaptive mesh hierarchy that uses nonlinear interpolation and prolongation operators, themselves trained on the fly from previous time steps. In essence, an adaptive neural network is continuously updated to furnish the microscopic information required to educe homogenized constitutive laws. Applications are presented for various problems in biological motility.
10:00 - 10:30 am
Title: VoroClust: scalable, density-based clustering for remote sensing
Abstract: Although supervised machine learning provides a powerful framework for image classification and segmentation, it requires comprehensive, consistent data sets, which are not available for many remote-sensing applications. Remote-sensing datasets are expensive to collect, and each is acquired under different environmental conditions, or with significant variations in system operating parameters. Unsupervised clustering algorithms analyze the structure of each dataset, rather than drawing on similarities with other examples, and are thus well suited for practical remote-sensing applications.
In this talk, we introduce Voronoi Clustering (VoroClust), a fast, density-based unsupervised clustering algorithm applicable to high-resolution and high-dimensional data. VoroClust runs as fast as distance-based clustering methods, while capturing complex regional geometries at least as well as current density-based methods. It uses a data-centered sphere cover to reduce computational demands, while still capturing data topology. It then propagates clusters outward from local peaks in density. We show that VoroClust provides fast, state-of-the-art clustering for both high-resolution polarimetric synthetic aperture radar (PolSAR) and high-dimensional hyperspectral imaging (HSI) datasets.
10:30 - 11:00 am Coffee Break
11:00 - 11:30 am
Abstract: Neural operators, such as DeepONets, have changed the paradigm in high-dimensional nonlinear regression, paving the way for significant generalization and speed-up in computational engineering applications. First, we demonstrate the use of DeepONet to infer flow fields around unseen airfoils with the aim of constrained shape optimization, an important design problem in aerodynamics that typically taxes computational resources heavily. We successfully optimize the geometries with respect to maximizing the lift-to-drag ratio and present results that display little to no degradation in prediction accuracy while reducing the online optimization cost by orders of magnitude compared to a high-order CFD solver.
Additionally, the notion of uncertainty quantification (UQ) is critical for many scientific and engineering applications, including design optimization. While several different methods for UQ have been proposed for various operator learning frameworks, the sources of uncertainty (e.g., discretization, finite sampling, and noisy measurements) are often treated independently. We explore current research into the potential interaction effects of these sources of uncertainty and compare the accuracy and appropriateness of various UQ methods in the literature that propose to quantify them. The goal of the research is to provide relevant comparisons and informative guidelines on UQ methods for end users seeking to apply neural operators to design optimization. We claim that not only should we consider optimizing an objective function with neural operator surrogates, such as the lift-to-drag ratio of an airfoil, but we should also trade off that maximization against a design that has high confidence provided by an appropriate UQ scheme.
11:30 - 12:00 pm
Abstract: Operator learning is a recently developed generalization of regression to mappings between functions. It promises to drastically reduce expensive numerical integration of PDEs to fast evaluations of mappings between functional states of a system, i.e., surrogate and reduced-order modeling. Operator learning has already found applications in several areas such as modeling sea ice, combustion, and atmospheric physics. Recent approaches towards integrating uncertainty quantification into the operator models have relied on likelihood-based methods to infer parameter distributions from noisy data. However, stochastic operators may yield actions from which a likelihood is difficult or impossible to construct. In this paper, we introduce GenUQ, a measure-theoretic approach to UQ that avoids constructing a likelihood by introducing a generative hyper-network model that produces parameter distributions consistent with observed data. We demonstrate that GenUQ outperforms other UQ methods in three example problems: recovering a manufactured operator, learning the solution operator to a stochastic elliptic PDE, and modeling the failure location of porous steel under tension.
12:00 - 12:30 pm
Title: NeurAM: Combining ML-based Nonlinear Dimensionality Reduction with Multi-fidelity and Stratified Monte Carlo Estimators
Abstract: I will discuss a new approach for combining machine learning-based nonlinear dimensionality reduction with control-variate-based multi-fidelity and stratified estimators. I will detail how supervised autoencoders are used to discover a one-dimensional neural active manifold (NeurAM) in a way that effectively captures most of the model output variability. A key benefit of this process is the simultaneous learning of a surrogate model, which operates on this reduced manifold. Furthermore, I will provide both theoretical and numerical evidence demonstrating the variance reduction achievable by NeurAM, and discuss its application to the solution of inverse problems.
12:30 - 2:00 pm Lunch Break
2:00 - 2:30 pm
Title: Energetic Variational Neural Network Discretizations Of Gradient Flows
Abstract: In this talk, I will describe structure-preserving neural-network-based numerical schemes to solve both L2-gradient flows and generalized diffusions. By using neural networks as tools for spatial discretization, we introduce a structure-preserving Eulerian algorithm to solve L2-gradient flows and a structure-preserving Lagrangian algorithm to solve generalized diffusions. The Lagrangian algorithm for a generalized diffusion evolves the “flow map” which determines the dynamics of the system. This avoids the non-trivial task of computing the Wasserstein distance between two probability functions. Unlike most existing methods that construct numerical discretizations based on the strong or weak form of the underlying PDE, our schemes are constructed using variational formulations of these PDEs to preserve their variational structures. Instead of directly solving the nonlinear systems obtained after temporal and spatial discretization, the minimizing movement scheme is utilized to evolve the solutions. This guarantees the monotonic decay of the energy of the system, and is crucial for the long-term stability of numerical computation. A few numerical experiments are presented to demonstrate the accuracy and energy stability of the numerical schemes.
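The energy-decay property of the minimizing movement scheme can be seen in a scalar toy problem. The sketch below is a minimal illustration, assuming a double-well energy and a backtracking gradient-descent inner solver in place of the neural-network discretization described in the abstract; `energy`, `minimizing_movement_step`, and the step size `tau` are hypothetical names and values.

```python
def energy(u):
    """Double-well energy E(u) = (u^2 - 1)^2 / 4 (an illustrative choice)."""
    return 0.25 * (u * u - 1.0) ** 2


def energy_grad(u):
    """Derivative E'(u) = u (u^2 - 1)."""
    return u * (u * u - 1.0)


def minimizing_movement_step(u_prev, tau, inner_iters=50):
    """One minimizing movement step.

    Approximately minimizes J(u) = (u - u_prev)^2 / (2 tau) + E(u) by gradient
    descent with backtracking, starting from u_prev. Only updates that do not
    increase J are accepted.
    """
    def J(u):
        return (u - u_prev) ** 2 / (2.0 * tau) + energy(u)

    u = u_prev
    for _ in range(inner_iters):
        g = (u - u_prev) / tau + energy_grad(u)
        step = tau
        # Halve the step until the trial point does not increase J.
        while step > 1e-12 and J(u - step * g) > J(u):
            step *= 0.5
        if step <= 1e-12:
            break  # no acceptable step found; keep the current iterate
        u -= step * g
    return u


tau = 0.1
u = 2.0
energies = [energy(u)]
for _ in range(50):
    u = minimizing_movement_step(u, tau)
    energies.append(energy(u))

print("final u:", u)
print("energy is non-increasing:",
      all(b <= a for a, b in zip(energies, energies[1:])))
```

Because every accepted inner update does not increase J, and J evaluated at `u_prev` equals E(`u_prev`), each outer step satisfies E(u_new) ≤ E(u_prev) even when the inner minimization is inexact.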
2:30 - 3:00 pm
Title: Scaling Scientific Machine Learning: Integrating Theory and Numerics in Both Training and Inference
Abstract: Scaling scientific machine learning (SciML) requires overcoming bottlenecks at both training and inference. On the training side, we study the statistical convergence rate and limits of deep learning for solving elliptic PDEs from random samples. While our theory predicts optimal polynomial convergence for PINNs, optimization becomes prohibitively ill-conditioned as networks widen. By adapting descent strategies to the optimization geometry, we obtain scale-invariant training dynamics that translate polynomial convergence into concrete compute and yield compute-optimal configurations. On the inference side, I will introduce Simulation-Calibrated SciML (SCaSML), a physics-informed post-processing framework that improves surrogate models without retraining or fine-tuning. By enforcing physical laws, SCaSML delivers trustworthy corrections (via Feynman-Kac simulation) with approximate confidence intervals, achieves faster and near-optimal convergence rates, and supports online updates for digital twins. Together, these results integrate theory and numerics to enable predictable, reliable scaling of SciML in both training and inference. This is based on joint work with Lexing Ying, Jose Blanchet, Haoxuan Chen, Zexi Fan, Youheng Zhu, Shihao Yang, Jasen Lai, Sifan Wang, and Chunmei Wang.
3:15 - 4:25 pm
Speaker 1: Harihara Maharna (Notre Dame) 3:15 - 3:25 pm
Title: Energetic Variational Neural Network Discretization of the Cahn-Hilliard Equation
Abstract: We present a structure-preserving Lagrangian algorithm for solving the Cahn-Hilliard equation. The algorithm employs neural networks as tools for spatial discretization. The proposed scheme is constructed directly from the energy-dissipation law. This guarantees the monotonic decay of the system's free energy, which avoids unphysical states of solutions and is crucial for the long-term stability of numerical computations. To address challenges arising from interface problems, we introduce an adaptive sampling method to better capture the diffuse interface. Moreover, we solve for the increment of the flow map, which makes the approach memory-efficient. The proposed neural-network-based scheme is mesh-free, allowing us to solve gradient flows in high dimensions. Numerical experiments are presented to demonstrate the accuracy and energy stability of the proposed numerical schemes.
Speaker 2: Yixuan Sun (ANL) 3:25 - 3:35 pm
Title: Matrix-free Neural Preconditioners for the Dirac Equations in Lattice Gauge Theory
Abstract: Linear systems arise in calculating observables in lattice quantum chromodynamics (QCD). Solving these Hermitian positive definite systems, which are sparse but ill-conditioned, typically requires iterative methods such as Conjugate Gradient (CG), which are time-consuming and computationally expensive. Preconditioners can accelerate this process, but constructing them is often challenging and adds computational overhead, especially in large systems. In this talk, I will present a framework that leverages operator learning techniques to construct linear maps as effective preconditioners without relying on explicit matrices, enabling efficient training and integration with the CG solver. In the Schwinger model (U(1) gauge theory in 1+1 dimensions), this scheme reduces condition numbers and halves iteration counts in relevant parameter ranges, while also demonstrating zero-shot learning ability across different lattice sizes.
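Illustrative sketch (not taken from the talk): a preconditioner can be plugged into CG without ever forming a matrix, as long as both the operator and the preconditioner are available as functions. The example below is a generic matrix-free preconditioned CG on a small real symmetric positive definite toy system. The toy operator and the Jacobi (diagonal) preconditioner are assumptions standing in for the lattice Dirac system and the learned preconditioner, neither of which is reproduced here.

```python
import numpy as np

def pcg(apply_A, b, apply_M=None, tol=1e-8, max_iter=1000):
    """Preconditioned CG for real SPD systems; A and M are plain functions."""
    if apply_M is None:
        apply_M = lambda r: r  # identity, i.e. unpreconditioned CG
    x = np.zeros_like(b)
    r = b - apply_A(x)
    z = apply_M(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(1, max_iter + 1):
        Ap = apply_A(p)
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        z = apply_M(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Toy ill-conditioned SPD operator: widely spread diagonal plus weak coupling.
n = 200
d = np.logspace(0, 4, n)

def apply_A(v):
    out = d * v
    out[:-1] += 0.1 * v[1:]
    out[1:] += 0.1 * v[:-1]
    return out

def apply_jacobi(r):
    # Hypothetical stand-in for a learned preconditioner.
    return r / d

rng = np.random.default_rng(0)
b = rng.standard_normal(n)
x_plain, iters_plain = pcg(apply_A, b)
x_pre, iters_pre = pcg(apply_A, b, apply_M=apply_jacobi)
print("CG iterations without / with preconditioner:", iters_plain, iters_pre)
```

For complex Hermitian systems such as those in lattice gauge theory, the inner products would need to be conjugated (for example with np.vdot); the real-valued version is kept here for brevity.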
Speaker 3: Tong Ding (Purdue) 3:35 - 3:45 pm
Title: Matrix analysis for shallow ReLU neural network least-squares approximations
Abstract: Neural networks provide an effective tool for approximating some challenging functions. However, fast and accurate solvers for the relevant dense linear systems are rarely studied. This work gives a comprehensive characterization of the ill-conditioning of some dense linear systems arising from shallow neural network least-squares approximations. It shows that the systems are typically very ill-conditioned, and that the conditioning gets even worse for challenging functions such as those with jumps. This makes the systems hard to solve with typical iterative solvers. On the other hand, we further show the existence of some intrinsic rank structures within those matrices, which make it feasible to obtain robust direct solutions with nearly linear complexity. Most of our discussions focus on the 1D case, but extensions to some 2D cases are also given.
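Illustrative sketch (not taken from the talk): the ill-conditioning can be observed numerically on a small 1D example. The code builds the least-squares design matrix whose columns are ReLU units max(x - b_j, 0) sampled on [0, 1] and prints its condition number for a few widths. The uniform breakpoints, unit weights, and sample count are assumptions made for illustration only.

```python
import numpy as np

def relu_design_matrix(n_neurons, n_samples=400):
    """Columns are ReLU units max(x - b_j, 0) with uniform breakpoints b_j."""
    x = np.linspace(0.0, 1.0, n_samples)
    breakpoints = np.arange(n_neurons) / n_neurons
    return np.maximum(x[:, None] - breakpoints[None, :], 0.0)

# 2-norm condition number of the design matrix for increasing widths.
conds = {n: np.linalg.cond(relu_design_matrix(n)) for n in (10, 20, 40)}
for n, c in conds.items():
    print(f"{n:3d} neurons: cond(A) = {c:.3e}")
```

The normal-equations matrix A^T A has the square of this condition number, which is one reason iterative solvers struggle on such systems.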
Speaker 4: Hojin Kim (Purdue) 3:45 - 3:55 pm
Title: Differentiable physics for generalizable closure modeling of separated flows
Abstract: The computational modeling of turbulent flows is challenging due to the high computational requirements of resolving all spatial and temporal scales. Machine learning (ML) methods have been proposed for the construction of turbulence closures, which can alleviate these requirements by modeling the effects of unresolved structures on resolved quantities. However, several ML-based turbulence models show weak generalization capabilities when faced with varying geometries and the consequent closure requirements. In this talk, we will showcase results from a differentiable programming framework for learning generalizable closure models. Specifically, our framework involves training a graph neural network (GNN) model for subgrid stresses, which is embedded in a finite-element (FEM) solver. This is achieved by chaining gradients computed by automatic differentiation for the GNN with the discrete adjoint of the FEM solver. This enables learning of the subgrid stress given access to quantity-of-interest information from the fully resolved flow field. In this research, we leverage the mesh-invariant property of GNNs to learn subgrid models for separated flows compiled from a range of different separation physics (i.e., both smooth and sharp separation, across various geometries). Our formulation enables a single GNN-based subgrid closure model that generalizes across different geometries as well as separation phenomena, and supports the conclusion that generalizable ML closures may be constructed using differentiable physics.
Speaker 5: Yuezhu Xu (Purdue) 3:55 - 4:05 pm
Title: Learning Neural Dynamical Systems with Dissipative Guarantees
Abstract: We study data-driven identification of dissipative neural dynamical models for unknown nonlinear systems known a priori to be dissipative. Because enforcing dissipation during training is challenging, we adopt a two-stage “fit-then-adjust” framework with two instantiations: (i) a direct neural state-update model and (ii) a neural Koopman model. In Stage 1, we fit an accurate unconstrained model to trajectory data. In Stage 2, we minimally modify the learned parameters to certify strict dissipativity. We derive sufficient conditions that (a) certify dissipativity of the identified model itself and (b) provide certificates that transfer dissipativity back to the underlying system, enabling closed-loop guarantees for controllers designed from the model. Experiments on the Duffing oscillator demonstrate that the resulting models closely match the data while enjoying provable dissipativity guarantees.
Speaker 6: Asini Anuradhika Konpola (Purdue) 4:05 - 4:15 pm
Title: Bayesian inference for an agent-based model of pattern formation in zebrafish skin
Abstract: Complex systems in biology are those in which the interactions among individual agents give rise to diverse group dynamics. To better understand the factors driving these dynamics, we estimate parameters of the mathematical models that simulate such systems. In particular, we build on an existing model of pattern formation in zebrafish skin and apply the Approximate Approximate Bayesian Computation (AABC) method to estimate the model parameters. We use pair-correlation functions of two distinct cell types in the zebrafish skin as summary statistics in the AABC method. Our results highlight which zebrafish mutants, summary statistics, and developmental time points provide the most informative insights into the mechanisms behind pattern formation. We also discuss our analysis to determine the hyperparameter values in the AABC method that yield the most reliable parameter estimates.
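Illustrative sketch (not taken from the talk): the basic idea of likelihood-free inference with summary statistics can be shown with plain rejection ABC, which is simpler than the AABC variant named in the abstract. A parameter drawn from the prior is kept only if the summary of its simulated data lies within a tolerance of the observed summary. The Gaussian toy simulator, the sample-mean summary, the uniform prior, and the tolerance are all assumptions standing in for the agent-based zebrafish model and its pair-correlation summaries.

```python
import numpy as np

def abc_rejection(observed_summary, simulate, summarize, sample_prior,
                  n_draws, eps, rng):
    """Keep prior draws whose simulated summary is within eps of the data."""
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        s = summarize(simulate(theta, rng))
        if abs(s - observed_summary) <= eps:
            accepted.append(theta)
    return np.array(accepted)

# Toy stand-ins (assumptions): Gaussian simulator, sample-mean summary.
def simulate(theta, rng):
    return rng.normal(loc=theta, scale=1.0, size=100)

def summarize(data):
    return data.mean()

def sample_prior(rng):
    return rng.uniform(-5.0, 5.0)

rng = np.random.default_rng(0)
true_theta = 2.0
observed_summary = summarize(simulate(true_theta, rng))

accepted = abc_rejection(observed_summary, simulate, summarize,
                         sample_prior, n_draws=5000, eps=0.1, rng=rng)
print("accepted draws:", accepted.size)
print("approximate posterior mean:", accepted.mean())
```

The tolerance eps plays the role of a hyperparameter: smaller values give a sharper approximation of the posterior at the cost of fewer accepted draws.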
Speaker 7: Sreehari Manikkan (Purdue) 4:15 - 4:25 pm
Title: Thermal lumped parameter network with entropy switch for Bayesian network discovery
Abstract: Thermal resistance-capacitance (RC) networks are widely adopted grey-box models for thermal systems spanning buildings, batteries, electronic devices, and biological systems. These models support applications such as model predictive control, digital twins, and fault detection, where both interpretability and predictive accuracy are essential. In real-world settings, where physical properties are often unknown, inverse methods are required to infer both the RC network structure and its parameters from data. However, existing calibration techniques rely on deterministic optimization, producing point estimates that are susceptible to overfitting and lack uncertainty quantification, which poses risks in safety-critical applications. Current network discovery methods typically involve multiple calibrations across candidate models, are often application-specific, and lack both uncertainty quantification and adaptability. While Bayesian methods offer a principled framework for uncertainty quantification and robustness to noisy data, they remain underexplored in thermal RC network discovery. Furthermore, the literature lacks a unified, application-agnostic representation of thermal RC networks. To address these limitations, we propose a generic, dynamic, and adaptable thermal RC network framework equipped with a thermodynamics-based switch, called the entropy switch, for Bayesian network discovery. The proposed approach simultaneously infers both the unknown parameters and the underlying network structure through Bayesian inference. We validate and verify the framework using a 4R3C synthetic example under varying temperature noise levels, considering both fully and partially observed scenarios. We further demonstrate its applicability on experimental data from four offices of a real building and from a group of residential buildings.
In the former case, when trained on nearly one day of data, the method identifies networks capable of accurately predicting temperatures up to nine days ahead. For the residential building case study, where missing data, stochastic dynamics, and greater uncertainty in the input signals are present, the predictions match the data quantitatively for 36 hours and qualitatively beyond that. The concept of entropy switches is highly generalizable, and we hope it will inspire adoption in broader domains.
4:30 pm Discussion and Departure