Title: Towards Hybrid Neural Solvers based on Numerical Simulations and Deep Learning
Abstract:
This presentation will cover recent advances in deep learning for physics simulations. A key focus is the use of numerical solvers capable of providing gradient information, i.e. "differentiable simulators". These solvers integrate seamlessly with deep learning algorithms and offer several practical advantages, particularly in the context of flow simulations. However, many existing fluid simulation environments do not provide gradient computation. Consequently, I will demonstrate a strategy for leveraging non-differentiable simulators, which serves as a transitional step and a middle ground in this context. As an outlook, I will explore the potential integration of these methods with diffusion modeling techniques, which offer powerful tools for handling uncertainties.
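To make the idea of a differentiable simulator concrete, here is a minimal sketch (illustrative, not code from the talk): an explicit finite-difference solver for the 1-D heat equation written in JAX, unrolled so that the gradient of a data-matching loss with respect to the viscosity flows through every solver step. Grid size, step counts, and the learning rate are assumptions chosen for the example.

import jax
import jax.numpy as jnp

N, DX, DT, STEPS = 64, 1.0 / 64, 1e-4, 200

def step(u, nu):
    # One explicit Euler step of u_t = nu * u_xx with periodic boundaries.
    lap = (jnp.roll(u, -1) - 2.0 * u + jnp.roll(u, 1)) / DX**2
    return u + DT * nu * lap

def simulate(nu, u0):
    # Unrolled time loop: every step stays on the autodiff tape.
    u = u0
    for _ in range(STEPS):
        u = step(u, nu)
    return u

x = jnp.linspace(0.0, 1.0, N, endpoint=False)
u0 = jnp.sin(2.0 * jnp.pi * x)
target = simulate(0.5, u0)  # synthetic observation with "true" viscosity 0.5

def loss(nu):
    return jnp.mean((simulate(nu, u0) - target) ** 2)

nu = 0.1
for _ in range(100):
    nu = nu - 2.0 * jax.grad(loss)(nu)  # gradient descent through the solver
print(nu)  # converges towards the true value 0.5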
Title: Birth of Diffusion Models: Insights into Their Connection with PDE Solving
Abstract:
Diffusion models, a pioneering family of generative AI methods, have significantly propelled the creation of synthetic images, audio, 3D objects/scenes, and proteins. Beyond generation, these models have found practical applications in tasks such as media content editing and restoration, as well as in diverse domains such as robot learning.
In this talk, we'll explore the origins of diffusion models, gaining insight into their mechanism as differential equation (DE) solving (Song et al. ICLR 2020). Building on this view, we introduce FP-Diffusion (Lai et al. ICML 2023), which improves the diffusion model by aligning it with its underlying mathematical structure, a system of PDEs known as the Fokker-Planck (FP) equation. Additionally, the link between diffusion models and DE solving reveals a limitation: the slow sampling speed of thousand-step generation. Motivated by this, we'll introduce the Consistency Trajectory Model (CTM) (Kim & Lai et al. ICLR 2024), an innovative method enabling one-step diffusion model generation while preserving high fidelity and diversity. The aim of this talk is to inspire mathematical research into diffusion models and, more generally, deep learning methods for solving differential equations.
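To fix ideas (the notation here is generic, not copied from the cited papers): in the DE view of Song et al., data are perturbed by a forward SDE
$$dx = f(x,t)\,dt + g(t)\,dw_t,$$
whose marginal densities $p_t$ satisfy the Fokker-Planck equation
$$\partial_t p_t = -\nabla \cdot \big(f\, p_t\big) + \tfrac{1}{2}\, g(t)^2\, \Delta p_t,$$
and generation integrates reverse-time dynamics driven by a learned score $s_\theta(x,t) \approx \nabla_x \log p_t(x)$, e.g. the probability-flow ODE
$$\frac{dx}{dt} = f(x,t) - \tfrac{1}{2}\, g(t)^2\, s_\theta(x,t).$$
Roughly speaking, FP-Diffusion regularizes $s_\theta$ toward consistency with this Fokker-Planck structure, while CTM learns to jump along the ODE trajectory in a single step.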
Title: Analysis and computation of plane-parallel flows at a small viscosity
Abstract:
Singular perturbations occur when a small coefficient multiplies the highest-order derivatives in a system of partial differential equations. From the physical point of view, singular perturbations lead to the formation of narrow regions close to the boundary of a domain, known as boundary layers, where numerous crucial physical processes occur. This presentation explores the analysis of viscous boundary layers in plane-parallel flows and its use in developing efficient numerical methods, such as Physics-Informed Neural Networks (PINNs).
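As a one-dimensional illustration of the phenomenon (a standard textbook example, not a result specific to the talk), consider
$$-\varepsilon u_\varepsilon'' + u_\varepsilon' = f \ \text{ in } (0,1), \qquad u_\varepsilon(0) = u_\varepsilon(1) = 0.$$
As $\varepsilon \to 0$, $u_\varepsilon$ converges to the solution $u^0$ of the reduced problem $(u^0)' = f$, $u^0(0) = 0$, except in a layer of width $O(\varepsilon)$ near the outflow boundary $x = 1$, where the discrepancy is captured by a corrector of the form
$$\theta_\varepsilon(x) \approx -u^0(1)\, e^{-(1-x)/\varepsilon}.$$
Enriching a discretization, or a PINN ansatz, with such correctors can restore accuracy uniformly in $\varepsilon$.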
Title: Optimal learning of piecewise smooth functions
Abstract:
Deep learning has established itself as by far the most successful machine learning approach for sufficiently complex tasks. Nowadays, it is used in a wide range of highly complex applications such as natural language processing and even scientific computing. Its first major breakthrough, however, was achieved by shattering the state of the art in image classification. We revisit the problem of classification, or more generally the learning of functions with jumps, by deep neural networks and attempt to answer why deep networks are remarkably effective in this regime. We will interpret the learning of classifiers as finding piecewise constant functions from labelled samples. Piecewise constant/smooth functions also appear in many applications associated with physical processes, such as transport problems where shock fronts develop. We then precisely link the hardness of the learning problem to the complexity of the regions where the function is smooth. Concretely, we will establish fundamental lower bounds on the learnability of certain regions. Finally, we will show that in many cases these optimal bounds can be achieved by deep-neural-network-based learning. In quite realistic settings, we will observe that deep neural networks can learn high-dimensional classifiers without a strong dependence of the learning rates on the dimension.
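In this setting (the notation here is ours, chosen for illustration), a classifier is identified with a piecewise constant function
$$f(x) = \sum_{i=1}^{K} c_i\, \mathbf{1}_{\Omega_i}(x), \qquad \bigcup_{i=1}^{K} \Omega_i = [0,1]^d,$$
and the attainable learning rate from $n$ labelled samples is governed by the complexity of the pieces $\Omega_i$ (equivalently, of their boundaries) rather than by the ambient dimension $d$ alone.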
Title: Accurate, efficient, and reliable learning of deep neural operators
Abstract:
It is widely known that neural networks (NNs) are universal approximators of continuous functions. However, a lesser-known but powerful result is that an NN can also accurately approximate any nonlinear continuous operator. This universal approximation theorem for operators suggests the potential of deep neural networks (DNNs) for learning operators of complex systems. In this talk, I will present the deep operator network (DeepONet), which learns various operators representing deterministic and stochastic differential equations. I will also present several extensions of DeepONet, such as DeepM&Mnet for multiphysics problems, DeepONet with proper orthogonal decomposition or Fourier decoder layers, MIONet for multiple-input operators, and multifidelity DeepONet. I will demonstrate the effectiveness of DeepONet and its extensions on diverse multiphysics and multiscale problems, such as bubble growth dynamics, high-speed boundary layers, electroconvection, hypersonics, geological carbon sequestration, and full waveform inversion. Deep learning models are usually limited to interpolation scenarios; I will quantify the extrapolation complexity and present a complete workflow to address the challenge of extrapolation for deep neural operators.
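As a concrete illustration of the architecture (a minimal sketch, not the authors' implementation), a DeepONet approximates an operator $G$ via $G(u)(y) \approx \sum_{k=1}^{p} b_k(u(x_1),\dots,u(x_m))\, t_k(y)$: a branch net produces coefficients $b_k$ from the input function sampled at $m$ fixed sensors, and a trunk net produces basis values $t_k$ at the query location $y$. Network sizes and initialization below are illustrative assumptions.

import jax
import jax.numpy as jnp

def mlp_init(key, sizes):
    # Simple scaled-Gaussian initialization for a fully connected network.
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

def mlp(params, x):
    for W, b in params[:-1]:
        x = jnp.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b  # linear output layer

def deeponet(params, u_sensors, y):
    branch_params, trunk_params, bias = params
    b = mlp(branch_params, u_sensors)  # (p,) coefficients from input function u
    t = mlp(trunk_params, y)           # (p,) basis values at query location y
    return jnp.dot(b, t) + bias        # G(u)(y)

m, p = 100, 40  # number of sensors, latent dimension
kb, kt = jax.random.split(jax.random.PRNGKey(0))
params = (mlp_init(kb, [m, 64, p]), mlp_init(kt, [1, 64, p]), 0.0)

u = jnp.sin(jnp.linspace(0.0, jnp.pi, m))      # input function on the sensor grid
print(deeponet(params, u, jnp.array([0.3])))   # untrained operator output at y = 0.3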
Title: Optimal Approximation Rates for Neural Networks on Sobolev and Besov Spaces
Abstract:
Neural networks have recently been widely applied to problems in machine learning and scientific computing. In an effort to understand their potential, we will study how efficiently neural networks can approximate functions from Sobolev and Besov spaces. We will focus on the problem of determining optimal $L_p$-approximation rates on the Sobolev space $W^s(L_q)$ using deep neural networks with the ReLU$^k$ activation function. Existing sharp results for this problem are only available when $q=\infty$, i.e. when the derivatives are measured in $L_\infty$. In our work, we extend these results and determine the best possible rates for all $p$, $q$, and $s$ for which a compact Sobolev embedding holds, i.e. when $s/d > 1/q - 1/p$. We will discuss some of the technical details of the proof and also indicate what happens at the boundary of the Sobolev embedding, i.e. when $s/d = 1/q - 1/p$. If time permits, we will discuss recent progress and open problems on the analogous problem for shallow ReLU$^k$ networks.
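In symbols (notation ours), the quantity under study is the worst-case error
$$\sup_{\|f\|_{W^s(L_q)} \le 1}\ \inf_{f_W \in \Sigma_W}\ \|f - f_W\|_{L_p},$$
where $\Sigma_W$ denotes the class of deep ReLU$^k$ networks with $W$ weights; the compact embedding condition $s/d > 1/q - 1/p$ guarantees that this error decays to zero as $W \to \infty$, and the talk determines the optimal rate of decay.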
Title: N-adaptive Ritz method: A neural network enhanced computational mechanics framework
Abstract:
Conventional finite element methods are known to be tedious in adaptive refinement due to their conformal regularity requirements. Meshfree methods, such as the Reproducing Kernel Particle Method, relax the discretization and approximation regularity requirements of finite elements; however, enrichment functions for adaptive refinement are not readily available in general applications. The fast-growing research and development in data science, machine learning, and artificial intelligence offer new opportunities for new paradigms in scientific computing. This talk presents a neural network (NN) enhancement of Galerkin solutions for weak and strong discontinuities, as well as for solution enhancement with a fixed discretization, called N-adaptivity. The flexibility and adaptivity of the NN function space are utilized to capture complex solution patterns that conventional Galerkin methods fail to capture. The NN enrichment is constructed by combining pre-trained NN blocks with an additional untrained NN block. The pre-trained NN blocks learn specific complex solution patterns during the offline stage, enabling efficient enrichment of the approximation space during the online stage through Ritz-type energy minimization. The NN enrichment is introduced under the Partition of Unity (PU) framework, ensuring convergence of the proposed method.
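Schematically (in notation of our own, not the authors'), the enriched approximation reads
$$u^h(x) = \sum_{I} \Psi_I(x)\, d_I + \sum_{J} \Phi_J(x)\, \mathcal{N}_J(x; \theta_J),$$
where $\Psi_I$ are the standard Galerkin/meshfree shape functions, $\Phi_J$ form a partition of unity localizing the NN blocks $\mathcal{N}_J$, the pre-trained blocks are kept frozen, and the remaining coefficients $d_I$ and parameters $\theta_J$ of the untrained block are obtained from the Ritz-type minimization $\min_{d,\theta} \Pi(u^h)$ of the potential energy $\Pi$.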
Short Bio: J. S. Chen is the William Prager Chair Professor and Distinguished Professor in the Departments of Structural Engineering and Mechanical & Aerospace Engineering, and the Founding Director of the Center for Extreme Events Research, at UC San Diego. Before joining UCSD in 2013, he was the Chancellor’s Professor in the Civil & Environmental Engineering, Mechanical & Aerospace Engineering, and Mathematics Departments at UCLA, where he served as Chair of Civil & Environmental Engineering from 2007 to 2012. J. S. Chen's research is in computational mechanics and multiscale materials modeling, with a specialization in the development of meshfree methods. He is a Past President of the US Association for Computational Mechanics (USACM) and of the ASCE Engineering Mechanics Institute (EMI). He has received numerous awards, including the Computational Mechanics Award from the International Association for Computational Mechanics (IACM), the Grand Prize from the Japan Society for Computational Engineering and Science (JSCES), the Ted Belytschko Applied Mechanics Award from the ASME Applied Mechanics Division, the Belytschko Medal from the U.S. Association for Computational Mechanics (USACM), the Computational Mechanics Award from the Japan Association for Computational Mechanics (JACM), and the ICACM Award from the International Chinese Association for Computational Mechanics (ICACM), among others. He is a Fellow of USACM, IACM, ASME, EMI, SES, ICACM, and ICCEES. He received his BS in Civil Engineering from National Central University, Taiwan, and his MS and PhD in Theoretical & Applied Mechanics from Northwestern University.
Title: Robust Model Discovery with SINDy and Ensemble Learning
Abstract:
The sparse identification of nonlinear dynamics (SINDy) algorithm can identify dynamical system models purely from data. In this talk, I will present recent work extending the SINDy algorithm with ensemble learning to identify interpretable and generalizable models in the low-data, high-noise limit. We apply the ensemble-SINDy (E-SINDy) algorithm to a range of challenging synthetic and real-world data sets and demonstrate substantial improvements in the accuracy and robustness of model discovery from noisy and limited data. E-SINDy is computationally efficient, scaling similarly to standard SINDy. We show that E-SINDy can deliver efficient uncertainty estimation and probabilistic forecasts at far lower cost than Bayesian uncertainty quantification methods based on MCMC. Finally, we show that ensemble statistics from E-SINDy can be used for active learning and improved model predictive control.
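To illustrate the core mechanics (a minimal sketch under simplifying assumptions, not the PySINDy implementation from the references below): SINDy seeks a sparse coefficient matrix $\Xi$ with $\dot{X} \approx \Theta(X)\,\Xi$ for a library $\Theta$ of candidate terms, and ensemble SINDy fits many such models on bootstrap resamples of the data and aggregates them. The Lorenz test data, library choice, threshold, and ensemble size are illustrative assumptions.

import jax
import jax.numpy as jnp

def lorenz(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return jnp.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

# Toy trajectory by explicit Euler; derivatives taken exactly for simplicity.
dt, n = 0.002, 2000
states = [jnp.array([1.0, 1.0, 1.0])]
for _ in range(n - 1):
    states.append(states[-1] + dt * lorenz(states[-1]))
X = jnp.stack(states)
dX = jax.vmap(lorenz)(X)

def library(X):
    # Candidate terms: 1, x, y, z, xy, xz, yz, x^2, y^2, z^2.
    x, y, z = X[:, 0], X[:, 1], X[:, 2]
    return jnp.stack([jnp.ones_like(x), x, y, z,
                      x * y, x * z, y * z, x**2, y**2, z**2], axis=1)

def stlsq(Theta, dX, threshold=0.5, n_iter=10):
    # Sequentially thresholded least squares, the core SINDy regression.
    Xi = jnp.linalg.lstsq(Theta, dX)[0]
    for _ in range(n_iter):
        mask = jnp.abs(Xi) >= threshold
        # Zeroed-out library columns receive zero coefficients because
        # lstsq returns the minimum-norm solution.
        Xi = jnp.stack([jnp.linalg.lstsq(Theta * mask[:, j], dX[:, j])[0]
                        for j in range(dX.shape[1])], axis=1)
    return Xi

def ensemble_sindy(key, X, dX, n_models=20):
    # Bagging: refit SINDy on bootstrap resamples and aggregate.
    Theta = library(X)
    coefs = []
    for k in jax.random.split(key, n_models):
        idx = jax.random.choice(k, X.shape[0], (X.shape[0],), replace=True)
        coefs.append(stlsq(Theta[idx], dX[idx]))
    coefs = jnp.stack(coefs)
    # Median coefficients and per-term inclusion probabilities.
    return jnp.median(coefs, axis=0), jnp.mean(coefs != 0.0, axis=0)

Xi, inclusion = ensemble_sindy(jax.random.PRNGKey(0), X, dX)
print(Xi)  # rows ~ library terms; columns ~ dx/dt, dy/dt, dz/dt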
References
U. Fasel, J. N. Kutz, B. W. Brunton, S. L. Brunton (2022) Ensemble-SINDy: Robust sparse model discovery in the low-data, high-noise limit, with active learning and control. Proceedings of the Royal Society A
U. Fasel, J. N. Kutz, B. W. Brunton, S. L. Brunton (2021) SINDy with control: A tutorial. IEEE CDC
L. M. Gao, U. Fasel, S. L. Brunton, J. N. Kutz (2023) Convergence of uncertainty estimates in Ensemble and Bayesian sparse model discovery. arXiv:2301.12649
A. A. Kaptanoglu et al. (2021) PySINDy: A comprehensive Python package for robust sparse system identification. The Journal of Open Source Software