Machine Learning for PDEs & Inverse Problems

Overview

Machine learning has recently transformed scientific computing, giving rise to the emerging field of scientific machine learning. Numerical algorithms for solving partial differential equations (PDEs), built on ideas and techniques adapted from artificial intelligence (AI) and data science, have greatly broadened the range of tractable problems, making high-dimensional settings feasible and offering new tools for traditional applications. Data-driven solvers for inverse problems can now be tailored to specific applications, with computational efficiency and accuracy optimized for the task at hand.

DNN Parametrization for PDE Solutions

Our goal is to develop systematic DNN-based solvers for nonlinear PDEs, including eigenvalue problems, in various dimensions, with the capacity to identify distinct solutions. Solving high-dimensional and nonlinear PDEs is a long-standing challenge in numerous applications (e.g., finance, control, quantum chemistry, and molecular dynamics for biological function). The main difficulties are the curse of dimensionality in discretizing PDEs and the nonlinearity that leads to spurious solutions. Although empirical successes of deep learning for this problem have been reported, fast algorithms and machine accuracy remain out of reach. Moreover, due to the implicit bias of SGD and DNNs, these solvers usually find only one solution even when the PDE admits several. These drawbacks motivate us to address these issues by incorporating physics, structure, and conventional numerical algorithms into deep learning.
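To make the idea of a DNN-based PDE solver concrete, below is a minimal sketch of the standard residual-minimization approach: a network u_theta parametrizes the solution and is trained to drive the PDE residual and boundary mismatch to zero at random collocation points. The toy 1D Poisson problem, the architecture, and all hyperparameters are illustrative assumptions, not the specific solvers developed in our papers.

```python
import torch
import torch.nn as nn

# Illustrative 1D Poisson problem: -u''(x) = f(x) on (0,1), u(0) = u(1) = 0,
# with f chosen so that u(x) = sin(pi x) is the exact solution.
f = lambda x: (torch.pi ** 2) * torch.sin(torch.pi * x)

# A small fully connected network u_theta(x) parametrizes the PDE solution.
net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))

def pde_residual(x):
    """Residual -u_theta''(x) - f(x), computed with automatic differentiation."""
    x = x.requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    return -d2u - f(x)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
boundary = torch.tensor([[0.0], [1.0]])
for step in range(5000):
    x = torch.rand(256, 1)                      # collocation points in (0,1)
    loss = pde_residual(x).pow(2).mean() \
         + 10.0 * net(boundary).pow(2).mean()   # penalize boundary mismatch
    opt.zero_grad()
    loss.backward()
    opt.step()
```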


In low dimensions, leveraging the power of deep learning, we developed efficient preconditioners for nonlinear PDEs [pdf], enabling super-convergence of conventional iterative methods; the proposed method works efficiently on problems where traditional Newton, Picard, and two-grid methods fail. In [pdf], we proposed structure-probing neural network deflation (NND), which makes deep learning capable of identifying the multiple solutions of nonlinear PDEs that are ubiquitous and important in nonlinear models: in low dimensions, NND identifies more solutions than traditional deflation techniques, and in high dimensions it remains efficient where traditional methods break down. In high dimensions, we introduced self-paced learning to PDE solvers, using a DNN to perform importance sampling automatically; this yields better robustness for solutions with irregularity [pdf] than existing algorithms across elliptic, hyperbolic, and parabolic PDEs in dimensions from 10 to 100. By combining Friedrichs' min-max framework with deep learning, we introduced a weak formulation into DNN-based PDE solvers [pdf], which can learn solutions of PDEs with discontinuities. More recently, motivated by spectral methods and wavelet analysis, we introduced advanced activation functions that lessen the spectral bias of DNNs, achieving high accuracy (1e-3 to 1e-7 relative errors) in identifying oscillatory solutions of PDEs [pdf].
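As a schematic illustration of the deflation idea behind NND, the sketch below multiplies the PDE residual loss by a factor that blows up near previously found solutions, so that retraining is pushed toward a new solution. The deflation factor uses the classical form 1/||u - u*||^p + alpha as a stand-in; the structure-probing construction in the paper is more elaborate, so this is an assumed simplification.

```python
import torch

def deflation_factor(u_vals, known_solutions, p=2.0, alpha=1.0):
    """Classical-style deflation factor (an assumed form, not the exact NND one):
    grows without bound as u_theta approaches a previously found solution."""
    factor = torch.ones(())
    for u_star in known_solutions:
        dist_sq = (u_vals - u_star).pow(2).mean()      # discrete L2 distance
        factor = factor * (1.0 / dist_sq.pow(p / 2) + alpha)
    return factor

def deflated_loss(residual_loss, u_vals, known_solutions):
    """PDE residual loss multiplied by the deflation factor, so minimizers
    near known solutions are penalized and distinct solutions can be found."""
    return deflation_factor(u_vals, known_solutions) * residual_loss
```

In use, after a first training run converges to some solution, its values at the collocation points are appended to known_solutions and training is restarted with the deflated loss.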

Operator Learning

Nonlinear operator learning aims to learn a map from a parametric function space to the solution space of certain partial differential equation (PDE) problems. It has become an important topic in many fields, with wide applications including model order reduction, parametric PDEs, inverse problems, and imaging. As deep neural networks (DNNs) have become state-of-the-art models in various machine learning tasks, their success has drawn attention to engineering problems, where PDEs have been the dominant model for decades; deep operator learning thus arises naturally as a powerful tool for learning nonlinear PDE operators. A typical method first discretizes the computational domain and represents each function as a vector tabulating its values on the mesh; a DNN is then trained to learn a map between the resulting finite-dimensional spaces. Although this approach has been successful in many applications, it is computationally expensive because it is mesh-dependent: the DNN must be retrained whenever a different discretization is used. To address this issue, we have proposed discretization-invariant operator learning based on integral autoencoders for solving (parametric) PDEs, inverse problems, and image/signal processing problems [pdf].
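The mesh-dependent "typical method" described above can be sketched in a few lines: each input and output function is tabulated on a fixed mesh, and a fully connected network maps one vector of samples to the other. The data, mesh size, and architecture below are placeholders for illustration; this is the baseline whose mesh dependence the integral-autoencoder approach is designed to remove.

```python
import torch
import torch.nn as nn

n_mesh = 128          # fixed discretization; the trained model is tied to it
model = nn.Sequential(nn.Linear(n_mesh, 512), nn.ReLU(),
                      nn.Linear(512, 512), nn.ReLU(),
                      nn.Linear(512, n_mesh))

# Training pairs (a_i, u_i): a_i tabulates a PDE coefficient/source on the mesh,
# u_i tabulates the corresponding solution (random placeholders here for data
# that would come from a conventional solver).
a = torch.randn(1000, n_mesh)
u = torch.randn(1000, n_mesh)

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(100):
    pred = model(a)
    loss = (pred - u).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Mesh dependence: inputs sampled on a different mesh (say 256 points) cannot
# be fed to this model without retraining, which is exactly what
# discretization-invariant operator learning removes.
```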

Despite the empirical success of deep operator learning in many applications, its statistical learning theory is very limited, especially when the ambient spaces are infinite-dimensional. Broadly, the learning theory consists of three parts: approximation theory, which quantifies the expressiveness of various DNN architectures as surrogates for a class of operators; optimization theory, which analyzes optimization schemes and the non-convex nature of the training problem; and generalization theory, which assesses the discrepancy incurred when only finitely many training samples are available. We aim to investigate why deep operator learning lessens the curse of dimensionality (CoD) for PDE-related problems. The key observation is that most PDE operators admit a structure of compositions of linear transformations and entrywise nonlinear transformations of a few inputs, a structure that DNNs can learn efficiently thanks to their pointwise evaluation. We provide an error analysis of the approximation and generalization errors and apply it to various PDE problems to reveal to what extent the CoD can be mitigated [pdf, pdf].
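Schematically, the error analysis separates the total error of a trained operator network into an approximation part and a generalization part. The display below is the generic decomposition from statistical learning, written in assumed notation (Gamma is the target operator, F the network class, n the number of training samples, Comp(F) a complexity measure of F); it is not the precise bound proved in the cited papers.

```latex
% Generic error decomposition for an operator network \widehat{\Gamma}
% trained on n samples from a distribution \mu over input functions a.
\mathbb{E}_{a\sim\mu}\,\big\|\Gamma(a)-\widehat{\Gamma}(a)\big\|^{2}
\;\lesssim\;
\underbrace{\inf_{\Gamma_{\theta}\in\mathcal{F}}
  \mathbb{E}_{a\sim\mu}\,\big\|\Gamma(a)-\Gamma_{\theta}(a)\big\|^{2}}_{\text{approximation error}}
\;+\;
\underbrace{O\!\Big(\sqrt{\mathrm{Comp}(\mathcal{F})/n}\Big)}_{\text{generalization error}}
```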

[Figure: Integral Autoencoder Network]

[Figure: Operator Learning Network]