Program (2020)

Invited Speakers

  • Lexing Ying (Stanford University), "Solving inverse problems with deep learning" [slides]


This talk is about some recent progress on solving inverse problems using deep learning. Compared to traditional machine learning problems, inverse problems are often limited by the size of the training data set. We show how to overcome this issue by incorporating mathematical analysis and physics into the design of neural network architectures. We first describe neural network representations of pseudodifferential operators and Fourier integral operators. We then continue to discuss applications including electric impedance tomography, optical tomography, inverse acoustic/EM scattering, seismic imaging, and travel-time tomography.

  • Paris Perdikaris (University of Pennsylvania), "Understanding and mitigating gradient flow pathologies in physics-informed neural networks" [slides]


The widespread use of neural networks across different scientific domains often involves constraining them to satisfy certain symmetries, conservation laws, or other domain knowledge. Such constraints are often imposed as soft penalties during model training and effectively act as domain-specific regularizers of the empirical risk loss. Physics-informed neural networks are an example of this philosophy, in which the outputs of deep neural networks are constrained to approximately satisfy a given set of partial differential equations. In this work we review recent advances in scientific machine learning, with a specific focus on the effectiveness of physics-informed neural networks in predicting outcomes of physical systems and discovering hidden physics from noisy data. We also identify and analyze a fundamental mode of failure of such approaches, related to numerical stiffness in the gradient flow dynamics that leads to unbalanced back-propagated gradients during model training via gradient descent. To address this limitation, we present a learning rate annealing algorithm that utilizes gradient statistics during model training to balance the interplay between different terms in composite loss functions. We also propose a novel neural network architecture that is more resilient to such gradient pathologies. Taken together, our developments provide new insights into the training of constrained neural networks and consistently improve the predictive accuracy of physics-informed neural networks by a factor of 50-100 across a range of problems in computational physics.
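The gradient-statistics idea behind the learning rate annealing described above can be sketched in a few lines: rescale the boundary/data loss so its back-propagated gradients are on the same scale as the PDE-residual gradients. This is an illustrative reconstruction, not the authors' exact algorithm; the function name, the max/mean statistics, and the moving-average update are our assumptions.

```python
# Illustrative sketch: balance composite-loss terms using gradient statistics.
# All names and the moving-average form are assumptions for illustration only.

def annealed_weight(residual_grads, boundary_grads, prev_weight, alpha=0.9):
    """Return an updated weight for the boundary/data loss term.

    residual_grads / boundary_grads: back-propagated gradient components of the
    PDE-residual term and the boundary term of a composite loss.
    prev_weight: weight from the previous training step.
    alpha: moving-average coefficient (illustrative default).
    """
    max_residual = max(abs(g) for g in residual_grads)  # stiffest residual direction
    mean_boundary = sum(abs(g) for g in boundary_grads) / len(boundary_grads)
    instantaneous = max_residual / mean_boundary        # instantaneous balancing ratio
    # Smooth with a moving average so the weight does not oscillate step to step.
    return alpha * prev_weight + (1.0 - alpha) * instantaneous
```

In a training loop, the returned weight would multiply the boundary/data term of the loss at the next gradient-descent step.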


Bio:

Paris Perdikaris is an Assistant Professor in the Department of Mechanical Engineering and Applied Mechanics at the University of Pennsylvania. His work spans a wide range of areas in computational science and engineering, with a particular focus on the analysis and design of complex physical and biological systems using machine learning, stochastic modeling, computational mechanics, and high-performance computing. Current research thrusts include physics-informed machine learning, uncertainty quantification in deep learning, engineering design optimization, and data-driven non-invasive medical diagnostics.


  • Maziar Raissi (University of Colorado, Boulder), "Hidden Physics Models: Machine Learning of Nonlinear Partial Differential Equations" [slides]


A grand challenge with great opportunities is to develop a coherent framework that enables blending conservation laws, physical principles, and/or phenomenological behaviors expressed by differential equations with the vast data sets available in many fields of engineering, science, and technology. At the intersection of probabilistic machine learning, deep learning, and scientific computations, this work is pursuing the overall vision to establish promising new directions for harnessing the long-standing developments of classical methods in applied mathematics and mathematical physics to design learning machines with the ability to operate in complex domains without requiring large quantities of data. To materialize this vision, this work is exploring two complementary directions: (1) designing data-efficient learning machines capable of leveraging the underlying laws of physics, expressed by time dependent and non-linear differential equations, to extract patterns from high-dimensional data generated from experiments, and (2) designing novel numerical algorithms that can seamlessly blend equations and noisy multi-fidelity data, infer latent quantities of interest (e.g., the solution to a differential equation), and naturally quantify uncertainty in computations. The latter is aligned in spirit with the emerging field of probabilistic numerics.

  • Marco Pavone (Stanford University), "On Safe and Efficient Human-robot Interactions via Multimodal Intent Modeling and Reachability-based Safety Assurance"

In this talk I will present a decision-making and control stack for human-robot interactions by using autonomous driving as a motivating example. Specifically, I will first discuss a data-driven approach for learning multimodal interaction dynamics between robot-driven and human-driven vehicles based on recent advances in deep generative modeling. Then, I will discuss how to incorporate such a learned interaction model into a real-time, interaction-aware decision-making framework. The framework is designed to be minimally interventional; in particular, by leveraging backward reachability analysis, it ensures safety even when other cars defy the robot's expectations without unduly sacrificing performance. I will present recent results from experiments on a full-scale steer-by-wire platform, validating the framework and providing practical insights. I will conclude the talk by providing an overview of related efforts from my group on infusing safety assurances in robot autonomy stacks equipped with learning-based components, with an emphasis on adding structure within robot learning via control-theoretical and formal methods.


  • Stefano Ermon (Stanford University), "Bayesian Optimization and Machine Learning for Accelerating Experiments in the Physical Sciences"

Abstract: TBA


  • Kevin Carlberg (University of Washington), "Nonlinear model reduction: using machine learning to enable rapid simulation of extreme-scale physics models"

Physics-based modeling and simulation has become indispensable across many applications in science and engineering, ranging from autonomous-vehicle control to designing new materials. However, achieving high predictive fidelity necessitates modeling fine spatiotemporal resolution, which can lead to extreme-scale computational models whose simulations consume months on thousands of computing cores. This constitutes a formidable computational barrier: the cost of truly high-fidelity simulations renders them impractical for important time-critical applications (e.g., rapid design, control, real-time simulation) in engineering and science. In this talk, I will present several advances in the field of nonlinear model reduction that leverage machine-learning techniques ranging from convolutional autoencoders to LSTM networks to overcome this barrier. In particular, these methods produce low-dimensional counterparts to high-fidelity models called reduced-order models (ROMs) that exhibit 1) accuracy, 2) low cost, 3) physical-property preservation, 4) guaranteed generalization performance, and 5) error quantification.


Accepted Papers

The final versions of the extended abstracts and short papers will be published in the open-access CEUR Workshop Proceedings (http://ceur-ws.org/).

A. Collins et al., A 2D Fully Convolutional Neural Network for Nearshore and Surf-Zone Bathymetry Inversion from Synthetic Imagery of The Surf-Zone using the Model Celeris [slides]


Adam Collins1, Katherine L. Brodie2, Spicer Bak2, Tyler Hesser3, Matthew W. Farthing3, Douglas W. Gamble1, and Joseph W. Long1

1. University of North Carolina at Wilmington, Earth and Ocean Sciences, Wilmington, NC
2. U.S. Army Engineer Research and Development Center, Coastal and Hydraulics Laboratory, Duck, NC
3. U.S. Army Engineer Research and Development Center, Coastal and Hydraulics Laboratory, Vicksburg, MS

Bathymetry has a first-order impact on nearshore and surfzone hydrodynamics. Typical survey techniques are expensive and time-consuming, require specialized equipment, and are not feasible in a variety of situations (e.g., limited manpower and/or site access). However, the emergence of nearshore remote sensing platforms (e.g., Unmanned Aircraft Systems (UAS), towers, and satellites), from which high-resolution imagery of the sea surface can be collected at frequent intervals, has created the potential for accurate bathymetric estimation from wave-inversion techniques without in-situ measurements. While a variety of physics-based algorithms have been applied to nearshore and surfzone bathymetric inversion problems, the commonly used approaches do not account for the non-linear hydrodynamics that are prevalent during wave breaking. Models for estimating non-linear wave dynamics are slow and often require large amounts of computational power, making them infeasible for rapid depth estimation. Fully convolutional neural networks (FCNs) are a class of artificial intelligence algorithms that have proven effective at computer vision tasks such as semantic segmentation and regression. In this work, we consider the use of FCNs for inferring bathymetry from video-derived imagery. The FCN model presented shows the feasibility of using an AI system to perform bathymetric inversion on time-averaged images (timex) of realistic-looking, synthetically generated surfzone imagery from the hydrodynamic wave model Celeris (Tavakkol and Lynett 2017). Ongoing work includes extending the FCN to incorporate synthetic video frames as input as well as testing with actual tower and satellite imagery.


M. D'Elia et al., Nonlocal Physics-Informed Neural Networks - A Unified Theoretical and Computational Framework for Nonlocal Models [slides]


Marta D'Elia1, George E. Karniadakis2, Guofei Pang2, and Michael L. Parks1

1. Center for Computing Research, Sandia National Laboratories
2. Applied Mathematics Department, Brown University

Nonlocal models provide an improved predictive capability thanks to their ability to capture effects that classical partial differential equations miss. Among these effects are multiscale behavior and anomalous behavior such as super- and sub-diffusion. These models have become increasingly popular for a broad range of applications, including mechanics, subsurface flow, turbulence, plasma dynamics, heat conduction, and image processing. However, their improved accuracy comes at the price of many modeling and numerical challenges. In this work we focus on the estimation of model parameters, which are often unknown or subject to noise. In particular, we address the problem of model identification in the presence of sparse measurements. Our approach to this inverse problem combines (1) machine learning and physical principles with (2) a unified nonlocal vector calculus and versatile surrogates such as neural networks (NNs). The outcome is a flexible tool that allows us to learn existing and new nonlocal operators. We refer to our technique as nPINNs (nonlocal Physics-Informed Neural Networks): we model the nonlocal solution with an NN and solve an optimization problem that minimizes the residual of the nonlocal equation and the misfit with measured data. The outputs of the optimization are the weights and biases of the NN and the set of unknown model parameters.
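Schematically, the optimization just described can be written as one composite objective. The notation below is a sketch of ours, not the paper's: $\mathcal{L}_\theta$ is the parametrized nonlocal operator, $u_{NN}$ the network surrogate, $x_i$ collocation points, and $u_j$ the sparse measurements.

```latex
\min_{w,\,b,\,\theta}\;
\underbrace{\frac{1}{N_r}\sum_{i=1}^{N_r}\big|\mathcal{L}_\theta[u_{NN}](x_i)\big|^2}_{\text{nonlocal equation residual}}
\;+\;
\underbrace{\frac{1}{N_d}\sum_{j=1}^{N_d}\big|u_{NN}(x_j)-u_j\big|^2}_{\text{data misfit}}
```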


M. Di Giovanni et al., Finding Multiple Solutions of ODEs with Neural Networks

Marco Di Giovanni1, David Sondak2, Pavlos Protopapas2, and Marco Brambilla1

1. Politecnico di Milano, Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB)
2. Institute for Applied Computational Science, Harvard University

Applications of neural networks to numerical problems have gained increasing interest. Among different tasks, finding solutions of ordinary differential equations (ODEs) is one of the most intriguing problems, as there may be advantages over well-established and robust classical approaches. In this paper, we propose an algorithm that uses artificial neural networks to find all possible solutions of an ordinary differential equation that admits multiple solutions. The key idea is the introduction of a new loss term that we call the interaction loss. The interaction loss prevents different solution families from coinciding. We carried out experiments with two nonlinear differential equations, both admitting more than one solution, to show the effectiveness of our algorithm, and we performed a sensitivity analysis to investigate the impact of different hyper-parameters.
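The abstract does not give the form of the interaction loss, but one plausible sketch (our notation: $u_k$ is the network approximating the $k$-th solution family, $f$ the ODE right-hand side, $\lambda$ a weight) combines per-family residuals with a repulsion term that is large whenever two candidate solutions coincide:

```latex
L \;=\; \sum_{k} \underbrace{\big\| \dot u_k - f(t, u_k) \big\|^2}_{\text{ODE residual, family } k}
\;+\; \lambda \underbrace{\sum_{k < l} \exp\!\big(-\| u_k - u_l \|^2\big)}_{\text{interaction loss}}
```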


A. H. Hosseinloo and M. Dahleh, Event-Triggered Reinforcement Learning: An Application to Buildings’ Micro-Climate Control

Ashkan Haji Hosseinloo and Munther Dahleh

Laboratory for Information and Decision Systems, MIT, USA

Smart buildings have great potential for shaping an energy-efficient, sustainable, and more economical future for our planet, as buildings account for approximately 40% of global energy consumption. However, most learning methods for micro-climate control in buildings are based on Markov Decision Processes with fixed transition times, which suffer from high variance in the learning phase. Furthermore, ignoring its continuing-task nature, the micro-climate control problem is often modeled and solved as an episodic-task problem with discounted rewards, which can yield an incorrect optimization solution. To overcome these issues, we propose an event-triggered learning control and formulate it using Semi-Markov Decision Processes with variable transition times in an average-reward setting. We show via simulation the efficacy of our approach in controlling the micro-climate of a single-zone building.
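To make the modeling distinction concrete: in an average-reward semi-Markov formulation with variable transition times $\tau_k$ and per-transition rewards $R_k$ (our notation, not the paper's), the controller maximizes the long-run reward rate rather than a discounted episodic return:

```latex
\rho(\pi) \;=\; \lim_{N\to\infty}
\frac{\mathbb{E}_\pi\!\left[\sum_{k=0}^{N-1} R_k\right]}
     {\mathbb{E}_\pi\!\left[\sum_{k=0}^{N-1} \tau_k\right]}
```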


L. Lu et al., DeepXDE: A Deep Learning Library for Solving Differential Equations [slides]


Lu Lu, Xuhui Meng, Zhiping Mao and George Em Karniadakis

Division of Applied Mathematics, Brown University

Deep learning has achieved remarkable success in diverse applications; however, its use in solving partial differential equations (PDEs) has emerged only recently. Here, we present an overview of physics-informed neural networks (PINNs), which embed a PDE into the loss of the neural network using automatic differentiation. PINNs solve inverse problems similarly to forward problems. We also present DeepXDE, a Python library for PINNs. DeepXDE supports complex-geometry domains based on the technique of constructive solid geometry, and enables user code to be compact, closely resembling the mathematical formulation. We introduce the usage of DeepXDE, and we demonstrate the capability of PINNs and the user-friendliness of DeepXDE on two different examples.
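Schematically, the PINN loss assembled for a PDE of the form $u_t + \mathcal{N}[u] = 0$ combines a residual term at collocation points, computed with automatic differentiation, and a data term on boundary/initial points. The notation is ours and the weights $w_f, w_b$ are illustrative:

```latex
L(\theta) \;=\;
\frac{w_f}{N_f}\sum_{i}\big|\partial_t u_\theta(x_i,t_i) + \mathcal{N}[u_\theta](x_i,t_i)\big|^2
\;+\;
\frac{w_b}{N_b}\sum_{j}\big|u_\theta(x_j,t_j) - u_j\big|^2
```

The first term enforces the PDE at collocation points; the second fits boundary, initial, or observed data.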


L. Ma et al., A Weighted Sparse-Input Neural Network Technique Applied to Identify Important Features for Vortex-Induced Vibration [slides]


Leixin Ma, Themistocles L. Resvanis, J. Kim Vandiver

Department of Mechanical Engineering, Massachusetts Institute of Technology

Flow-induced vibration depends on a large number of parameters or features. On the one hand, the number of candidate physical features may be too large to construct an interpretable and transferable model. On the other hand, failure to account for key dependencies among features may oversimplify the model. Feature selection can reduce the dimension of the physical problem by identifying the most important features for a given prediction task. In this paper, a weighted sparse-input neural network (WSPINN) is proposed, in which prior physical knowledge is leveraged to constrain the neural network optimization. The effectiveness of this approach is evaluated on the vortex-induced vibration of a long flexible cylinder at Reynolds numbers from 10^4 to 10^5. The important physical features affecting the flexible cylinder’s crossflow vibration amplitude are identified.


A. Mehta et al., Physics-Informed Spatiotemporal Deep Learning for Emulating Coupled Dynamical Systems [slides]


Anishi Mehta1,3, Cory Scott2,3, Diane Oyen3, Nishant Panda3, and Gowri Srinivasan3

1. Georgia Institute of Technology
2. University of California-Irvine
3. Los Alamos National Laboratory

Accurately predicting the propagation of fractures, or cracks, in brittle materials is an important problem in evaluating the reliability of objects such as airplane wings and concrete structures. Efficient crack propagation emulators that can run in a fraction of the time of high-fidelity physics simulations are needed. A primary challenge in modeling fracture networks and the stress propagation in materials is that the cracks themselves introduce discontinuities, making existing partial differential equation (PDE) discovery models unusable. Furthermore, existing physics-informed neural networks are limited to learning PDEs with either constant initial conditions or changes that do not depend on the PDE outputs at the previous time step. In fracture propagation, at each time step there is a damage field and a stress field, where the stress causes further damage in the material. The stress field at the next time step is affected by the discontinuities introduced by the propagated damage. Thus, the stress and damage fields are heavily dependent on each other, which makes modeling the system difficult. Spatiotemporal LSTMs have shown promise in real-world video prediction. Building on this success, we approach this physics emulation problem as a video generation problem: training the model on simulation data to learn the underlying dynamic behavior. Our novel deep learning model is a Physics-Informed Spatiotemporal LSTM that uses modified loss functions and partial derivatives from the stress field to build a data-driven coupled-dynamics emulator. Our approach outperforms other neural network architectures at predicting subsequent frames of a simulation, enabling fast and accurate emulation of fracture propagation.


M. Mudunuru et al., Physics-Informed Machine Learning for Real-time Reservoir Management [slides]


Maruti K. Mudunuru1, Daniel O’Malley1, Shriram Srinivasan1, Jeffrey D. Hyman1, Matthew R. Sweeney1, Luke Frash1, Bill Carey1, Michael R. Gross1, Nathan J. Welch1, Satish Karra1, Velimir V. Vesselinov1, Qinjun Kang1, Hongwu Xu1, Rajesh J. Pawar1, Tim Carr2, Liwei Li2, George D. Guthrie1, and Hari S. Viswanathan1

1. Earth and Environmental Sciences Division, Los Alamos National Laboratory, Los Alamos
2. Department of Geology & Geography, West Virginia University, Morgantown, WV

We present a physics-informed machine learning (PIML) workflow for real-time unconventional reservoir management. Reduced-order physics and high-fidelity physics model simulations, lab-scale and sparse field-scale data, and machine learning (ML) models are developed and combined for real-time forecasting through this PIML workflow. These forecasts include total cumulative production (e.g., gas, water), production rate, stage-specific production, and the spatial evolution of quantities of interest (e.g., residual gas, reservoir pressure, temperature, stress fields). The proposed PIML workflow consists of three key ingredients: (1) site behavior libraries based on fast and accurate physics, (2) ML-based inverse models to refine key site parameters, and (3) a fast forward model that combines physical models and ML to forecast production and reservoir conditions. First, synthetic production data from multi-fidelity physics models are integrated to develop the site behavior library. Second, ML-based inverse models are developed to infer site conditions and enable the forecasting of production behavior. Our preliminary results show that the ML models developed with the PIML workflow provide good quantitative predictions (R^2-scores above 0.9). In terms of computational cost, the proposed ML models are O(10^4) to O(10^7) times faster than running a high-fidelity physics model simulation for evaluating the quantities of interest (e.g., gas production). This low computational cost makes the proposed ML models attractive for real-time history matching and forecasting at shale-gas sites (e.g., MSEEL, the Marcellus Shale Energy and Environmental Laboratory), as they are significantly faster yet provide accurate predictions.


A. Nikolaev et al., Deep Learning for Climate Models of the Atlantic Ocean

Anton Nikolaev1, Ingo Richter2, Peter Sadowski1

1. Information and Computer Sciences, University of Hawai‘i at Mānoa
2. Japan Agency for Marine-Earth Science and Technology

A deep neural network is trained to predict sea surface temperature variations at two important regions of the Atlantic Ocean, using 800 years of simulated climate dynamics from first-principles physics models. The model is then tested against 60 years of historical data. Our statistical model learns to approximate the physical laws governing the simulation, providing a significant improvement over simple statistical forecasts and performance comparable to most state-of-the-art dynamical/conventional forecast models at a fraction of the computational cost.


Y. Qian et al., Surfzone Topography-informed Deep Learning Techniques to Nearshore Bathymetry with Sparse Measurements

Yizhou Qian1, Hojat Ghorbanidehno2, Matthew Farthing3, Ty Hesser3, Peter K. Kitanidis1,4, and Eric F. Darve1,5

1. Institute for Computational and Mathematical Engineering, Stanford University
2. Cisco Systems
3. US Army Engineer Research and Development Center, Vicksburg, MS
4. Department of Civil and Environmental Engineering, Stanford University
5. Department of Mechanical Engineering, Stanford University


Nearshore bathymetry, the knowledge of water depth in coastal zones, has played a vital role in a wide variety of applications including shipping operations, coastal management, and risk assessment. However, direct high-resolution surveys of nearshore bathymetry are relatively difficult to perform due to budget constraints and logistical restrictions. One approach that avoids these limitations is spatial interpolation from sparse measurements of water depth, using, for example, geostatistics. However, traditional methods often struggle to recognize sharp-gradient patterns, such as those that appear on coastal sand bars, especially when measurements are sparse. In this work, we use a conditional Generative Adversarial Network (cGAN) to generate abruptly changing bathymetry samples that remain consistent with our sparse, multi-scale measurements. We train our neural network on synthetic data generated from nearshore surveys provided by the U.S. Army Corps of Engineers Field Research Facility (FRF) in Duck, North Carolina. We compare our method with Kriging on real surveys as well as on surveys with artificially added sharp-gradient patterns. Results show that our conditional Generative Adversarial Network provides estimates with lower root mean squared errors than Kriging in both cases.


B. Quach et al., Deep Sensing of Ocean Wave Heights with Synthetic Aperture Radar

Brandon Quach1,2, Yannik Glaser2, Justin Stopa3, and Peter Sadowski2

1. Computing and Mathematical Sciences, California Institute of Technology
2. Information and Computer Sciences, University of Hawaii at Manoa
3. Ocean Resources and Engineering, University of Hawaii at Manoa

The Sentinel-1 satellites equipped with synthetic aperture radars (SAR) provide near global coverage of the world’s oceans every six days. We curate a data set of co-locations between SAR and altimeter satellites, and investigate the use of deep learning to predict significant wave height from SAR. While previous models for predicting geophysical quantities from SAR rely heavily on feature-engineering, our approach learns directly from low-level image cross-spectra. Training on co-locations from 2015-2017, we demonstrate on test data from 2018 that deep learning reduces the state-of-the-art root mean squared error by 50%, from 0.6 meters to 0.3 meters.


C. Rackauckas et al., Generalized Physics-Informed Learning through Language-Wide Differentiable Programming [slides]


Chris Rackauckas1,2, Alan Edelman1,3, Keno Fischer3, Mike Innes3, Elliot Saba3, Viral B. Shah3, and Will Tebbutt4

1. Massachusetts Institute of Technology
2. University of Maryland, Baltimore
3. Julia Computing
4. University of Cambridge

Scientific computing is increasingly incorporating advances in machine learning to allow for data-driven, physics-informed modeling approaches. However, re-targeting existing scientific computing workloads to machine learning frameworks is both costly and limiting, as scientific simulations tend to use the full feature set of a general-purpose programming language. In this manuscript we develop an infrastructure for incorporating deep learning into existing scientific computing code through Differentiable Programming (∂P). We describe a ∂P system that is able to take gradients of full Julia programs, making automatic differentiation a first-class language feature and compatibility with deep learning pervasive. Our system utilizes the one-language nature of Julia package development to augment the existing package ecosystem with deep learning, supporting almost all language constructs (control flow, recursion, mutation, etc.) while generating high-performance code without requiring any user intervention or refactoring to stage computations. We showcase several examples of physics-informed learning that directly utilize this extension to existing simulation code: neural surrogate models, machine learning on simulated quantum hardware, and data-driven discovery of stochastic dynamical models with neural stochastic differential equations.

P. Stinis, Enforcing Constraints for Time Series Prediction in Supervised, Unsupervised and Reinforcement Learning [slides]


Panos Stinis

Advanced Computing, Mathematics and Data Division, Pacific Northwest National Laboratory

We assume that we are given a time series of data from a dynamical system and our task is to learn the flow map of the dynamical system. We present a collection of results on how to enforce constraints coming from the dynamical system in order to accelerate the training of deep neural networks to represent the flow map of the system as well as increase their predictive ability. In particular, we provide ways to enforce constraints during training for all three major modes of learning, namely supervised, unsupervised and reinforcement learning. In general, the dynamic constraints need to include terms which are analogous to memory terms in model reduction formalisms. Such memory terms act as a restoring force which corrects the errors committed by the learned flow map during prediction.


For supervised learning, the constraints are added to the objective function. For the case of unsupervised learning, in particular generative adversarial networks, the constraints are introduced by augmenting the input of the discriminator. Finally, for the case of reinforcement learning and in particular actor-critic methods, the constraints are added to the reward function. In addition, for the reinforcement learning case, we present a novel approach based on homotopy of the action-value function in order to stabilize and accelerate training. We use numerical results for the Lorenz system to illustrate the various constructions.
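For the supervised case, the construction amounts to augmenting the flow-map regression objective with constraint residuals. A hedged sketch (our notation): $F_\theta$ is the learned flow map, $u_n$ the time-series samples, $C$ the dynamical constraints (including the memory-like correction terms), and $\lambda$ a penalty weight:

```latex
L(\theta) \;=\; \sum_n \big\| u_{n+1} - F_\theta(u_n) \big\|^2
\;+\; \lambda \sum_n \big| C\big(F_\theta(u_n)\big) \big|^2
```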


M. Tavakoli and P. Baldi, Continuous Representation of Molecules using Graph Variational Autoencoder [slides]


Mohammadamin Tavakoli and Pierre Baldi

Department of Computer Science, University of California, Irvine

To represent molecules continuously, we propose a generative model in the form of a VAE that operates on the 2D graph structure of molecules. A side predictor is employed to prune the latent space and help the decoder generate meaningful adjacency tensors of molecules. Beyond its potential applicability to drug design and property prediction, we show the superior performance of this technique in comparison to similar methods based on the SMILES representation of molecules with RNN-based encoders and decoders.


N. Trask et al., GMLS-Nets: A Machine Learning Framework for Unstructured Data [slides]


Nathaniel Trask1, Ravi Patel1, Paul Atzberger2 and Ben Gross2

1. Center for Computing Research, Sandia National Laboratories
2. University of California Santa Barbara

Data fields sampled on irregularly spaced points arise in many applications in the sciences and engineering. For regular grids, Convolutional Neural Networks (CNNs) have been used successfully to gain benefits from weight sharing and invariances. We generalize CNNs by introducing methods for data on unstructured point clouds based on Generalized Moving Least Squares (GMLS). GMLS is a non-parametric meshfree technique for estimating bounded linear functionals from scattered data, and has recently emerged as an effective technique for solving partial differential equations. By parameterizing the GMLS estimator, we obtain learning methods for linear and non-linear operators with unstructured stencils. In GMLS-Nets the necessary calculations are local and readily parallelizable, and the estimator is supported by a rigorous approximation theory. We show how the framework may be used on unstructured physical data sets to perform functional regression, identify associated differential operators, develop predictive dynamical models, and obtain feature extractors for predicting quantities of interest. The results show the promise of these architectures as foundations for data-driven model development in scientific machine learning applications.
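The underlying GMLS estimator can be summarized as follows (standard GMLS notation, not taken from the paper): near a point $x$, scattered samples $u(x_i)$ are fit by a weighted local polynomial, and the target functional $\tau$ is applied to the fit; GMLS-Nets make the parametrization of $\tau$ learnable:

```latex
p^*_x \;=\; \operatorname*{arg\,min}_{p \in \pi_m}\;
\sum_{i} w(\|x_i - x\|)\,\big(u(x_i) - p(x_i)\big)^2,
\qquad
\tau[u](x) \;\approx\; \tau\big[p^*_x\big](x)
```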


H. Yoon et al., Permeability Prediction of Porous Media using Convolutional Neural Networks with Physical Properties [slides]


Hongkyu Yoon1, Darryl Melander2, and Stephen J. Verzi2

1. Geomechanics Department, Sandia National Laboratories, Albuquerque
2. Complex System for National Security, Sandia National Laboratories, Albuquerque

Permeability prediction for porous media systems is important in many engineering and science domains, including earth materials, bio- and solid materials, and energy applications. In this work we evaluated how machine learning can be used to predict the permeability of porous media from physical properties. An emerging challenge for machine learning/deep learning in engineering and scientific research is the ability to incorporate physics into the machine learning process. We used convolutional neural networks (CNNs) trained on a set of images of bead packings, with additional physical properties of the porous media, such as porosity and surface area, supplied as training data either by feeding them directly to the fully connected network or through a multilayer perceptron network. Our results clearly show that the optimal neural network architecture and the implementation of physics-informed constraints are important for improving the model's prediction of permeability. A comprehensive analysis of hyperparameters with different CNN architectures and of the implementation scheme for the physical properties needs to be performed to optimize our learning system for various porous media systems.


K. Xu and E. Darve, Data-Driven Inverse Modeling with Incomplete Observations [slides]


Kailai Xu1 and Eric Darve1,2

1. Institute for Computational and Mathematical Engineering, Stanford University
2. Mechanical Engineering, Stanford University

Deep neural networks (DNNs) have been used to model nonlinear relations between physical quantities. These DNNs are embedded in physical systems described by partial differential equations (PDEs) and trained by minimizing a loss function that measures the discrepancy between predictions and observations in some chosen norm. When only sparse observations are available, this loss function often includes the PDE constraints as a penalty term; as a result, the PDE is only satisfied approximately by the solution, and the penalty term typically slows down the convergence of the optimizer for stiff problems. We present a new approach that trains the embedded DNNs while numerically satisfying the PDE constraints. We develop an algorithm that enables differentiating both explicit and implicit numerical solvers in reverse-mode automatic differentiation, which allows the gradients of the DNNs and the PDE solvers to be computed in a unified framework. We demonstrate that our approach enjoys faster convergence and better stability on relatively stiff problems compared to the penalty method, and it has the potential to accelerate a wide range of data-driven inverse modeling problems in which the physical constraints are described by PDEs and need to be satisfied accurately.
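The contrast with the penalty method can be sketched as follows (our notation, with $F(u, NN_\theta) = 0$ the discretized PDE containing the embedded network): the penalty method relaxes the constraint into the objective, whereas the proposed approach differentiates through the numerical solver so the PDE holds at every iterate:

```latex
\text{penalty:}\quad
\min_{\theta,\,u}\; \|u - u_{\mathrm{obs}}\|^2 + \lambda\,\|F(u, NN_\theta)\|^2
\qquad\text{vs.}\qquad
\min_{\theta}\; \|u(\theta) - u_{\mathrm{obs}}\|^2,
\;\; u(\theta)\ \text{solving}\ F\big(u(\theta), NN_\theta\big) = 0
```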

Schedule


AAAIMLPS_schedule_virtual_final.pdf