Physics Informed Machine Learning for Data-Driven Discovery of Reduced Lagrangian Models of Turbulence

Turbulence in fluids is a ubiquitous phenomenon, and obtaining efficient and accurate reduced-order models remains an active research topic due to its many potential impacts on science and technology. Fluid turbulence is characterized by strong coupling across a broad range of scales and currently requires advanced numerical methods for obtaining approximate solutions. However, the computational cost of explicitly resolving the full set of scales (spatial and temporal) with Direct Numerical Simulation (DNS) of the high-Reynolds-number Navier-Stokes equations is extremely high and is often prohibitive in applications. This motivates the broader goal of this work, which is primarily focused on developing data-driven techniques for discovering a reduced description of turbulent flows characterized by a shorter range of resolved scales (affordable computationally) combined with the ability to model subgrid scales. While popular reduced models, such as Large Eddy Simulations (LES) or Reynolds-Averaged Navier-Stokes (RANS), are typically based on phenomenological modeling of relevant turbulent processes, recent advances in machine learning have inspired efforts to further improve such reduced models and offer methods for obtaining new ones.

Recently, classical CFD (and more generally numerical) simulators have been blending with modern machine learning tools, out of which a promising field is emerging (and being re-discovered) that offers new and exciting tools for scientists and engineers to discover physical phenomena and models from data. This emerging field, Physics Informed Machine Learning (PIML), incorporates well-known physical structure (such as conservation laws and numerical algorithms) into modern machine learning tools, in order to merge centuries of scientific knowledge with modern machine learning techniques. The overall goal of PIML is twofold: improving modern ML algorithms by increasing interpretability and generalizability, and learning new physics and useful models from data.

In general, there are several approaches that can be taken to develop a PIML algorithm: (1) directly adding physical structure into the loss function (as in work where Neural Networks are used to discover (or solve) parameterized PDEs from data); (2) enforcing physical constraints directly in the NN architecture (as in Mohan et al., where incompressibility constraints are built into a Convolutional Neural Network); (3) utilizing NNs as function approximators, with physical parameters and structure embedded within a numerical simulator and the learning algorithm (as was done with Neural ODEs). This work has primarily focused on (3) as an approach to build parameterized Lagrangian models as candidates for reduced models of turbulence. We combine these scientific disciplines not only to provide a tool for approaching the discovery of optimal SPH models for turbulence, but also to explore the effects of adding physical structure into ML algorithms on generalizability.
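As a minimal sketch of approach (3), the toy example below embeds a learnable physical parameter (a hypothetical decay coefficient `nu`, standing in for something like a viscosity) inside a differentiable forward simulator and recovers it from data by gradient descent. A central finite difference of the loss stands in for the automatic differentiation or adjoint sensitivity analysis used in practice; none of this is the actual model or code of this work.

```python
def simulate(nu, u0=1.0, dt=0.01, steps=100):
    """Toy forward simulator: explicit Euler for du/dt = -nu * u."""
    u = u0
    for _ in range(steps):
        u = u - dt * nu * u
    return u

# "Data" generated by the same simulator with an unknown true parameter.
nu_true = 0.8
u_data = simulate(nu_true)

def loss(nu):
    return (simulate(nu) - u_data) ** 2

# Fit nu by gradient descent; a finite-difference gradient stands in
# for automatic differentiation / adjoint-based sensitivities.
nu, lr, eps = 0.1, 0.5, 1e-6
for _ in range(500):
    grad = (loss(nu + eps) - loss(nu - eps)) / (2 * eps)
    nu -= lr * grad

print(round(nu, 3))  # converges to a value close to nu_true = 0.8
```

The same pattern scales up: replace the scalar `nu` with the weights of a neural network closure inside the simulator, and the finite-difference gradient with reverse-mode AD or an adjoint solve.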

Smoothed Particle Hydrodynamics (SPH) is a mesh-free, particle-based Lagrangian method for obtaining approximate numerical solutions of the equations of fluid dynamics, which has been widely applied to weakly and strongly compressible turbulence in astrophysics and engineering applications. In this work, we approach a reduced Lagrangian model for turbulence by developing a hierarchy of parameterized and learnable SPH models that are trained and analyzed on "synthetic" SPH data. These parameterized SPH simulators are trained to solve inverse problems by mixing automatic differentiation (both forward and reverse mode) with forward and adjoint based sensitivity analyses. We show that this physics informed learning method is capable of: (a) solving inverse problems over both the physically interpretable parameter space and the space of Neural Network functions; (b) learning Lagrangian statistics of turbulence; (c) combining trajectory based, probabilistic, and field based loss functions; and (d) extrapolating beyond training sets into more complex regimes of interest. Furthermore, this hierarchy of models gradually introduces more physical structure, which we show improves generalizability over different time scales and Reynolds numbers.
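For context, the core SPH idea is a kernel-weighted sum over neighboring particles. The snippet below is a generic 1D density estimate with a Gaussian smoothing kernel (a common illustrative choice, not necessarily the kernel used in this work); the smoothing length `h` is exactly the kind of physically interpretable parameter that a parameterized SPH model can expose to learning.

```python
import numpy as np

def sph_density(x, m, h):
    """SPH density estimate rho_i = sum_j m_j W(x_i - x_j, h),
    here with a 1D Gaussian kernel normalized to integrate to 1."""
    W = lambda r: np.exp(-(r / h) ** 2) / (h * np.sqrt(np.pi))
    # Pairwise separations; each particle also contributes to itself.
    r = x[:, None] - x[None, :]
    return (m[None, :] * W(r)).sum(axis=1)

# Uniformly spaced unit-spacing-mass particles on [0, 1]:
# the interior density approaches mass per unit length = 1.0.
x = np.linspace(0.0, 1.0, 101)
m = np.full_like(x, 0.01)
rho = sph_density(x, m, h=0.05)
```

In a full SPH solver, the same kernel sums appear in the discretized momentum and energy equations, which is what makes the method a natural substrate for differentiable, parameterized Lagrangian models.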