Research

My research interests are broad and include the mathematics of data science and deep learning, computational and applied mathematics with applications to Partial Differential Equations (PDEs), optimal transport, and computational microscopy.

Inspired by the work of Raissi, Perdikaris, and Karniadakis, who develop a deep learning framework for forward and inverse problems, and by the work of Finlay, Calder, Abbasi, and Oberman on neural network regularization, I have developed mathematically principled deep learning algorithms for forward and inverse problems that are completely data-driven, robust to noise, and applicable to a variety of real-world problems from areas such as economics, biology, and finance. You can find my doctoral thesis here.

My current focus is developing Large Language Models for forward and inverse differential equation problems.

I have also collaborated on work in physics-informed machine learning, parameter estimation using optimal transport, and Harnack estimates on metric measure spaces. I have active collaborations on sparse deep learning methods for applied PDEs, parameter estimation and image classification using optimal transport, and computational microscopy.

Most of my work uses modern computational platforms involving parallel and GPU computing.

Below you can find detailed descriptions and links to my recent papers.

Please don't hesitate to contact me with any questions!

In this paper we use neural networks to learn governing equations from data. Specifically, we reconstruct the right-hand side of a system of ODEs x'(t) = f(t, x(t)) directly from uniformly time-sampled observations using a neural network. In contrast with other neural-network-based approaches to this problem, we add a Lipschitz regularization term to our loss function. In synthetic examples we observed empirically that this regularization yields a smoother approximating function and better generalization than non-regularized models, on both trajectory and non-trajectory data, especially in the presence of noise. In contrast with sparse regression approaches, since neural networks are universal approximators we do not need any prior knowledge of the ODE system. Since the model is applied component-wise, it can handle systems of any dimension, making it usable for real-world data.
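To make the idea concrete, here is a minimal sketch of such a loss in PyTorch (the names, network sizes, and regularization weight are illustrative, not the paper's exact setup): the data misfit between the network and finite-difference targets is augmented with a gradient-norm penalty, a common practical surrogate for penalizing the Lipschitz constant.

```python
import torch

# Hypothetical setup: f_theta approximates the right-hand side f(t, x) of a
# d-dimensional system; sizes and the weight lam are illustrative.
d = 2
f_theta = torch.nn.Sequential(
    torch.nn.Linear(1 + d, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, d),
)

def lipschitz_regularized_loss(t, x, dxdt_target, lam=1e-2):
    """Data misfit plus a gradient-norm penalty, a standard surrogate
    for penalizing the Lipschitz constant of f_theta."""
    inp = torch.cat([t, x], dim=1).requires_grad_(True)
    pred = f_theta(inp)
    misfit = torch.mean((pred - dxdt_target) ** 2)
    # Gradient of the summed outputs w.r.t. the inputs: a cheap proxy for
    # the Jacobian norm of f_theta over the batch.
    grad = torch.autograd.grad(pred.sum(), inp, create_graph=True)[0]
    penalty = grad.pow(2).sum(dim=1).mean()
    return misfit + lam * penalty

# dxdt_target would come from finite differences of the sampled trajectory,
# e.g. central differences (x[k+1] - x[k-1]) / (2 * dt).
```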

We present a new algorithm for learning unknown governing equations from trajectory data using neural networks. Given samples of solutions x(t) to an unknown dynamical system x'(t) = f(t, x(t)), we approximate the function f using an ensemble of neural networks. We express the equation in integral form and use the Euler method to predict the solution at each successive time step, using at each iteration a different neural network as a prior for f. This procedure yields M-1 time-independent networks, where M is the number of time steps at which x(t) is observed. Finally, we obtain a single function f(t, x(t)) by neural network interpolation. Unlike our earlier work, where we numerically computed the derivatives of the data and used them as targets in a Lipschitz-regularized neural network approximating f, our new method avoids numerical differentiation, which is unstable in the presence of noise. We show empirically that generalization and recovery of the governing equation improve when a Lipschitz regularization term is added to our loss function, and that this method improves on our previous one especially in the presence of noise, when numerical differentiation provides low-quality target data. Finally, we compare our results with the multistep method proposed by Raissi et al., arXiv:1801.01236 (2018), and with SINDy.
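A minimal sketch of the per-step stage, under illustrative assumptions (hypothetical names, sizes, and optimizer settings; the paper describes the full ensemble training and the final interpolation network):

```python
import torch

# x_obs has shape (N, M, d): N observed trajectories sampled at M uniform
# time steps, dt apart. All hyperparameters below are illustrative.
def train_step_networks(x_obs, dt, epochs=500):
    N, M, d = x_obs.shape
    nets = []
    for m in range(M - 1):
        net = torch.nn.Sequential(
            torch.nn.Linear(d, 64), torch.nn.Tanh(), torch.nn.Linear(64, d))
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        x_m, x_next = x_obs[:, m], x_obs[:, m + 1]
        for _ in range(epochs):
            opt.zero_grad()
            # Forward Euler step of the integral form:
            # x_{m+1} ≈ x_m + dt * f_m(x_m); no numerical differentiation
            # of the (possibly noisy) data is required.
            loss = torch.mean((x_m + dt * net(x_m) - x_next) ** 2)
            loss.backward()
            opt.step()
        nets.append(net)
    # A final interpolation network fitting (t_m, x) -> f_m(x) would merge
    # these M-1 time-independent networks into a single f(t, x).
    return nets
```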

Parameter identification determines the essential system parameters required to build real-world dynamical systems by fusing crucial physical relationships and experimental data. However, the data-driven approach faces major difficulties, such as a lack of observational data, discontinuous or inconsistent time trajectories, and noisy measurements. The ill-posedness of the inverse problem comes from the chaotic divergence of the forward dynamics. Motivated by these challenges, we shift from the Lagrangian particle perspective to the Eulerian description of the state-space flow field. Instead of using pure time trajectories as the inference data, we treat statistics accumulated from a Direct Numerical Simulation (DNS) as the observable, whose continuous analog is the steady-state probability density function (PDF) of the corresponding Fokker--Planck equation (FPE). We reformulate the original parameter identification problem as a data-fitting, PDE-constrained optimization problem. An upwind scheme based on the finite-volume method, which enforces mass conservation and positivity preservation, is used to discretize the forward problem. We present a theoretical regularity analysis for evaluating gradients of optimal transport costs and introduce three different formulations for efficient gradient calculation. Numerical results using the quadratic Wasserstein metric from optimal transport demonstrate the robustness of this novel approach for parameter identification in chaotic dynamical systems.
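For intuition about the data-fitting term: in one dimension the quadratic Wasserstein distance has a closed form in terms of quantile functions, W2² = ∫₀¹ |F⁻¹(s) − G⁻¹(s)|² ds. A minimal NumPy sketch (grid, normalization, and names are illustrative, not the paper's discretization):

```python
import numpy as np

def w2_distance_1d(p, q, x):
    """Quadratic Wasserstein distance between densities p and q sampled on a
    uniform grid x, via the quantile formula for 1D optimal transport."""
    dx = x[1] - x[0]
    p = p / np.trapz(p, x)                        # normalize to unit mass
    q = q / np.trapz(q, x)
    F, G = np.cumsum(p) * dx, np.cumsum(q) * dx   # CDFs
    s = np.linspace(1e-6, 1.0 - 1e-6, 1000)
    Finv = np.interp(s, F, x)                     # quantile functions
    Ginv = np.interp(s, G, x)
    return np.sqrt(np.trapz((Finv - Ginv) ** 2, s))

# In the setting above, p would play the role of the DNS histogram and q the
# steady-state PDF produced by the finite-volume Fokker--Planck solver.
```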

Applications of No-Collision Transportation Maps in Manifold Learning

Link to paper: https://arxiv.org/abs/2304.00199 (accepted in the SIAM Journal on Mathematics of Data Science), in collaboration with Levon Nurbekyan (UCLA)

In this work, we investigate applications of the no-collision transportation maps introduced in [Nurbekyan et al., 2020] to manifold learning for image data. Recently, there has been a surge in applying transportation-based distances and features to data representing motion-like or deformation-like phenomena. Indeed, comparing intensities at fixed locations often does not reveal the data structure. The no-collision maps and distances developed in [Nurbekyan et al., 2020] are sensitive to geometric features, similar to optimal transportation (OT) maps, but are much cheaper to compute since no optimization is needed. In this work, we prove that no-collision distances provide an isometry between translations (respectively, dilations) of a single probability measure and the translation (respectively, dilation) vectors equipped with the Euclidean distance. Furthermore, we prove that no-collision transportation maps, as well as OT and linearized OT maps, do not in general provide an isometry for rotations. Numerical experiments confirm our theoretical findings and show that no-collision distances achieve similar or better performance on several manifold learning tasks compared with other OT- and Euclidean-based methods, at a fraction of the computational cost.
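A toy illustration of the translation isometry (one-dimensional and with made-up parameters; this is not the no-collision algorithm itself, but for translations in 1D the transport-based distances all reduce to the shift distance, which is the content of the isometry): pairwise transport distances between shifted copies of a density embed, via MDS, as points on a line.

```python
import numpy as np
from scipy.stats import wasserstein_distance
from sklearn.manifold import MDS

# Shifted copies of one Gaussian bump on a common grid.
x = np.linspace(-10, 10, 2000)
shifts = np.linspace(-3, 3, 25)
densities = [np.exp(-((x - a) ** 2)) for a in shifts]

n = len(shifts)
D = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        # Transport distance between translates of a fixed shape equals the
        # Euclidean distance between the shifts.
        D[i, j] = wasserstein_distance(x, x, densities[i], densities[j])

# MDS on the distance matrix recovers the shifts up to sign and offset,
# i.e. the translation manifold embeds isometrically as a line.
emb = MDS(n_components=1, dissimilarity="precomputed").fit_transform(D)
```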

Figure: Distance matrix comparison for the translation example on a triangular grid. Top: distance matrices obtained using Linearized Optimal Transport (LOT) and no-collision maps with centers of mass or geometric centers. Bottom: squared error between the approximate distance matrices and the Wasserstein distance.

LoDIP: Low light phase retrieval with deep image prior

Link to paper: NeurIPS Workshop ML4PS, in collaboration with Jianwei Miao (UCLA), Stanley Osher (UCLA), Minh Pham (UCLA), and Daniel Jacobs (UCLA)

During my time at IPAM as a core participant in the Computational Microscopy program, I started a collaboration with Jianwei Miao (UCLA), Stanley Osher (UCLA), Minh Pham (UCLA), and Daniel Jacobs (UCLA) on image reconstruction and phase retrieval for biological applications.

In this work we propose a Convolutional Neural Network (CNN) that performs in situ coherent diffractive imaging (in situ CDI) for real-time observation of dynamic processes during data acquisition, and we provide a mathematical explanation for the superior results obtained by in situ CDI compared with conventional CDI in the low-dose regime.

Iterative methods for in situ CDI, while powerful, require tuning several algorithmic parameters and expert strategies. Inspired by the 2022 work of Chang et al., we propose to replace iterative algorithms for in situ CDI with deep-learning-based methods. In particular, we use a CNN trained only on simulated data and exploit the time-invariant static region shared by the measured diffraction patterns as a powerful real-space constraint; we expect this to result in faster and more robust convergence.
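A schematic of how such simulated training data could be generated (the layout, sizes, and photon budgets are illustrative assumptions, not the actual pipeline): a complex exit wave containing a static region beside the dynamic sample is propagated to the far field, and Poisson noise models the low photon count.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_measurement(dynamic, static, photons_per_pixel=8.0):
    """Schematic in situ CDI frame: a static region (identical across frames,
    providing the real-space constraint) sits beside the dynamic sample; the
    far-field intensity is |FFT|^2 with Poisson noise for the low-dose regime."""
    exit_wave = np.concatenate([static, dynamic], axis=1)
    intensity = np.abs(np.fft.fft2(exit_wave)) ** 2
    scale = photons_per_pixel * intensity.size / intensity.sum()
    return rng.poisson(intensity * scale)       # photon-count measurement

# One illustrative frame: a complex-valued random dynamic sample next to a
# fixed static region.
static = np.ones((64, 32), dtype=complex)
dynamic = rng.random((64, 32)) * np.exp(1j * 2 * np.pi * rng.random((64, 32)))
pattern = simulate_measurement(dynamic, static, photons_per_pixel=8.0)
```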

Once fully trained, the CNN can perform real-time phase reconstructions of the structure and dynamics of radiation-sensitive biological materials, providing a novel, faster, and more accurate alternative to iterative phase retrieval algorithms for applications in medicine, biology, and physics.

Figure: Experimental results on simulated data. Top: reconstruction at a high photon count (80 photons/pixel). Bottom: reconstruction at a low photon count (8 photons/pixel). For each method we report the peak signal-to-noise ratio (PSNR); higher is better. Each image shows a zoomed-in view of the sample region only. The first two methods (HIO and DIP) use the conventional CDI setup (no static region), whereas HIO-stat, GPS, and LoDIP use the LoDIP setup (with a static region).

Computational Microscopy beyond Perfect Lenses

Link to paper: "Computational Microscopy beyond Perfect Lenses", in collaboration with Xingyuan Lu (UCLA, Soochow University), Minh Pham (UCLA), Damek Davis (Cornell University), Stanley J. Osher (UCLA), Jianwei Miao (UCLA)

We demonstrate through mathematical analysis and numerical experiments that in situ coherent diffractive imaging (CDI), which harnesses the coherent interference between a strong beam and a weak beam illuminating a static and a dynamic structure, respectively, could be the most dose-efficient imaging method. At low doses, in situ CDI can achieve higher resolution than perfect lenses whose point spread function is a delta function. We also use numerical experiments to show that combining in situ CDI with ptychography can, under some conditions, reduce the dose by two orders of magnitude relative to ptychography alone. We expect that computational microscopy based on in situ CDI can be implemented in different imaging modalities with photons and electrons for low-dose imaging of radiation-sensitive materials and biological samples.
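Schematically (a textbook decomposition, not the paper's full analysis), the dose advantage can be traced to the far-field intensity of the summed waves. Writing F_s and F_d for the amplitudes of the strong (static) and weak (dynamic) waves at a detector pixel,

$$ I \;=\; |F_s + F_d|^2 \;=\; |F_s|^2 \;+\; |F_d|^2 \;+\; 2\,\mathrm{Re}\big(F_s\,\overline{F_d}\big), $$

the cross term is linear in the weak dynamic wave and amplified by the strong static one, so information about the dynamic structure can survive at photon budgets where |F_d|² alone would be buried in noise.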

Figure: Numerical experiments on phase-contrast microscopy with perfect lenses (PLs) and in situ CDI (iCDI). (a)-(d) Representative images of a 300-nm-thick dynamic biological vesicle obtained by phase-contrast microscopy with perfect lenses using an x-ray fluence of 3.5e5, 3.5e6, 3.5e7, and 3.5e8 photons/μm², respectively, corresponding to a dose of 2.75e4, 2.75e5, 2.75e6, and 2.75e7 Gy, respectively. (e)-(h) The same images reconstructed by in situ CDI with an x-ray fluence of 3.5e5, 3.5e6, 3.5e7, and 3.5e8 photons/μm², respectively, and a fixed fluence of 1.4e11 photons/μm² on the static structure.

Tensorization of Sobolev Spaces

Link to paper: "Tensorization of Sobolev Spaces", in collaboration with Silvia Ghinassi (University of Washington), Vikram Giri (ETH)

We prove that Sobolev spaces on Cartesian and warped products of metric spaces tensorize when one of the factors is a doubling space supporting a Poincaré inequality.
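Schematically (notation mine; see the paper for the precise statement in terms of minimal weak upper gradients), tensorization means that a function f belongs to W^{1,2}(X × Y) precisely when its slices are Sobolev on the factors, with the gradients splitting as

$$ |\nabla f|^2(x,y) \;=\; |\nabla f(\cdot,y)|^2(x) \;+\; |\nabla f(x,\cdot)|^2(y) \quad \text{for a.e. } (x,y) \in X \times Y. $$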

Solving Hamiltonian Chaotic Dynamical Systems with Physics-Informed Machine Learning

Link to paper: link to preliminary report, in collaboration with Michael Minion (LBNL), Dmitriy Morozov (LBNL)

In this project we design a feed-forward neural network to solve the double pendulum equations. We choose the double pendulum because it is one of the simplest chaotic Hamiltonian systems with a known Hamiltonian and conserved quantities. In our experiments we investigate how the accuracy of the approximate solution changes as we vary the time interval over which the original data is sampled, increase the number of layers in the network, and introduce Hamiltonian learning biases in the loss function. We show that larger sampling time intervals result in larger training and testing errors, since the chaotic behavior of the system is more evident over longer time intervals. When varying the number of layers, we observe that while the training error decreases as layers are added, the testing error does not, because of overfitting. Finally, we show empirically that adding Hamiltonian biases to the loss function improves the generalization properties of the model, yielding smaller test errors, smaller generalization gaps, and more physically consistent solutions that approximately preserve the total energy.
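As an illustration of such a Hamiltonian bias (a minimal sketch; the unit masses and lengths, the weight lam, and all names are assumptions, not the report's exact setup), one can penalize drift of the predicted states' total energy away from the initial energy:

```python
import torch

def total_energy(state, m1=1.0, m2=1.0, l1=1.0, l2=1.0, g=9.81):
    """Total energy of the double pendulum; state holds
    (theta1, theta2, omega1, omega2) along the last dimension."""
    th1, th2, w1, w2 = state.unbind(dim=-1)
    kinetic = (0.5 * (m1 + m2) * l1**2 * w1**2
               + 0.5 * m2 * l2**2 * w2**2
               + m2 * l1 * l2 * w1 * w2 * torch.cos(th1 - th2))
    potential = (-(m1 + m2) * g * l1 * torch.cos(th1)
                 - m2 * g * l2 * torch.cos(th2))
    return kinetic + potential

def hamiltonian_biased_loss(pred, target, e0, lam=0.1):
    """Mean-squared data misfit plus a penalty on energy drift from the
    initial energy e0; lam trades accuracy against physical consistency."""
    mse = torch.mean((pred - target) ** 2)
    drift = torch.mean((total_energy(pred) - e0) ** 2)
    return mse + lam * drift
```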