Research

M. Chertkov Research Statement (11/2023)

[This document's references correspond to those listed in my curriculum vitae]

My research, spanning over four decades, is rooted in physics and expands into a diverse array of other sciences. This journey has forayed into contemporary applied mathematics, including information and control theory, computer science, operations research, machine learning, and Artificial Intelligence (AI). In the following I explain my motivations, goals, and strategic vision, and also provide a tactical perspective on my research endeavors. Not surprisingly, I will start with a primer answering the question: What is AI?

In today's discourse, the term "Artificial Intelligence" (AI) frequently dominates conversations. However, the interpretation and significance attached to this compact abbreviation vary greatly among individuals. My use of the term is very technical. The AI I discuss includes:

·   Automatic Differentiation: This decomposes a computation into a sequence of elementary operations and applies the chain rule to each of them. It enables accurate and efficient gradient evaluations for optimization and machine learning.

·   Deep Learning: This encompasses highly efficient parametric representations of continuous-valued functions as compositions of parameterized, nonlinear transformations, using Neural Networks with many layers. It enables the approximation of complex relationships within data through iterative optimization -- thus learning -- of these parameters (often numbering in the billions).

·   Reinforcement Learning: This is a data-driven approach to optimal control under uncertainty, such as navigating robots. It is an adaptive branch of machine learning that reinforces decisions based on the information, e.g. reward, received in the process of learning (exploration), in order to improve inference (exploitation).

·   Generative Models: This refers to AI models and algorithms that can generate new, previously unseen data that is statistically similar to the training data, or that can complete input data (i.e., answer the questions posed in it).

The list is progressive -- each item depends on, or is at least related to, the previous one.
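As a minimal illustration of the first item in the list above (and of the chain-rule mechanics underlying the rest), here is a sketch of forward-mode automatic differentiation via dual numbers. The `Dual` class and `grad` helper are hypothetical names for illustration, not any particular library's API:

```python
import math

class Dual:
    """Dual number (value, derivative): each elementary operation
    propagates the derivative alongside the value via the chain rule."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__

def sin(x):
    # chain rule for an elementary function: (sin u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

def grad(f, x0):
    """Exact derivative of f at x0, obtained by seeding dx/dx = 1."""
    return f(Dual(x0, 1.0)).dot

# d/dx [x * sin(x)] = sin(x) + x cos(x), evaluated at x = 1.5
g = grad(lambda x: x * sin(x), 1.5)
```

Because the derivative is assembled operation by operation, it is exact to machine precision -- unlike finite differences -- which is what makes gradient-based learning in very high dimensions practical.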

In my opinion, the excitement within the scientific community surrounding AI stems from its unique ability to tackle the challenges posed by the curse of dimensionality in numerous scientific domains heavily reliant on computations. AI has illuminated the notion that computational hardness, or the inability to efficiently solve problems within existing frameworks and models, is not a hindrance. Instead, it serves as an invitation to circumvent these limitations and craft remarkable tools. One major lesson we have learned from AI is that when faced with problems that are unsolvable (theoretically and/or practically), we can change the formulations and borrow approaches and methods from other disciplines, notably applied mathematics and statistical physics. A prime example of this ingenuity is the advent of Diffusion Models and Large Language Models, which have revolutionized both model formulation and the applications of the resulting tools.

It is time to introduce what I mean by "my sciences" — those in which I can claim some expertise:

·   Physical Sciences: I prioritize physical sciences because my background is firmly rooted in theoretical and mathematical physics, as well as applied mathematics. Within the realm of physical sciences, my primary focus centers on statistical mechanics, fluid dynamics, and, more specifically, turbulence, often referred to as statistical hydrodynamics among theorists. See my recent exemplary papers, e.g. [227,229,233,235]. Additionally, I maintain a keen interest, albeit with fewer published papers, in various other physics disciplines, spanning material-, quantum-, astro-, and geo-physics.

·   Social Sciences: My fascination here lies in the interactions of agents, be they humans, animals, or robots. I do statistical modeling, ensemble analysis, and spatio-temporal coarsening, particularly when studying collective social phenomena -- such as the recent pandemic. See [222,226,230].

·   Engineering Sciences: The mention of robots above, in the context of control, highlights my interest in "theoretical engineering", which may be viewed as an element of what I discuss below under the rubric of "Science of AI". On the application side of engineering sciences, I am predominantly interested in the study of energy system networks, including power, natural gas, and district heating-cooling systems. While working on these topics I focus on understanding, proposing novel approaches, and developing algorithms for constructing, reinforcing, managing, and controlling these energy systems. (See e.g. my papers on the topic from the last year [225,231,232,234,239].) This involves accounting for customer and operator behavioral factors and integrating the energy system seamlessly with other critical infrastructures, including transportation.

The Science of AI: Here I would like to explain my vision of how we have advanced our understanding of AI, and continue to create it, through innovative mathematical and statistical approaches and algorithms. It is not an exaggeration to claim that the majority, if not all, of AI innovations have their roots in applied mathematics and statistical mechanics (I view the latter as a part of the former). My interests and contributions to AI encompass various aspects under the umbrella of applied mathematics, including:

·   Statistical Inference and Learning: I study graphical models, employing stochastic and variational methods to enhance statistical inference and learning. This involves expressing statistical information using graphs, which typically represent underlying constraints or relationships. (See the following exemplary papers [54,57,199].)

·   Stochastic Optimal Control: This encompasses various aspects linked to its data-driven counterpart, reinforcement learning, in its many forms, including those most popular today which utilize deep neural networks for enhanced performance. See [112,128,168,225,235].

·   Diffusion Models in AI: Much of my most recent focus is on diffusion models -- a remarkable tool in generative AI. These models are renowned for their ability to generate synthetic images based on textual prompts or on a training set of images, considered as i.i.d. (independent, identically distributed) samples from a probability distribution. What makes diffusion models even more intriguing is their foundation in the clever utilization of stochastic differential equations and time-reversal principles, stemming from non-equilibrium statistical mechanics and stochastic optimal control. See [236].

·   AI Transformers: Another remarkable tool in generative AI, transformers, underpin the most recent and exciting developments in the field, including Large Language Models like ChatGPT. We are now realizing that the remarkable efficiency of transformer models is ultimately linked to their interpretation as dynamical systems, specifically the capacity of multi-particle dynamical systems to exhibit clustering or chaotic behavior.
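The time-reversal mechanics behind diffusion models can be sketched in one dimension. For Gaussian data the score (the gradient of the log-density) is analytic, so the reverse-time SDE can be run exactly, with no trained network; all parameter values below are illustrative, not taken from any paper:

```python
import math, random

random.seed(0)

# Target "data" distribution: N(mu, sigma^2).
mu, sigma = 2.0, 0.5
T, steps, n = 5.0, 400, 4000
dt = T / steps

def marginal(t):
    # Forward (variance-preserving) SDE dx = -x/2 dt + dW maps
    # N(mu, sigma^2) at t=0 to N(m_t, v_t) at time t.
    m = math.exp(-t / 2) * mu
    v = math.exp(-t) * sigma**2 + 1.0 - math.exp(-t)
    return m, v

def score(x, t):
    # Gradient of log p_t(x) for the Gaussian marginal -- analytic here;
    # in a real diffusion model a neural network approximates this.
    m, v = marginal(t)
    return -(x - m) / v

# Start from the (nearly) pure-noise marginal at time T ...
m_T, v_T = marginal(T)
xs = [random.gauss(m_T, math.sqrt(v_T)) for _ in range(n)]

# ... and integrate the reverse-time SDE back from T to 0
# (Euler-Maruyama): dx = [-x/2 - score(x,t)] dt + dW, dt < 0.
for k in range(steps):
    t = T - k * dt
    xs = [x + (x / 2 + score(x, t)) * dt
            + math.sqrt(dt) * random.gauss(0, 1)
          for x in xs]

# The generated samples should reproduce the data statistics.
mean = sum(xs) / n
std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
```

The same structure -- noising forward, denoising backward with a learned score -- carries over verbatim to images, where the analytic score is replaced by a deep network trained on the data.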

How Does (and Will) AI Affect Sciences? To comprehend the influence of AI on scientific research, I will begin with a general discussion of physical sciences. However, as we proceed, I will illustrate my points with specific examples, and, to keep the discussion crisp, primarily with those drawn from statistical hydrodynamics.

Physics-Blind AI Discovers Physics: In the turbulence research community, AI tools have been increasingly applied, specifically Deep Neural Networks within supervised learning (treating turbulence as an input-output function) or generative models for handling spatial or spatio-temporal snapshots of turbulence. These tools process such snapshots as if they were images of celebrities or segments from movies. Remarkably, this rather straightforward approach has yielded significant insights. It has been observed that even without directly inputting physical equations, AI models uncover their consequences. For example, Generative Adversarial Networks, one of the earliest tools in the family of generative models, have produced energy spectra in synthetic images that closely align with actual turbulence data [198]. Furthermore, AI has unearthed intricate correlations specific to turbulence, although not all of them. In summary, in this "System 1" application of AI, physics was not initially integrated into the process but was applied retrospectively to evaluate whether AI could faithfully preserve the physical principles. (Here I utilize the terminology of D. Kahneman: System 1 reasoning is rapid, intuitive, automatic, and unconscious; it generates solutions instantaneously based on past training. System 2 reasoning is slower, deeper, logical, and effortful; it requires full attention and aids us in creative problem-solving.)

Exploring Physics-Informed AI: The natural next question arises once we understand turbulence through the System 1 approach: What should we do when the System 1 way of using AI falls short? Naturally, the community began contemplating the integration of the physics of turbulence directly into AI, giving rise to the field of Physics-Informed AI. Several System 2 AI approaches that incorporate physics have been proposed, and we will discuss some of them below, emphasizing fundamental cases where Physics-Informed AI becomes a crucial consideration:

·   Interpolation: This arises when our aim is not solely prediction but also gaining insight into the equations, constraints, or symmetries that underlie the data. We do not necessarily require exact knowledge of these equations, constraints, or symmetries; instead, we can introduce parameterized families of them, incorporate them into AI, and allow AI to determine not only hidden parameters (e.g., within Neural Networks) but also physical parameters. In this scenario, Physics-Informed AI provides not only predictions but also specific values for physical coefficients. For example, it can characterize inter-particle interaction potentials or diffusion coefficients, enabling the interpretation of underlying phenomena. See e.g. [197,198,206,214].

·   Extrapolation: This scenario arises when we possess only a limited number of samples from a specific (often rare) regime of interest, but an abundance of data from other regimes. In such cases, integrating physics into AI becomes invaluable: extrapolation is most promising when there is confidence that the equations governing the high-data and low-data regimes align or bear similarities, and shared symmetries or constraints between the regimes further enhance its effectiveness. (See [233]. Methodology-wise, I am working on embedding "instanton" [17,53] and non-equilibrium stat-mech [50,58,71,73,88,126] ideas into extrapolative AI.)

·   Understanding via Reduced Modeling: Consider the curse of dimensionality, a concern in many physical disciplines heavily reliant on computations. Even when we have an equation believed to be universal and accurate across phenomena, such as the Navier-Stokes equation in fluid mechanics, running it in challenging cases, like at high Reynolds numbers, becomes prohibitively expensive. This is where model reduction comes into play. We postulate reduced models with significantly fewer degrees of freedom (for example, replacing a PDE by a system of ODEs), but this introduces a multitude of options for specifying the reduced models, laden with uncertainty. Here is where physical hypotheses come in handy: we consider hypotheses that can be parameterized in terms of physically meaningful parameters (as mentioned earlier). Alternatively, we can use Neural Networks and other AI tools to express the degrees of freedom about which we have limited knowledge. See [229,233,238].
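The "Interpolation" mode above can be illustrated with a toy sketch: recovering a hidden physical coefficient -- here a decay rate D in du/dt = -D u, standing in for, say, a diffusion or damping constant -- by gradient descent on a least-squares misfit against observations. All names and numbers are hypothetical:

```python
import math

# Synthetic "observations" of exponential relaxation u(t) = exp(-D t),
# the solution of du/dt = -D u with u(0) = 1. D_true plays the role
# of the unknown physical coefficient to be recovered.
D_true = 0.7
ts = [0.1 * i for i in range(1, 21)]
ys = [math.exp(-D_true * t) for t in ts]

def loss_grad(D):
    """Least-squares misfit and its derivative in D (by the chain rule)."""
    L, g = 0.0, 0.0
    for t, y in zip(ts, ys):
        r = math.exp(-D * t) - y
        L += r * r
        g += 2.0 * r * (-t) * math.exp(-D * t)
    return L, g

# Plain gradient descent over the single physical parameter:
# the "learning" here yields an interpretable coefficient, not
# just a black-box prediction.
D = 0.1  # deliberately poor initial guess
for _ in range(2000):
    _, g = loss_grad(D)
    D -= 0.02 * g
```

In realistic Physics-Informed settings the misfit would also contain neural-network terms for the poorly known degrees of freedom, and the gradient would come from automatic differentiation rather than a hand-derived formula, but the optimization loop is the same.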

Importantly, all of the above relies on so-called ground-truth data, that is, data which we trust. Traditional high-resolution numerical simulations, abundant in physics-intensive disciplines (such as the Direct Numerical Simulations (DNS) commonly used in turbulence research), can serve as this ground truth.

AI for Discovering New Laws of Physics: This is a dream -- a vision yet to be realized. In this context, it's crucial to embrace the spirit of bold hypothesis formation. Gone are the days of discarding seemingly unconventional ideas. With the advent of AI techniques, particularly our newfound ability to optimize in high dimensions (thanks to automatic differentiation), we can adopt a Bayesian approach and consider multiple hypotheses concurrently, possibly guided by some intuitive priors. This shift allows us to move away from subjective hypothesis selection and rely more on data-driven approaches. AI's capacity to assess and choose more scientifically reliable hypotheses based on data offers invaluable guidance. It's important to note that this approach may not provide proof but serves as an additional tool to steer our attention towards what's worth investigating and what may be disregarded.
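The data-driven hypothesis weighing described above can be sketched minimally: assuming i.i.d. Gaussian measurement noise and a uniform prior over two hypothetical candidate "laws", Bayes' rule assigns each a posterior probability. The candidate laws and all numbers are illustrative:

```python
import math, random

random.seed(1)

# Synthetic measurements actually generated by law A: y = 2.0 t + noise.
noise = 0.1
ts = [0.1 * i for i in range(1, 31)]
ys = [2.0 * t + random.gauss(0, noise) for t in ts]

# Two candidate laws, kept fixed for simplicity; in practice their
# parameters would themselves be optimized or marginalized over.
laws = {
    "linear": lambda t: 2.0 * t,        # y = 2.0 t
    "quadratic": lambda t: 1.5 * t * t, # y = 1.5 t^2
}

def log_likelihood(model):
    # log p(data | law), assuming i.i.d. Gaussian noise
    return sum(-0.5 * ((y - model(t)) / noise) ** 2
               - math.log(noise * math.sqrt(2 * math.pi))
               for t, y in zip(ts, ys))

# Uniform prior over hypotheses -> posterior via Bayes' rule,
# computed stably by shifting log-likelihoods before exponentiating.
lls = {name: log_likelihood(m) for name, m in laws.items()}
shift = max(lls.values())
weights = {n: math.exp(v - shift) for n, v in lls.items()}
total = sum(weights.values())
posterior = {n: w / total for n, w in weights.items()}
```

The posterior concentrates on the law that actually generated the data. This does not prove a law, in line with the caveat above, but it ranks hypotheses by how well they survive contact with data, which is exactly the guidance sought.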


-----------

My old Research Statement (circa 2018, still from my Los Alamos period) can be found here: