Research Highlight


Discovering State Variables Hidden in Experimental Data

Achievement


All physical laws are described as relationships between state variables that give a complete and non-redundant description of the relevant system dynamics. However, despite the prevalence of computing power and AI, the process of identifying the hidden state variables themselves has resisted automation. Most data-driven methods for modeling physical phenomena still assume that observed data streams already correspond to relevant state variables. A key challenge is to identify the possible sets of state variables from scratch, given only high-dimensional observational data. Here we propose a new principle for determining how many state variables an observed system is likely to have, and what these variables might be, directly from video streams. We demonstrate the effectiveness of this approach using video recordings of a variety of physical dynamical systems, ranging from elastic double pendulums to fire flames. Without any prior knowledge of the underlying physics, our algorithm discovers the intrinsic dimension of the observed dynamics and identifies candidate sets of state variables. We suggest that this approach could help catalyze the understanding, prediction and control of increasingly complex systems.
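As a concrete illustration of the intrinsic-dimension step, the sketch below applies the Levina–Bickel maximum-likelihood estimator, one standard geometric estimator, to a set of latent vectors. The estimator choice, the neighbor count k, the function name mle_intrinsic_dimension, and the synthetic test data are illustrative assumptions rather than the exact procedure reported in the paper.

```python
# Illustrative sketch: Levina-Bickel maximum-likelihood intrinsic dimension
# estimate from a set of latent vectors (one common geometric estimator;
# not necessarily the exact estimator(s) used in the paper).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def mle_intrinsic_dimension(latents: np.ndarray, k: int = 20) -> float:
    """Estimate the intrinsic dimension of `latents` (shape: n_samples x n_features)."""
    # Find the k nearest neighbors of every latent vector (plus itself).
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(latents)
    distances, _ = nbrs.kneighbors(latents)      # shape: (n_samples, k + 1)
    distances = distances[:, 1:]                 # drop the zero self-distance

    # Per-point MLE: inverse of the mean log ratio between the k-th neighbor
    # distance and each closer neighbor distance (Levina & Bickel, 2004).
    log_ratios = np.log(distances[:, -1:] / distances[:, :-1])  # (n, k - 1)
    per_point_dim = 1.0 / log_ratios.mean(axis=1)

    # Average the per-point estimates into a single scalar.
    return float(per_point_dim.mean())

if __name__ == "__main__":
    # Sanity check on synthetic data: a 2D plane linearly embedded in 64 dimensions
    # should yield an estimate close to 2.
    rng = np.random.default_rng(0)
    plane = rng.normal(size=(2000, 2)) @ rng.normal(size=(2, 64))
    print(f"Estimated intrinsic dimension: {mle_intrinsic_dimension(plane):.2f}")
```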

Figure 1: Two-stage modeling of dynamical systems. (A) and (B) First stage: intrinsic dimension estimation. We first model the dynamical system via its evolution from Xt to Xt+dt with a fully convolutional encoder-decoder network trained directly on video observations. The dimension of the latent vectors Lt→t+dt is much lower than that of the input frames, but still much higher than the intrinsic dimension of the system. To identify the intrinsic dimension, we apply geometric manifold learning algorithms to this set of relatively high-dimensional latent vectors. (C) Second stage: discovering Neural State Variables. We apply another encoder-decoder network on top of the stage-one latent vectors and restrict its latent dimension to the identified intrinsic dimension, so that it automatically extracts the Neural State Variables. This two-stage approach produces Neural State Variables whose dimension exactly matches the intrinsic dimension of the system. (D) Once the Neural State Variables are determined, the system dynamics expressed in their space can serve as an indicator of dynamical stability. We therefore learn a neural latent dynamics model that predicts the Neural State Variables at the next time step from the current ones.
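To make the pipeline in Figure 1 concrete, here is a minimal PyTorch sketch of the two stages and the latent dynamics network. The layer widths, the 64x64 frame resolution, the 512-dimensional stage-one latent, and the intrinsic dimension of 4 are illustrative assumptions and do not reproduce the authors' exact architecture or training losses.

```python
# Minimal sketch of the two-stage pipeline in Figure 1 (illustrative only).
import torch
import torch.nn as nn

class FramePredictor(nn.Module):
    """Stage 1: fully convolutional encoder-decoder predicting X_{t+dt} from X_t."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(                                    # 3 x 64 x 64 input
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),         # 32 x 32 x 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),        # 64 x 16 x 16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),       # 128 x 8 x 8
            nn.Conv2d(128, 8, 3, stride=1, padding=1),                   # 8 x 8 x 8 latent map
        )
        self.decoder = nn.Sequential(
            nn.Conv2d(8, 128, 3, stride=1, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, frame_t):
        latent_map = self.encoder(frame_t)        # L_{t -> t+dt}
        frame_next = self.decoder(latent_map)     # predicted X_{t+dt}
        return frame_next, latent_map.flatten(1)  # flattened latent for stage 2

class StateVariableNet(nn.Module):
    """Stage 2: compress stage-one latents down to the estimated intrinsic dimension."""
    def __init__(self, latent_dim=512, intrinsic_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, intrinsic_dim),          # Neural State Variables
        )
        self.decoder = nn.Sequential(
            nn.Linear(intrinsic_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, latent):
        state_vars = self.encoder(latent)
        return self.decoder(state_vars), state_vars

class LatentDynamics(nn.Module):
    """Panel (D): predict the next-step Neural State Variables from the current ones."""
    def __init__(self, intrinsic_dim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(intrinsic_dim, 64), nn.ReLU(),
            nn.Linear(64, intrinsic_dim),
        )

    def forward(self, state_vars):
        return self.net(state_vars)

if __name__ == "__main__":
    frames_t = torch.rand(8, 3, 64, 64)           # a batch of video frames X_t
    stage1, stage2, dyn = FramePredictor(), StateVariableNet(), LatentDynamics()
    pred_frames, latents = stage1(frames_t)       # trained with a pixel loss against X_{t+dt}
    latent_recon, state_vars = stage2(latents)    # trained with a latent reconstruction loss
    next_state_vars = dyn(state_vars)             # trained against next-step state variables
    print(pred_frames.shape, latents.shape, state_vars.shape, next_state_vars.shape)
```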

Publication

Chen, B., Huang, K., Raghupathi, S., Chandratreya, I., Du, Q. & Lipson, H. Automated discovery of fundamental variables hidden in experimental data. Nature Computational Science 2, 433–442 (2022).