Machine Learning for Scientific Data Analytics at DOE User Facilities

Sponsor: DOE - Office of Advanced Scientific Computing Research

Mission of the project

The mission of this project is to establish the mathematical foundations of prioritized scientific machine learning (SciML) methods for extracting interpretable information from scientific data, inferring physical laws, and steering experiments toward scientific discovery. The US Department of Energy's (DOE) scientific user facilities generate a deluge of dynamic experimental data at high velocity every day. However, our ability to extract interpretable information from this massive dynamic data lags far behind our ability to generate it. Advances in machine learning have had revolutionary effects on large-scale data analytics in the business world, but it is challenging to turn a method that succeeds commercially into an effective SciML method for scientific use. Thus, a new class of mathematically rigorous, computationally efficient, and reliable SciML methods is required for real-time scientific data analytics to expedite the pace of scientific discovery. To address these challenges, our team will focus on developing novel mathematical methods for extracting interpretable features from raw experimental data, inferring unknown physics (i.e., unmasking hidden dynamics), and designing and steering a series of experiments to achieve a scientific goal. To motivate, illustrate, and evaluate our new methodologies, we will apply them to neutron scattering data generated at the Spallation Neutron Source (SNS) and High Flux Isotope Reactor (HFIR) facilities, and to scanning transmission electron microscopy (STEM) data generated at the Center for Nanophase Materials Sciences (CNMS). These datasets were chosen not only to demonstrate how the proposed SciML methods and mathematical analysis can help address current, urgent needs for advanced data analytics at DOE's user facilities, but also to show the critical role of the proposed research in establishing self-driving user facilities in the near future.
The funding period of this project is September 2021 to August 2024.

Recent news

Research highlights