Workshop on the Future of Machine Learning and Data Analytics across the Department of Energy


April 2-3, Lawrence Berkeley National Laboratory

This symposium aims to discuss the mid- to long-term (10-20 years) needs and requirements for the development of machine learning (ML) and data analysis (DA) techniques that will be important for future advances in research and application areas at DOE labs. Rather than extending or slightly modifying existing methods in computer science, applied mathematics, and statistics, this forum focuses on the path to sophisticated and systematic development of ML/DA frameworks in the next decade. The symposium will bring together early- to mid-career researchers in applied mathematics, computational science, statistics, computer science, and domain science to brainstorm and envision developments over the next decade.

The outcome of this symposium will be a 10-15 page whitepaper reflecting a community-based vision for suggested advancements in ML/DA at DOE National Laboratories over the next decade. Specifically, the symposium discussions will be driven by application areas, and the whitepaper will summarize current trends and challenges in ML/DA in those application areas, as well as the new mathematics required to address these challenges.

The following are the featured topic areas of the workshop:

1. ML: Interpretability and Robustness of Machine Learning

  • How can we incorporate physical constraints and domain knowledge into the ML process, and how can we improve the interpretability of the overall ML inference?
  • How can we improve the robustness of ML inference and advance non-convex optimization to that end?

Application areas: Additive manufacturing, earth system models, image analysis.

2. DA: Multi-Modality and High-Dimensionality in Data

  • How do we address multi-modal data sources while accounting for data quality (including noise)?
  • How do we address the high-dimensional data deluge: through compression, causality, or topological analysis?

Application areas: Analysis of data output from DOE facilities (e.g., light sources and sequencers).