Whether the goal is designing lightweight materials or discovering tomorrow's energy storage solutions, accurate modeling of atomistic interactions is critical.
Nature is fond of geometry, and different molecular structures exhibit unique geometric traits. However, training neural networks to predict the properties of such geometric structures comes with a unique set of challenges. The works presented here detail several contributions toward solving these problems, including: 1) introducing one of the largest machine learning (ML) benchmark datasets for chemistry; 2) presenting graph-theoretic approaches for both the development and the interpretation of neural network models; 3) providing reference implementations via message-passing frameworks such as PyTorch Geometric; and 4) exploring co-design of novel software-hardware architectures for rapid training of these models.
[Figure: water molecules]
These neural networks require approaches that are cognizant of both the connectivity structure and the 3D coordinates of the atoms. This need is addressed by a family of graph neural networks known as "3D-GNNs".
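To make the idea concrete, here is a minimal sketch of such a layer in PyTorch Geometric, loosely in the spirit of SchNet-style continuous-filter convolutions: neighbor features are modulated by filters generated from interatomic distances, so the layer sees both the graph connectivity and the 3D geometry. The class and parameter names (DistanceFilterConv, hidden_dim, num_rbf, cutoff) are illustrative, not taken from our codebases.

```python
import torch
from torch import nn
from torch_geometric.nn import MessagePassing, radius_graph


class DistanceFilterConv(MessagePassing):
    """Illustrative continuous-filter convolution: messages are neighbor
    features scaled by a filter generated from interatomic distances."""

    def __init__(self, hidden_dim=64, num_rbf=16, cutoff=5.0):
        super().__init__(aggr="add")
        self.cutoff = cutoff
        # Centers of a Gaussian radial basis spanning [0, cutoff] angstroms.
        self.register_buffer("centers", torch.linspace(0.0, cutoff, num_rbf))
        self.filter_net = nn.Sequential(
            nn.Linear(num_rbf, hidden_dim), nn.Softplus(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, h, pos, batch=None):
        # Edges connect atoms within the cutoff radius (requires torch_cluster).
        edge_index = radius_graph(pos, r=self.cutoff, batch=batch)
        row, col = edge_index
        # Distances are invariant to rotation/translation of the structure.
        dist = (pos[row] - pos[col]).norm(dim=-1, keepdim=True)
        rbf = torch.exp(-(dist - self.centers).pow(2))  # [num_edges, num_rbf]
        return self.propagate(edge_index, h=h, rbf=rbf)

    def message(self, h_j, rbf):
        # Neighbor features (h_j) modulated by the distance-dependent filter;
        # node feature width must equal hidden_dim for the elementwise product.
        return h_j * self.filter_net(rbf)


# Toy usage: 10 atoms with random coordinates and 64-dimensional features.
conv = DistanceFilterConv()
out = conv(torch.randn(10, 64), torch.randn(10, 3))
```

Because only pairwise distances enter the filters, predictions from a stack of such layers are invariant to rigid rotations and translations of the input structure.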
As molecular structures grow larger, the number of possible input states for 3D-GNNs increases exponentially. This poses a challenge for the generalizability of these models and demands innovation to answer questions such as: Can a model trained on one region of chemical space be applied to another? How do we push the limits of today's ML frameworks and hardware architectures to train these models in hours?
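On the first question, a common pattern (explored in the pretraining paper listed below) is to reuse a pretrained encoder and fine-tune only a small readout head on the new region of chemical space. The sketch below is a hedged illustration of that pattern only; the stand-in encoder, shapes, and hyperparameters are hypothetical, not our actual training setup.

```python
import torch
from torch import nn

# Stand-in for a pretrained 3D-GNN encoder; in practice this would be
# restored from a checkpoint produced by large-scale pretraining.
encoder = nn.Sequential(nn.Linear(64, 64), nn.Softplus(), nn.Linear(64, 64))
for p in encoder.parameters():
    p.requires_grad = False  # freeze the pretrained body

# Small task-specific head, trained on data from the new region.
head = nn.Linear(64, 1)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-4)


def fine_tune_step(atom_embeddings, target):
    """One fine-tuning step: frozen encoder, trainable readout head."""
    with torch.no_grad():
        z = encoder(atom_embeddings)  # reuse pretrained representations
    pred = head(z).sum()              # sum per-atom contributions -> energy
    loss = nn.functional.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Toy call: 12 atoms with 64-dimensional embeddings, scalar energy target.
fine_tune_step(torch.randn(12, 64), torch.tensor(0.0))
```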
We are a team of researchers from national laboratories, academia, and industry interested in answering these questions. Our team hails from Pacific Northwest National Laboratory (USA), Argonne National Laboratory (USA), Graphcore (UK), IBM Research (USA), and the University of Washington (USA). Check out our recent work!
Acceleration of Graph Neural Network-Based Prediction Models in Chemistry via Co-Design Optimization on Intelligence Processing Units, Journal of Chemical Information and Modeling
Paper: https://pubs.acs.org/doi/abs/10.1021/acs.jcim.3c01312
Keywords: AI accelerators, GNNs
Reducing Down(stream)time: Pretraining Molecular GNNs using Heterogeneous AI Accelerators, NeurIPS Workshop on Machine Learning and the Physical Sciences (2022)
Paper: https://arxiv.org/abs/2211.04598
Code: https://github.com/pnnl/downstream_mol_gnn
Keywords: Transfer learning, heterogeneous accelerators
HydroNet: Benchmark Tasks for Preserving Long-range Interactions and Structural Motifs in Predictive and Generative Models for Molecular Data, NeurIPS Workshop on Machine Learning and the Physical Sciences (2020)
Paper: https://arxiv.org/abs/2012.00131
Code: https://github.com/exalearn/hydronet
Keywords: Benchmark dataset and tasks
A look inside the black box: Using graph-theoretical descriptors to interpret a Continuous-Filter Convolutional Neural Network (CF-CNN) trained on the global and local minimum energy structures of neutral water clusters, Journal of Chemical Physics (2020)
Paper: https://aip.scitation.org/doi/10.1063/5.0009933
Code: https://github.com/exalearn/hydronet
Keywords: Model interpretability, 3D GNN
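For a flavor of the graph-theoretical descriptors used in the last paper, the sketch below builds a toy hydrogen-bond network for a water cluster (nodes are molecules, directed edges run donor to acceptor) and computes a few simple descriptors. It is illustrative only; see the hydronet repository above for the actual implementation.

```python
import networkx as nx

# Toy hydrogen-bond network of a four-molecule water cluster.
G = nx.DiGraph()
G.add_edges_from([(0, 1), (1, 2), (2, 0), (2, 3)])

# Simple graph-theoretical descriptors of the kind used to probe
# what a trained model has learned about cluster structure.
descriptors = {
    "n_molecules": G.number_of_nodes(),
    "n_hbonds": G.number_of_edges(),
    "donors": dict(G.out_degree()),    # hydrogen bonds donated per molecule
    "acceptors": dict(G.in_degree()),  # hydrogen bonds accepted per molecule
    "n_triangles": sum(nx.triangles(G.to_undirected()).values()) // 3,
}
print(descriptors)
```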