The entire workflow was implemented in Python using:
PyTorch
PyTorch Geometric
ASE (Atomic Simulation Environment)
NumPy and SciPy
The GitHub repository includes:
Dataset loading and graph construction scripts
GNN model definition
BVSE pretraining pipeline
DFT fine-tuning pipeline
Scratch baseline training
Evaluation and benchmarking scripts
All experiments are reproducible: random seeds are fixed and the train/test splits are predefined.
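As an illustration of what fixed seeds and a predefined split can look like in practice, here is a minimal sketch. It is not taken from the repository: the function names `set_seed` and `split_indices`, the seed value, and the test fraction are all hypothetical, and the repository may define its splits differently (for example, by shipping index files with the datasets).

```python
import random


def set_seed(seed: int) -> None:
    """Seed Python's RNG, plus NumPy and PyTorch when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not available; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)  # seeds the generators on all devices
    except ImportError:
        pass  # PyTorch not available; nothing to seed


def split_indices(n_samples: int, test_fraction: float, seed: int):
    """Return (train, test) index lists that depend only on the arguments."""
    rng = random.Random(seed)  # private generator, independent of global state
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])
```

Calling `split_indices` twice with the same arguments yields identical index lists, so the held-out test set stays the same across runs. Using a private `random.Random` instance keeps the split independent of any other code that draws from the global generator.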
To reproduce the results:
1. Download the LiTraj datasets (nebBVSE122k and nebDFT2k).
2. Place them in the specified data directory.
3. Run BVSE pretraining.
4. Run DFT fine-tuning.
5. Run evaluation on the held-out DFT test set.
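The steps above can be sketched as a small driver that checks for the datasets and then runs the stages in order. This is only an illustration under stated assumptions: the names `missing_datasets` and `run_pipeline` are hypothetical, the one-folder-per-dataset layout is assumed, and the stage callables stand in for whatever entry points the repository actually provides.

```python
from pathlib import Path
from typing import Callable, List, Sequence, Tuple

# Dataset names come from the text; storing each one as an entry
# directly under the data directory is an assumption of this sketch.
DATASETS = ("nebBVSE122k", "nebDFT2k")


def missing_datasets(data_dir: str, names: Sequence[str] = DATASETS) -> List[str]:
    """List the dataset names that are not present under data_dir."""
    root = Path(data_dir)
    return [name for name in names if not (root / name).exists()]


def run_pipeline(data_dir: str,
                 stages: Sequence[Tuple[str, Callable[[], None]]]) -> List[str]:
    """Run the stages in the given order, failing early if data is missing."""
    missing = missing_datasets(data_dir)
    if missing:
        raise FileNotFoundError(
            f"Missing datasets in {data_dir}: {', '.join(missing)}"
        )
    completed = []
    for name, stage in stages:
        stage()  # hypothetical callable wrapping one of the repository's scripts
        completed.append(name)
    return completed
```

A caller would pass the stages as `[("pretrain", ...), ("finetune", ...), ("evaluate", ...)]`, each paired with a function that launches the corresponding step. Checking for the data first avoids starting a long pretraining run that would only fail later at the fine-tuning stage.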
Model checkpoints and evaluation scripts are included in the repository.