To evaluate the impact of transfer learning, we trained three models:
1. Baseline model: trained only on the DFT training data (80% split), with no pretraining, and evaluated on the 20% held-out DFT test set.
2. Pretrained model: trained on 122,421 BVSE barriers to learn general structural diffusion features, with no exposure to DFT barriers during this stage.
3. Fine-tuned model: initialized with the pretrained BVSE weights, fine-tuned on the 80% DFT training data, and evaluated on the same held-out DFT test set. Comparing this model against the baseline isolates the effect of pretraining.
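The three-model protocol above can be sketched with a toy stand-in. This is a minimal illustration, not the actual model or data: it assumes a linear surrogate trained by gradient descent, with a large noisy synthetic dataset playing the role of the BVSE barriers and a small accurate one playing the role of the DFT barriers. All names and dataset sizes here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared structure-to-barrier mapping; the approximate
# (BVSE-like) labels are noisier versions of the accurate (DFT-like) ones.
true_w = rng.normal(size=8)

def make_data(n, noise):
    X = rng.normal(size=(n, 8))
    y = X @ true_w + rng.normal(scale=noise, size=n)
    return X, y

X_bvse, y_bvse = make_data(5000, noise=0.5)   # large but approximate
X_dft,  y_dft  = make_data(40,   noise=0.05)  # small but accurate
X_test, y_test = make_data(200,  noise=0.05)  # held-out evaluation set

def train(X, y, w, lr=0.01, steps=200):
    """Plain gradient descent on mean squared error."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

w0 = np.zeros(8)

# Model 1 (baseline): DFT-like data only, from scratch.
w_base = train(X_dft, y_dft, w0, steps=50)

# Model 2 (pretrained): BVSE-like data only, no DFT exposure.
w_pre = train(X_bvse, y_bvse, w0, steps=200)

# Model 3 (fine-tuned): initialize from pretrained weights,
# then continue training on the DFT-like data.
w_ft = train(X_dft, y_dft, w_pre, steps=50)

def mse(w):
    return float(np.mean((X_test @ w - y_test) ** 2))

print("baseline:", mse(w_base), "fine-tuned:", mse(w_ft))
```

In this toy setting the fine-tuned model starts near the correct solution, so the same fine-tuning budget yields a lower test error than training from scratch, which is the comparison the protocol is designed to expose.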
The BVSE dataset is large but approximate. The DFT dataset is small but accurate.
The hypothesis is that pretraining on approximate physics lets the model learn structural representations that transfer to high-fidelity DFT prediction.