Vaxi-DL used two categories of features for each protein sequence: biological and physicochemical. To evaluate the model's baseline performance, three experiments were carried out:
1. 9154 physicochemical features were annotated for each of the 574 protein sequences in the original data.
2. 20 biological features were annotated for each of the 574 protein sequences in the original data.
3. A combined total of 9174 features (9154 physicochemical + 20 biological) were annotated for each of the 574 protein sequences in the original data.
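The three feature sets can be illustrated with a minimal sketch. It assumes that the combined set is formed by appending the 20 biological features to the 9154 physicochemical features of the same protein; the text gives only the totals, so the ordering is an assumption. The function name, the experiment labels, and the zero-valued placeholder data are hypothetical and are not part of Vaxi-DL.

```python
# Feature counts as reported in the text.
N_PHYSICOCHEMICAL = 9154
N_BIOLOGICAL = 20


def build_feature_matrix(physchem_rows, bio_rows, experiment):
    """Return one feature row per protein for the chosen experiment.

    The labels "physicochemical", "biological" and "combined" are
    hypothetical names for experiments 1, 2 and 3.
    """
    if len(physchem_rows) != len(bio_rows):
        raise ValueError("both feature sets must cover the same proteins")
    if experiment == "physicochemical":
        return [list(p) for p in physchem_rows]
    if experiment == "biological":
        return [list(b) for b in bio_rows]
    if experiment == "combined":
        # Assumed ordering: physicochemical features first, then biological.
        return [list(p) + list(b) for p, b in zip(physchem_rows, bio_rows)]
    raise ValueError(f"unknown experiment: {experiment!r}")


# Placeholder zero-valued features for three hypothetical proteins.
physchem = [[0.0] * N_PHYSICOCHEMICAL for _ in range(3)]
bio = [[0.0] * N_BIOLOGICAL for _ in range(3)]

combined = build_feature_matrix(physchem, bio, "combined")
print(len(combined), len(combined[0]))  # 3 9174
```

With the full data set, each of the three matrices would have 574 rows, one per protein sequence.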
In all three experiments, an independent data set of 50 protein samples was held out for testing. The remaining 524 samples were split internally in the following ratio:
a) 80% of the protein samples for training
b) 20% of the protein samples for validation
The final model was then tested on the independent data set for benchmarking.
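The partitioning described above can be sketched as follows. The text does not state whether the split was random or stratified, which seed was used, or how the non-integer 80% of 524 (419.2) was rounded; the random shuffle, the fixed seed, and rounding to the nearest integer are therefore assumptions. The function `split_dataset` is a hypothetical helper, not the authors' code.

```python
import random


def split_dataset(n_total=574, n_test=50, train_fraction=0.8, seed=0):
    """Split sample indices into training, validation and test subsets."""
    indices = list(range(n_total))
    # Assumption: a seeded random shuffle; the source does not say how
    # samples were assigned to each subset.
    random.Random(seed).shuffle(indices)

    test = indices[:n_test]        # independent hold-out set
    remaining = indices[n_test:]   # 524 samples for model development

    # Assumption: round 80% of the remaining samples to the nearest integer.
    n_train = round(train_fraction * len(remaining))
    train = remaining[:n_train]
    validation = remaining[n_train:]
    return train, validation, test


train_idx, val_idx, test_idx = split_dataset()
print(len(train_idx), len(val_idx), len(test_idx))  # 419 105 50
```

Under these assumptions the 524 development samples divide into 419 for training and 105 for validation, and the 50 test samples never overlap with either subset.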