To Download: Click Here

The double cross-validation process comprises two nested cross-validation loops, referred to as the internal and external loops. In the outer (external) loop, all data objects are divided into two subsets: a training set and a test set. The training set is used in the inner (internal) loop for model building and model selection, while the test set is used exclusively for model assessment. In the inner loop, the training set is repeatedly split into calibration and validation sets. The calibration objects are used to develop different models, whereas the validation objects are used to estimate each model's prediction error. The model with the lowest prediction error on the validation sets is selected, and the test objects in the outer loop are then used to assess its predictive performance. Splitting the training set multiple times into calibration and validation sets avoids the bias that variable selection introduces when a single training set of fixed composition is used.
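The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool's own implementation: the candidate models are one-descriptor linear regressions (the tool builds MLR models), the function names `fit_line`, `inner_loop` and `double_cv` are hypothetical, and the number of inner splits and the validation fraction are arbitrary choices.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted (a, b) model."""
    a, b = model
    return mean((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def column(rows, j):
    return [row[j] for row in rows]


def inner_loop(X_train, y_train, n_splits=20, val_fraction=0.3, seed=0):
    """Repeatedly split the training set into calibration and validation
    sets and return the descriptor with the lowest mean validation error."""
    rng = random.Random(seed)
    n = len(y_train)
    n_val = max(1, int(round(val_fraction * n)))
    n_desc = len(X_train[0])
    errors = [[] for _ in range(n_desc)]
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        val, cal = idx[:n_val], idx[n_val:]
        for j in range(n_desc):
            xj = column(X_train, j)
            # Build the model on calibration objects only.
            model = fit_line([xj[i] for i in cal], [y_train[i] for i in cal])
            # Estimate its error on the held-out validation objects.
            errors[j].append(
                mse(model, [xj[i] for i in val], [y_train[i] for i in val])
            )
    mean_errors = [mean(e) for e in errors]
    best = min(range(n_desc), key=mean_errors.__getitem__)
    return best, mean_errors


def double_cv(X_train, y_train, X_test, y_test, **kw):
    """Select a model in the inner loop, then assess it on the test set."""
    best, val_errors = inner_loop(X_train, y_train, **kw)
    # Refit the selected model on the whole training set (an assumption
    # of this sketch) before the test-set assessment.
    model = fit_line(column(X_train, best), y_train)
    test_error = mse(model, column(X_test, best), y_test)
    return best, val_errors[best], test_error


# Synthetic demo data: the response depends only on descriptor 1.
rng = random.Random(1)
X = [[rng.uniform(0, 10) for _ in range(3)] for _ in range(60)]
y = [2.0 + 1.5 * row[1] + rng.gauss(0, 0.5) for row in X]
best, val_err, test_err = double_cv(X[:40], y[:40], X[40:], y[40:])
print(f"selected descriptor {best}: "
      f"validation MSE {val_err:.3f}, test MSE {test_err:.3f}")
```

The test objects never enter `inner_loop`, so the test error is not influenced by the model selection step.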

Double Cross-Validation Tool v.2.0 (Last Updated 24 March 2017) performs the double cross-validation process described above. The user provides the training and test sets (descriptors and the response variable) in their respective input files.

To Download and Run the Program

Click on the download link above (it will take you to Google Drive), then press Ctrl+S (Windows) or Cmd+S (Mac) to save the program as a .zip file. Extract the .zip file and click on the .jar file to run the program.

Note: The program folder contains three folders: "Data", "Lib" and "Output". For convenience, the user may keep input files in the "Data" folder and save output files in the "Output" folder. The "Lib" folder contains the library files required to run the program. Check the format of the training and test set input files (.xlsx/.xls/.csv) before using the program (sample files are provided in the "Data" folder). A provisional manual is provided in the program folder.

File Format: Compound number (first column), Descriptors (subsequent columns), Activity/Property (last column)
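As an illustration of this column layout, the sketch below reads a .csv file and separates it into compound numbers, descriptors and the response. It is not part of the tool: the function name `read_dataset` is hypothetical, the presence of a header row is an assumption, and the sample values are made up. The sample files in the "Data" folder remain the authoritative reference for the expected format.

```python
import csv
import io


def read_dataset(handle, has_header=True):
    """Split rows into compound IDs, descriptor names, descriptors X and
    response y, assuming: ID first, descriptors next, response last."""
    rows = [row for row in csv.reader(handle) if row]  # skip blank lines
    header = rows.pop(0) if has_header else None
    if not rows:
        raise ValueError("no data rows found")
    width = len(rows[0])
    if width < 3:
        raise ValueError("need an ID, at least one descriptor, and a response")
    ids, X, y = [], [], []
    for n, row in enumerate(rows, start=1):
        if len(row) != width:
            raise ValueError(
                f"data row {n}: expected {width} columns, got {len(row)}"
            )
        ids.append(row[0])
        X.append([float(v) for v in row[1:-1]])  # descriptor columns
        y.append(float(row[-1]))                 # activity/property column
    names = header[1:-1] if header else None
    return ids, names, X, y


# Made-up example content in the assumed layout.
sample = io.StringIO(
    "No,LogP,MW,pIC50\n"
    "1,2.31,180.2,5.4\n"
    "2,3.05,212.7,6.1\n"
)
ids, names, X, y = read_dataset(sample)
print(ids, names, X, y)
```

A non-numeric descriptor or response value raises `ValueError` from `float`, which flags a malformed file before it is passed to the program.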

References for Double Cross-Validation

1. Roy, K. and Ambure, P., 2016. The “double cross-validation” software tool for MLR QSAR model development. Chemom. Intell. Lab. Syst., http://dx.doi.org/10.1016/j.chemolab.2016.10.009

2. Baumann, D. and Baumann, K., 2014. Reliable estimation of prediction errors for QSAR models under model uncertainty using double cross-validation. J. Cheminformatics, 6(1), p.47.