The rm2 metrics in validation of QSAR models


QSAR and its importance


Quantitative structure-activity relationship (QSAR) modeling is a computational tool that correlates the biological activity, toxicity or property of a molecule with its structural features [1, 2]. In a QSAR study, variations in the biological activity/toxicity/property within a congeneric series of compounds are correlated with changes in measured or computed features of the molecules, referred to as descriptors. These descriptors measure properties of the molecules, broadly including their hydrophobic, steric and electronic features in addition to various structural patterns. QSAR models developed from a series of molecules with a definite response help in screening large databases of new molecules for that specific response [3]. This reduces the large expenditure of money and time required for preliminary experimental studies. The QSAR technique thus provides an alternative pathway for the design and development of new molecules with an improved or desired response pattern. The pharmacophoric features and descriptors obtained from the developed QSAR models may also be utilized for virtual screening [4] of large libraries of diverse compounds for a definite response parameter. In addition, identifying the prime features that impart improved activity to the molecules under study facilitates the in silico design of new molecules with enhanced potency. Thus, a focused library [4] may be developed by compiling the newly designed molecules with a specific response.


Validation of a QSAR model


Validation is a crucial aspect of any QSAR modeling. It is the process by which the reliability and relevance of a procedure are established for a specific purpose [5]. Formal validation is often one of the most overlooked steps in model development. For many in the QSAR community, the validation of a model is little more than an assessment of statistical fit and, occasionally, predictivity using cross-validation techniques. However, it is now accepted that validation is a more holistic process that includes assessment of issues such as data quality, applicability of the model and mechanistic interpretability in addition to statistical assessment [6]. Any QSAR model needs to be properly validated before it is used to interpret and predict the biological responses of non-investigated compounds. There are a number of ways to express the performance of a model. The conventional approach adopted in QSAR analysis, based on multiple linear regression, is to consider R2, adjusted R2 or Ra2 (the explained variance), and s (the standard error of estimate) [7]. However, acceptable values of these statistical parameters are not always sufficient to judge model predictivity, and alternative methods are employed to assess the predictive ability of the developed QSAR models. To determine their predictive quality, the models must be further validated using various validation techniques. Both internal and external validations are performed to assess the reliability and the predictive potential of the developed models. The conventional validation strategies include the calculation of the cross-validated squared correlation coefficient (Q2) for internal validation [8] and the predictive squared correlation coefficient (R2pred) for external validation [9], both with a threshold value of 0.5.
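The conventional fit statistics named above can be computed as in the following minimal sketch. It assumes an ordinary least squares fit with an intercept, NumPy as the numerical library, and a hypothetical helper name fit_statistics; none of these come from the original text.

```python
import numpy as np

def fit_statistics(X, y):
    """Return (R2, adjusted R2, s) for a multiple linear regression of y on X.

    X is an (n, p) descriptor matrix and y an (n,) response vector.
    Illustrative helper only; the name and interface are assumptions.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Add a column of ones so the model includes an intercept.
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    residuals = y - A @ coef
    rss = float(residuals @ residuals)          # residual sum of squares
    tss = float(np.sum((y - y.mean()) ** 2))    # total sum of squares

    dof = n - p - 1                             # residual degrees of freedom
    r2 = 1.0 - rss / tss
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / dof
    s = float(np.sqrt(rss / dof))               # standard error of estimate
    return r2, r2_adj, s
```

These quantities describe only how well the model fits the compounds it was built on, which is why the text goes on to internal and external validation.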


The internal validation [8] procedure involves the leave-one-out (LOO) or leave-many-out (LMO) cross-validation technique followed by the calculation of the cross-validated squared correlation coefficient, LOO-Q2 or LMO-Q2. These techniques involve removing one compound or a group of compounds from the training set and then developing the QSAR model on the reduced dataset. The model built with the remaining molecules is used to predict the response of the deleted compound or compounds. This cycle is repeated until every molecule of the dataset has been deleted once. The cross-validated squared correlation coefficient (LOO-Q2) is calculated according to the following formula (Eq. 1).


 Q2 = 1 - [sum((Yobs_train - Ypred_train)^2)] / [sum((Yobs_train - Average_Yobs_train)^2)]          (1)


In the above equation, Yobs(train) is the observed response of the training set molecules and Ypred(train) is their predicted response based on the LOO/LMO technique, while the average term in the denominator is the mean response of the training set compounds. A problem with LOO cross-validation is that a small change in the data can cause a huge variation in the type of QSAR model selected. Thus, a QSAR or QSPR (quantitative structure-property relationship) model is chiefly valued in terms of its predictivity, that is, its ability to predict the response parameter for compounds not used in developing the correlation, i.e. molecules not included in the training set. Such a procedure for checking model predictivity based on molecules not included in the training set is referred to as external validation. The developed QSAR model is used to predict the response of the test set molecules, followed by the estimation of the external predictive parameter (R2pred) (Eq. 2) [9], which reflects the degree of correlation between the observed and predicted activity data for the test set molecules and thereby indicates the predictive ability of the model.


 R2pred = 1 - [sum((Yobs_test - Ypred_test)^2)] / [sum((Yobs_test - Average_Yobs_train)^2)]         (2)



In Eq. (2), Yobs(test) and Ypred(test) are the observed and predicted response data respectively for the test set compounds.
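As a rough illustration of Eqs. (1) and (2), the sketch below computes LOO-Q2 and R2pred for a multiple linear regression model. The function names (q2_loo, r2_pred), the private helpers and the use of NumPy least squares are assumptions made for this example; they are not taken from the cited papers or from any particular QSAR package.

```python
import numpy as np

def _fit(X, y):
    """Least squares coefficients (intercept first) for y on X."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def _predict(coef, X):
    """Predicted responses for the rows of X."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def q2_loo(X_train, y_train):
    """Leave-one-out Q2 following Eq. (1)."""
    X = np.asarray(X_train, dtype=float)
    y = np.asarray(y_train, dtype=float)
    n = len(y)
    y_loo = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i            # drop compound i
        coef = _fit(X[keep], y[keep])       # refit on the reduced set
        y_loo[i] = _predict(coef, X[i:i + 1])[0]
    press = np.sum((y - y_loo) ** 2)        # numerator of Eq. (1)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def r2_pred(y_obs_test, y_pred_test, y_obs_train):
    """External R2pred following Eq. (2); the mean is the TRAINING set mean."""
    obs = np.asarray(y_obs_test, dtype=float)
    pred = np.asarray(y_pred_test, dtype=float)
    train_mean = np.mean(np.asarray(y_obs_train, dtype=float))
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - train_mean) ** 2)
```

Note that r2_pred takes the training set responses as a third argument because the denominator of Eq. (2) is centered on the training set mean, not on the test set mean.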


From the above equations, it can be noted that the values of Q2 and R2pred depend on the mean response value of the training set compounds and on its distance from each of the response values of the corresponding training and test set compounds, respectively. As the denominator term in both equations, [sum((Yobs - Average_Yobs_train)^2)], increases, the values of the internal and external predictive parameters increase, apparently suggesting improved predictive ability of the developed QSAR model. Thus, a dataset comprising molecules with a wide response range may show acceptable values for the two parameters even though large differences exist between the predicted and corresponding observed response values for the training and test set molecules. To better indicate both the internal and external predictive capacities of a QSAR model and to ascertain the proximity of the predicted and observed response data, the rm2 metrics (average rm2 and delta rm2) developed by Roy et al. [10, 11] are calculated.
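The range effect described above can be illustrated with a small numerical sketch. The response values below are invented purely for illustration, and r2_pred is a hypothetical helper that applies Eq. (2) with a given training set mean. Both toy test sets carry exactly the same absolute prediction errors; only the spread of the observed responses differs.

```python
import numpy as np

def r2_pred(y_obs, y_pred, train_mean):
    """Eq. (2) with an explicitly supplied training set mean."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 1.0 - np.sum((y_obs - y_pred) ** 2) / np.sum((y_obs - train_mean) ** 2)

# Identical prediction errors (0.5 units each) in both cases.
errors = np.array([0.5, -0.5, 0.5, -0.5])

# Made-up observed responses; the training set mean is assumed to be 5.0.
narrow_obs = np.array([4.0, 4.5, 5.5, 6.0])   # narrow response range
wide_obs = np.array([2.0, 4.0, 6.0, 8.0])     # wide response range

r2_narrow = r2_pred(narrow_obs, narrow_obs + errors, train_mean=5.0)
r2_wide = r2_pred(wide_obs, wide_obs + errors, train_mean=5.0)

print(f"{r2_narrow:.2f}")  # 0.60
print(f"{r2_wide:.2f}")    # 0.95
```

The predictions are equally far from the observations in both cases, yet the wider range pushes R2pred from 0.60 to 0.95, which is the behavior that motivates a metric based on the closeness of observed and predicted values.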


           Average rm2 = (rm2 + r'm2)/2                                                                 (3)


           Delta rm2 = abs(rm2 - r'm2)                                                                  (4)


Here, rm2 = r2 x [1 - sqrt(r2 - r02)] and r'm2 = r2 x [1 - sqrt(r2 - r'02)]. The squared correlation coefficients between the observed and predicted values of the test set compounds (or the LOO-predicted values for the training set compounds) are calculated with intercept (r2) and without intercept (r02) to determine rm2. Interchanging the axes gives the value of r'02, from which the r'm2 metric is calculated. In the presence of an intercept, the correlation between the observed (y) and predicted (x) values is the same as that between the predicted (y) and observed (x) values for the corresponding least squares regression lines. However, this is not true when the intercept is set to zero. Thus, the value of r'm2 differs from that of rm2, and the difference between these two metrics (delta rm2) may also be used as a measure of the goodness of predictions. Moreover, as either rm2 or r'm2 alone may heavily penalize the quality of the model in terms of predictions, the average of the two (average rm2) is calculated. The rm2 metrics for the training set [average rm2 (LOO) and delta rm2 (LOO)] determine the reliability of the developed model, while those for the test set [average rm2 (test) and delta rm2 (test)] estimate the closeness between the predicted and the corresponding observed response data. The overall performance of the QSAR models may also be checked using overall validation parameters such as average rm2 (overall) and delta rm2 (overall). In addition to the traditional parameters involved in judging the predictive potential of a QSAR model, the rm2 metrics have been extensively used by Roy and co-workers [12-16] as well as other groups of researchers [17-21] to assess the predictive power of QSAR models. The rm2 metrics have been implemented in the CORAL freeware available at http://www.insilico.eu/coral.
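The calculation described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' reference code: the zero-intercept squared correlation coefficient is computed here as 1 - sum((y - k*x)^2) / sum((y - mean(y))^2) with slope k = sum(x*y) / sum(x^2), the first metric is taken with observed values on the y axis and predicted values on the x axis, and the difference under the square root is clamped at zero to guard against rounding. The names rm2_metrics and the "_rev" suffix (for the axis-interchanged quantities) are hypothetical.

```python
import numpy as np

def _r2_through_origin(y, x):
    """Squared correlation coefficient for y regressed on x with zero intercept."""
    k = np.sum(x * y) / np.sum(x * x)                      # slope through origin
    return 1.0 - np.sum((y - k * x) ** 2) / np.sum((y - y.mean()) ** 2)

def rm2_metrics(y_obs, y_pred):
    """Return rm2, its axis-interchanged counterpart, their average and difference."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # r2 with intercept is symmetric in the two variables.
    r2 = np.corrcoef(y_obs, y_pred)[0, 1] ** 2

    # Zero-intercept values depend on which variable is on which axis.
    r02 = _r2_through_origin(y_obs, y_pred)       # observed (y) vs predicted (x)
    r02_rev = _r2_through_origin(y_pred, y_obs)   # axes interchanged

    # max(..., 0.0) only protects against tiny negative rounding errors.
    rm2 = r2 * (1.0 - np.sqrt(max(r2 - r02, 0.0)))
    rm2_rev = r2 * (1.0 - np.sqrt(max(r2 - r02_rev, 0.0)))

    return {
        "rm2": rm2,
        "rm2_rev": rm2_rev,
        "average_rm2": (rm2 + rm2_rev) / 2.0,     # Eq. (3)
        "delta_rm2": abs(rm2 - rm2_rev),          # Eq. (4)
    }
```

For example, predictions that are perfectly correlated with the observations but shifted by a constant (observed 1, 2, 3, 4 against predicted 2, 3, 4, 5) give r2 = 1 while both rm2 values fall well below 1, which shows how the metrics penalize systematic deviation that r2 alone does not detect.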
QSAR models bearing acceptable values for all the traditional parameters can be finally assessed based on the rm2 metrics. Those with average rm2 values above the threshold of 0.5 and with a delta rm2 value less than 0.2 are considered to be predictive and reliable ones. A web service for computation of rm2 is now available at .
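The acceptance rule stated above can be written as a short check. The function name is_acceptable and its keyword arguments are hypothetical; the default thresholds are the 0.5 and 0.2 values given in the text.

```python
def is_acceptable(average_rm2, delta_rm2, avg_threshold=0.5, delta_threshold=0.2):
    """True when average rm2 exceeds 0.5 and delta rm2 is below 0.2."""
    return average_rm2 > avg_threshold and delta_rm2 < delta_threshold

# Example: a model with average rm2 = 0.68 and delta rm2 = 0.09 passes,
# while one with average rm2 = 0.70 but delta rm2 = 0.25 does not.
passes = is_acceptable(0.68, 0.09)
fails = is_acceptable(0.70, 0.25)
```

The check is meant to be applied after the traditional parameters (Q2 and R2pred) have already been found acceptable, as the text describes.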




  1. Helguera, A.M.; Combes, R.D.; Gonzalez, M.P.; Cordeiro, M.N. Applications of 2D descriptors in drug design: a DRAGON tale. Curr. Top. Med. Chem., 2008, 8, 1628-1655.
  2. Todeschini, R.; Consonni, V. Handbook of Molecular Descriptors. Wiley-VCH: Weinheim, 2000.
  3. Hoffman, B.T.; Kopajtic, T.; Katz, J.L.; Newman, A.H. 2D QSAR modeling and preliminary database searching for dopamine transporter inhibitors using genetic algorithm variable selection of Molconn Z descriptors. J. Med. Chem., 2000, 43, 4151-4159.
  4. Tikhonova, I.G.; Baskin, I.I.; Palyulin, V.A.; Zefirov, N.S. Virtual screening of organic molecule databases. Design of focused libraries of potential ligands of NMDA and AMPA receptors. Russ. Chem. B+, 2004, 53, 1335-1344.
  5. Balls, M.; Blaauboer, B.J.; Fentem, J.H. et al. Practical aspects of the validation of toxicity test procedures–the report and recommendations of ECVAM workshop 5. ATLA, 1995, 23, 129-147.
  6. Aptula, A.O.; Jeliazkova, N.G.; Schultz, T.W.; Cronin, M.T.D. The better predictive model: high q2 for the training set or low root mean square error of prediction for the test set? QSAR Comb. Sci., 2005, 24, 385-396.
  7. Snedecor, G.W.; Cochran, W.G. Statistical Methods. Oxford & IBH: New Delhi, 1967.
  8. Wold, S. Validation of QSARs. Quant. Struct. Act. Relat., 1991, 10, 191-193.
  9. Golbraikh, A.; Tropsha, A. Beware of q2! J. Mol. Graph. Mod., 2002, 20, 269-276.
  10. Roy, K.; Mitra, I.; Kar, S.; Ojha, P.; Das, R.N.; Kabir, H. Comparative studies on some metrics for external validation of QSPR models. J. Chem. Inf. Model., 2012, 52, 396-408.
  11. Ojha, P.K.; Mitra, I.; Das, R.N.; Roy, K. Further exploring rm2 metrics for validation of QSPR models dataset. Chemom. Intell. Lab. Syst., 2011, 107, 194-205.
  12. Ojha, P.K.; Roy, K. Comparative QSARs for antimalarial endochins: Importance of descriptor thinning and noise reduction prior to feature selection. Chemom. Intell. Lab. Syst., 2011, 109, 146-161.
  13. Roy, K.; Roy, P.P. Exploring QSAR and QAAR for inhibitors of cytochrome P450 2A6 and 2A5 enzymes using GFA and G/PLS techniques. Eur. J. Med. Chem., 2009, 44, 1941-1951.
  14. Kar, S.; Harding, A.P.; Roy, K.; Popelier, P.L.A. QSAR with quantum topological molecular similarity indices: toxicity of aromatic aldehydes to Tetrahymena pyriformis. SAR QSAR Environ. Res., 2010, 21, 149-168.
  15. Roy, K.; Das, R.N. QSTR with extended topochemical atom (ETA) indices. 14. QSAR modeling of toxicity of aromatic aldehydes to Tetrahymena pyriformis. J. Hazard. Mater., 2010, 183, 913-922.
  16. Mitra, I.; Saha, A.; Roy, K. Chemometric modeling of free radical scavenging activity of flavone derivatives. Eur. J. Med. Chem., 2010, 45, 5071-5079.
  17. Dashtbozorgi, Z.; Golmohammadi, H. Prediction of air to liver partition coefficient for volatile organic compounds using QSAR approaches. Eur. J. Med. Chem., 2010, 45, 2182-2190.
  18. Arkan, E.; Shahlaei, M.; Pourhossein, A.; Fakhri, K.; Fassihi, A. Validated QSAR analysis of some diaryl substituted pyrazoles as CCR2 inhibitors by various linear and nonlinear multivariate chemometrics methods. Eur. J. Med. Chem., 2010, 45, 3394-3406.
  19. Toropov, A.A.; Toropova, A.P.; Benfenati, E. QSPR modeling for enthalpies of formation of organometallic compounds by means of SMILES-based optimal descriptors. Chem. Phys. Lett., 2008, 461, 343-347.
  20. Lagos, C.F.; Caballero, J.; Gonzalez-Nilo, F.D.; David Pessoa-Mahana, C.; Perez-Acle, T. Docking and Quantitative Structure–Activity Relationship Studies for the Bisphenylbenzimidazole Family of Non-Nucleoside Inhibitors of HIV-1 Reverse Transcriptase. Chem. Biol. Drug. Des., 2008, 72, 360-369.
  21. Goodarzi, M.; Puggina de Freitas, M. MIA-QSAR modelling of activities of a series of AZT analogues: bi- and multilinear PLS regression. Mol. Simul., 2010, 36, 267-272.





Updated on September 12, 2012