Coefficient of Determination

The value r², called the coefficient of determination, is the square of the correlation coefficient r. It is usually stated as a percent rather than as a decimal, and it has the following interpretation in the context of the data:

  • r², when expressed as a percent, represents the percent of the variation in the dependent (predicted) variable y that can be explained by variation in the independent (explanatory) variable x using the regression (best-fit) line.

  • 1 – r², when expressed as a percent, represents the percent of the variation in y that is NOT explained by variation in x using the regression line. This appears as the scatter of the observed data points around, rather than on, the regression line.
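The two interpretations above can be checked numerically. The following is a minimal sketch, not a definitive implementation: the five (x, y) pairs are made up purely for illustration, and the names sxx, syy, sxy, and sse are assumptions introduced here. It fits the least-squares line by hand and shows that r² equals the fraction of the variation in y that the line accounts for.

```python
# Hypothetical (x, y) pairs, made up for illustration only; not data from the text.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squared deviations and cross-products about the means.
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)  # total variation in y
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares (best-fit) line: y_hat = a + b * x
b = sxy / sxx
a = mean_y - b * mean_x

# Variation in y left over after using the line (scatter around the line).
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

r = sxy / (sxx * syy) ** 0.5         # correlation coefficient
r_squared = r ** 2                   # coefficient of determination
explained_fraction = 1 - sse / syy   # share of variation explained by the line

print(f"r^2                = {r_squared:.4f}")
print(f"explained fraction = {explained_fraction:.4f}")
print(f"unexplained (1-r^2) = {1 - r_squared:.4f}")
```

For this made-up data both r² and the explained fraction come out to 0.6, so about 60% of the variation in y is explained by the line and about 40% is not.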

For example, suppose you want to predict final exam grades (y) using third exam grades (x) as the predictor, and suppose the correlation between the third exam grade and the final exam grade is r = 0.6631. The coefficient of determination is then r² = (0.6631)² ≈ 0.4397.

Interpretation of r² in the context of this example: Approximately 44% of the variation in the final exam grades (0.4397 is approximately 0.44) can be explained by the variation in the grades on the third exam, using the best-fit line. Therefore, approximately 56% of the variation (1 – 0.44 = 0.56) in the final exam grades can NOT be explained by the variation in the grades on the third exam, using the best-fit line.
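The arithmetic in this example can be reproduced with a short sketch. Only the value r = 0.6631 comes from the text; the variable names are chosen here for illustration.

```python
# Correlation between third exam grade and final exam grade, as given in the example.
r = 0.6631

# Coefficient of determination: the square of the correlation coefficient.
r_squared = r ** 2

# Express the explained and unexplained shares as whole-number percents.
explained_pct = round(r_squared * 100)
unexplained_pct = round((1 - r_squared) * 100)

print(f"r^2 = {r_squared:.4f}")                    # r^2 = 0.4397
print(f"explained by x:     {explained_pct}%")     # 44%
print(f"not explained by x: {unexplained_pct}%")   # 56%
```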


References

  1. https://courses.lumenlearning.com/introstats1/chapter/the-regression-equation/
