A dataset y = [y1, y2, ... yn]
A prediction set f = [f1, f2, ... fn]
Now we want to check how well the prediction f fits the original dataset y.
First, define the residuals e = y - f,
so [e1, e2, ... en] = [y1-f1, y2-f2, ... yn-fn].
The residual sum of squares SUM(e^2) = e1^2 + e2^2 + ... + en^2 measures how far the predictions f fall from the data y.
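As a minimal sketch in Python (the values of y and f below are illustrative, not from the text above):

```python
y = [3.0, 5.0, 7.0, 9.0]   # observed data (illustrative)
f = [2.8, 5.3, 6.9, 9.4]   # predictions (illustrative)

e = [yi - fi for yi, fi in zip(y, f)]   # residuals e = y - f
ss_res = sum(ei ** 2 for ei in e)       # SUM(e^2)
print(ss_res)                           # about 0.3 for these values
```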
Second, define the deviations from the mean, v = y - mean(y),
so [v1, v2, ... vn] = [y1-y_mean, y2-y_mean, ... yn-y_mean].
The total sum of squares SUM(v^2) = v1^2 + v2^2 + ... + vn^2 measures how much y varies around its own mean.
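Continuing the sketch with the same illustrative y:

```python
y = [3.0, 5.0, 7.0, 9.0]                # observed data (illustrative)

y_mean = sum(y) / len(y)                # mean(y) = 6.0 here
v = [yi - y_mean for yi in y]           # deviations v = y - mean(y)
ss_tot = sum(vi ** 2 for vi in v)       # SUM(v^2)
print(ss_tot)                           # 20.0 for these values
```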
R2 = 1 - SUM(e^2) / SUM(v^2)
It compares the squared errors (y - f) to the squared deviations from the mean (y - mean(y)).
If the prediction f fits y well, the errors are small, SUM(e^2) / SUM(v^2) is close to 0, and R2 is close to 1.
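Putting the two sums together (r2_score is just an illustrative name for a helper that applies the formula above, not a library function):

```python
def r2_score(y, f):
    # R2 = 1 - SUM(e^2) / SUM(v^2)
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, f))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

y = [3.0, 5.0, 7.0, 9.0]                # observed data (illustrative)
f = [2.9, 5.1, 7.0, 9.2]                # a close fit (illustrative)
print(r2_score(y, f))                   # ~0.997, close to 1
```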
If the prediction f carries no real information, the best it can do is guess the mean of y, so the errors (y - f)
are comparable to the deviations (y - mean(y)). Then SUM(e^2) / SUM(v^2) is close to 1, and R2 is close to 0.
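This baseline is easy to check: predict the mean of y for every point, and the residuals equal the deviations exactly (illustrative values again):

```python
y = [3.0, 5.0, 7.0, 9.0]                # observed data (illustrative)
y_mean = sum(y) / len(y)
f = [y_mean] * len(y)                   # constant prediction: always mean(y)

ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, f))
ss_tot = sum((yi - y_mean) ** 2 for yi in y)
print(1 - ss_res / ss_tot)              # exactly 0.0, since e equals v
```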
If someone makes a really bad prediction, so bad that it is even worse than guessing the mean, the errors are
larger than the deviations of y (e.g. many nonsensical, extreme values), so SUM(e^2) / SUM(v^2) exceeds 1, and R2 becomes negative.
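For example, predictions far outside the range of y (illustrative values) push the ratio well above 1:

```python
y = [3.0, 5.0, 7.0, 9.0]                # observed data (illustrative)
f = [100.0, -50.0, 80.0, -20.0]         # nonsense predictions (illustrative)

y_mean = sum(y) / len(y)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, f))
ss_tot = sum((yi - y_mean) ** 2 for yi in y)
print(1 - ss_res / ss_tot)              # about -929, far below 0
```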