Linear Regression - Least Squares Estimates
Assume there are n points (x1, y1), (x2, y2), ..., (xn, yn)
The goal is to find a linear function Y = b + kX (i.e., to find the intercept b and the slope k)
such that SUM((yi - (b+kxi))^2), i=1 to n, is minimized; here b+kxi is the estimated value of Y at xi and yi is the actual value of Y
Let RSS = SUM((yi - (b+kxi))^2); when RSS is minimized, the partial derivatives with respect to b and k are zero
=> d(RSS)/d(b) = SUM(-2(yi-(b+kxi))) = 0
=> SUM(yi - (b+kxi)) = 0, i=1 to n
=> SUM(yi) - kSUM(xi) = SUM(b) = nb, i=1 to n
=> SUM(yi)/n - kSUM(xi)/n = b
=> y' - kx' = b
here y' is the average of the yi and x' is the average of the xi, which are easy to compute; note this also means the fitted line always passes through the centroid (x', y')
Substitute b = y' - kx' back in: RSS = SUM((yi - (y'-kx') - kxi)^2), i=1 to n
=> RSS = SUM((yi - y' - k(xi-x'))^2)
=> d(RSS)/d(k) = SUM(-2(yi - y' - k(xi-x'))(xi-x')) = 0
=> k = SUM((yi-y')(xi-x')) / SUM((xi-x')^2)
So b and k are solved!
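The two closed-form formulas above can be sketched in a few lines of Python. The data below is made up purely for illustration; the sanity check at the end confirms that nudging b or k away from the computed values never decreases RSS, as the derivation predicts.

```python
def fit_least_squares(xs, ys):
    """Closed-form estimates: k = SUM((yi-y')(xi-x')) / SUM((xi-x')^2), b = y' - k*x'."""
    n = len(xs)
    x_bar = sum(xs) / n  # x', the average of the xi
    y_bar = sum(ys) / n  # y', the average of the yi
    k = (sum((y - y_bar) * (x - x_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    b = y_bar - k * x_bar  # the line passes through the centroid (x', y')
    return b, k

def rss(xs, ys, b, k):
    """Residual sum of squares for the line Y = b + kX."""
    return sum((y - (b + k * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical sample data, roughly following y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
b, k = fit_least_squares(xs, ys)

# Minimality check: RSS is a convex quadratic in (b, k), so any
# perturbation of the solution should give an equal or larger RSS
best = rss(xs, ys, b, k)
assert all(rss(xs, ys, b + db, k + dk) >= best
           for db in (-0.1, 0.0, 0.1) for dk in (-0.1, 0.0, 0.1))
print(b, k)
```

For this data the formulas give k = 1.99 and b = 1.04, close to the true slope 2 and intercept 1 used to generate it.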