Regression is a family of methods that try to find a relation between the features of a sample and its label, where the label takes continuous values. There are many types of regression, but the most popular families include:
Linear models
Generalized linear models
Nonlinear models
The linear model tries to explain the label values as a linear combination of the feature values. A generalized linear model consists of two parts: first, a linear combination of the features, which is the main predictor; second, a link function that connects this linear part to the label nonlinearly. The first method is a special case of the second. The third family, which generalizes both of the others, simply assumes a general nonlinear relation between the features and the label.
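As a concrete contrast, here is a minimal sketch of the three families, assuming scikit-learn and NumPy are available; the synthetic dataset and the particular model choices (PoissonRegressor as the GLM, with its default log link, and RandomForestRegressor as the nonlinear model) are illustrative assumptions, not prescriptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, PoissonRegressor
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 2, size=(200, 1))
# Ground truth is nonlinear in the feature, plus noise.
y = np.exp(1.0 + 0.5 * X[:, 0]) + rng.normal(0.0, 0.3, size=200)

# Linear model: label modeled as a linear combination of the features.
linear = LinearRegression().fit(X, y)

# Generalized linear model: a linear predictor passed through a log link.
glm = PoissonRegressor(alpha=0.0).fit(X, y)

# Nonlinear model: no linearity assumption at all.
nonlinear = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

for name, model in [("linear", linear), ("GLM", glm), ("nonlinear", nonlinear)]:
    print(name, model.predict(X[:3]))
```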
The main assumption in regression is that the label value consists of a part that can be explained by the features and a part that cannot be explained, which is treated as noise.
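In symbols, this decomposition is usually written as follows (a standard formulation added here for clarity):

```latex
y = f(x) + \varepsilon, \qquad \mathbb{E}[\varepsilon] = 0,
```

where f(x) is the part explained by the features and ε is the noise term.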
One important question in regression, apart from the measures of major concern across ML, is the goodness of fit of the model. Here we briefly discuss two such measures.
The first is the coefficient of determination, R-squared. In simple linear regression it equals the square of the correlation coefficient between the two variables; more generally, it is the proportion of the variance in the dependent variable that is predictable from the independent variable(s). R-squared is usually used to measure the model's capability for prediction: it shows how much of the label value can be explained by the relation found in the regression. In other words, it is the proportion of the explained variance in the total variance of the model.
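A minimal sketch of this definition, assuming NumPy; scikit-learn's r2_score computes the same quantity:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Proportion of label variance explained by the predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)           # unexplained part
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.1, 7.2, 8.9])
print(r_squared(y_true, y_pred))  # close to 1: most variance explained
```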
R-squared is usually used for linear models and is not very meaningful for nonlinear ones. However, a useful extension, which also applies properly to nonlinear models, is the following: in the linear model, R-squared happens to equal the squared correlation between the real label values and those predicted by the model, so one can extend the definition in the same way. This means we can always measure the squared correlation between the real labels and the predicted ones. However, one must note that the correlation between two variables and the correlation between linear transformations of them are equal. So we have to be careful: a large R-squared of this kind does not rule out a systematic error, and we should always check whether the predictions need a linear adjustment.
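The following sketch, assuming NumPy, shows both the correlation-based extension and the caveat: linearly distorted predictions score exactly as well, even though they are far off.

```python
import numpy as np

def squared_correlation(y_true, y_pred):
    """Correlation-based extension of R-squared to arbitrary models."""
    return np.corrcoef(y_true, y_pred)[0, 1] ** 2

rng = np.random.default_rng(0)
y_true = rng.normal(10.0, 2.0, size=500)
y_good = y_true + rng.normal(0.0, 0.5, size=500)  # well-calibrated predictions
y_scaled = 3.0 * y_good - 20.0                    # linearly distorted predictions

print(squared_correlation(y_true, y_good))    # high, and predictions are usable
print(squared_correlation(y_true, y_scaled))  # identical, yet predictions are far off
```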
There is another measure, called the standard error of the regression, denoted by S. It is nothing but the square root of the estimated residual variance of the model, and it represents the average distance that the observed values fall from the regression curve. This measure has two good properties: first, it is expressed in the units of the problem (unlike R-squared); second, roughly 95 percent of the observations lie within plus or minus two standard errors of the regression curve.
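A sketch of computing S for a simple linear fit, assuming NumPy; here p counts the fitted parameters (slope and intercept), a conventional degrees-of-freedom correction:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(0.0, 1.5, size=200)

slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

n, p = len(y), 2
S = np.sqrt(np.sum(residuals ** 2) / (n - p))  # same units as the label
within = np.mean(np.abs(residuals) < 2 * S)    # coverage of the +/- 2S band

print(f"S = {S:.3f}, fraction within 2S: {within:.2f}")  # roughly 0.95
```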
The best practice is to use a combination of the two measures when analyzing a regression. For linear models, R-squared alone is often sufficient, but for nonlinear regression using both is very helpful.