Definition: we want to learn a target function f that maps input X to output Y, with irreducible error e: $$Y = f(X) + e$$
Linear: simplify the mapping to a known linear-combination form and learn its coefficients
Non-linear: free to learn any functional form
Bias-Variance trade-off
$$\text{Error} = \text{Bias}^2 + \text{Variance} + \text{Irreducible Error}$$
Bias refers to the error between the average model prediction and the ground truth.
High bias ==> model is too simple, underfitting
Variance refers to the sensitivity of the model to changes in the training data (the average variability of the model's predictions across training sets).
High variance ==> model is too complicated, overfitting
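A minimal sketch of the trade-off, under assumed settings (a sine ground truth, Gaussian noise, and polynomial models of degree 1 vs. 10): refit each model on many noisy resamples and measure bias² and variance of its prediction at a single test point.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30)
f_true = np.sin(2 * np.pi * x)   # ground-truth f(X) (an assumed example function)
x0, f0 = 0.25, np.sin(2 * np.pi * 0.25)   # evaluate predictions at one point

def predictions(degree, n_trials=200):
    """Predict f(x0) with a polynomial of the given degree, over many noisy datasets."""
    preds = []
    for _ in range(n_trials):
        y = f_true + rng.normal(0, 0.3, size=x.size)   # Y = f(X) + e
        coef = np.polyfit(x, y, degree)
        preds.append(np.polyval(coef, x0))
    return np.array(preds)

results = {}
for deg in (1, 10):
    p = predictions(deg)
    bias_sq = (p.mean() - f0) ** 2   # (average prediction - ground truth)^2
    variance = p.var()               # spread of predictions across training sets
    results[deg] = (bias_sq, variance)
    print(f"degree {deg:2d}: bias^2={bias_sq:.4f}  variance={variance:.4f}")
```

The simple (degree-1) model shows high bias and low variance; the complicated (degree-10) model shows the reverse, matching the two bullets above.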
Accuracy = Correct Predictions / Total Predictions
Accuracy alone does not always give correct insight into the trained model, e.g. on an imbalanced dataset a model that always predicts the majority class still scores high accuracy.
Precision: exactness of model = TP/(TP+FP)
Recall : completeness of the model = TP/(TP+FN)
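The three metrics above can be computed directly from the confusion-matrix counts; a small sketch with made-up labels:

```python
import numpy as np

# Toy labels (an assumed example): 4 positives, 6 negatives.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 0, 0, 1, 0, 0, 0, 0, 0])

tp = int(np.sum((y_pred == 1) & (y_true == 1)))   # true positives
fp = int(np.sum((y_pred == 1) & (y_true == 0)))   # false positives
fn = int(np.sum((y_pred == 0) & (y_true == 1)))   # false negatives

accuracy = np.mean(y_pred == y_true)   # correct predictions / total predictions
precision = tp / (tp + fp)   # exactness: of the predicted positives, how many are right
recall = tp / (tp + fn)      # completeness: of the actual positives, how many were found
print(accuracy, precision, recall)
```

Here precision and recall disagree with accuracy's rosy picture, which is exactly why they are reported separately.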
Data replication: replicate the minority-class samples until the class sizes are comparable
Synthetic data: create new samples by cropping, adding noise, rotating, etc.
Modified loss: modify the loss to reflect greater error when misclassifying the smaller sample set
Change algorithm: increase the complexity of the model
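The modified-loss idea can be sketched as a class-weighted binary cross-entropy; the inverse-frequency weighting below is one common choice, not the only one, and the labels and predicted probabilities are made up for illustration:

```python
import numpy as np

y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])   # imbalanced labels: class 1 is rare
p_pred = np.full(10, 0.2)   # model's predicted P(class = 1) for each sample (assumed)

# Weight each sample by the inverse frequency of its class (an assumed scheme),
# so a misclassified minority sample contributes more to the loss.
n = len(y_true)
w_pos = n / (2 * np.sum(y_true == 1))
w_neg = n / (2 * np.sum(y_true == 0))
weights = np.where(y_true == 1, w_pos, w_neg)

def weighted_bce(y, p, w):
    """Binary cross-entropy with per-sample weights."""
    return np.mean(w * -(y * np.log(p) + (1 - y) * np.log(1 - p)))

plain = weighted_bce(y_true, p_pred, np.ones_like(weights))
weighted = weighted_bce(y_true, p_pred, weights)
print(plain, weighted)
```

With the weights in place, the same mistakes on the minority class produce a larger loss, pushing training toward the smaller class.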
Based on the dataset, find a new set of orthogonal feature vectors such that the spread of the data is maximized along the direction of each feature vector
Rank the feature vectors in decreasing order of data spread
The data points have maximum variance along the first feature vector and minimum variance along the last
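The steps above can be sketched via an eigendecomposition of the covariance matrix: the eigenvectors are the orthogonal feature directions, and the eigenvalues rank them by the variance of the data along each one. The 2-D dataset below is randomly generated for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Correlated 2-D data (an assumed example dataset).
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])

Xc = X - X.mean(axis=0)                  # center the data
cov = np.cov(Xc, rowvar=False)           # covariance matrix of the features
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]        # rank directions by decreasing data spread
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

projected = Xc @ eigvecs                 # coordinates along the new orthogonal axes
# Variance is maximal along the first feature vector, minimal along the last.
print(projected.var(axis=0, ddof=1))
```

The variance of the projected data along each new axis equals the corresponding eigenvalue, and the axes are mutually orthogonal by construction.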