Bias and variance are two types of error that a model's predictions can exhibit.
Bias refers to error introduced by approximating a complex real-world problem with an oversimplified model. It measures how far off, on average, a model's predictions are from the true values.
High bias leads to underfitting: the model fails to capture essential patterns in the data.
Variance refers to error introduced by an overly complex model. A high-variance model is too sensitive to the idiosyncrasies of the training data, so it overfits and fails to generalize to new data.
The goal is to find the right balance between bias and variance that minimizes the total error on unseen data.
The irreducible error is due to inherent noise in the data itself and cannot be reduced by any model: the expected prediction error decomposes into bias squared, plus variance, plus irreducible error. Therefore, we focus on minimizing the bias and variance components.
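This decomposition can be estimated empirically by training the same kind of model on many independently sampled training sets and measuring how its average prediction deviates from the truth (bias squared) and how much its predictions fluctuate across training sets (variance). A minimal sketch using NumPy, where the true function sin(2πx), the Gaussian noise level, and the choice of polynomial models are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Assumed ground-truth function; noise on top of it is the irreducible error.
    return np.sin(2 * np.pi * x)

noise_sd = 0.3                      # irreducible error has variance noise_sd**2
n_datasets, n_train = 200, 30       # many training sets of 30 points each
x_test = np.linspace(0.0, 1.0, 50)  # fixed evaluation grid

def fit_predict(degree):
    """Fit a polynomial of the given degree to each sampled training set
    and return its predictions on the fixed test grid."""
    preds = np.empty((n_datasets, x_test.size))
    for i in range(n_datasets):
        x = rng.uniform(0.0, 1.0, n_train)
        y = true_fn(x) + rng.normal(0.0, noise_sd, n_train)
        coeffs = np.polyfit(x, y, degree)
        preds[i] = np.polyval(coeffs, x_test)
    return preds

results = {}
for degree in (1, 3, 9):
    preds = fit_predict(degree)
    # Bias^2: squared gap between the average prediction and the truth.
    bias_sq = float(np.mean((preds.mean(axis=0) - true_fn(x_test)) ** 2))
    # Variance: how much predictions fluctuate across training sets.
    variance = float(np.mean(preds.var(axis=0)))
    results[degree] = (bias_sq, variance)
    print(f"degree {degree:2d}: bias^2 = {bias_sq:.3f}, variance = {variance:.3f}")
```

The simple linear model (degree 1) shows high bias and low variance, while the degree-9 polynomial shows the reverse, which is the tradeoff described above.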
An ideal model balances complexity:
Low bias: it captures the underlying patterns of the data well (a model that is not complex enough underfits).
Low variance: it is not overly sensitive to the training data, so it generalizes to new data (a model that is too complex overfits).
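Underfitting and overfitting show up clearly when training error and test error are compared across model complexities: an underfit model has high error on both sets, while an overfit model has low training error but high test error. A minimal sketch, again assuming a sin(2πx) ground truth and polynomial models of varying degree:

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # Assumed ground-truth function.
    return np.sin(2 * np.pi * x)

# One noisy training set and one noisy test set from the same process.
x_train = np.sort(rng.uniform(0.0, 1.0, 25))
y_train = f(x_train) + rng.normal(0.0, 0.25, x_train.size)
x_test = np.sort(rng.uniform(0.0, 1.0, 200))
y_test = f(x_test) + rng.normal(0.0, 0.25, x_test.size)

errors = {}
for degree in (1, 4, 12):  # too simple, balanced, too complex
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_mse = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    errors[degree] = (train_mse, test_mse)
    print(f"degree {degree:2d}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

The degree-1 model underfits (high error everywhere), the degree-12 model overfits (it chases the training noise, so its test error climbs back up), and the degree-4 model sits near the balance point that minimizes error on unseen data.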