Layman's explanation
Evaluating a model tells us how far we can keep pushing our existing conceptualisation of the problem we are working on. It also tells us when our approach is not working, which turns out to be even more helpful: if our adjustments make the model worse at what it is supposed to do, we have probably misunderstood our data or the situation being modelled.
So evaluating model performance is useful, but how exactly do you do it? This document addresses that question.
More accurate models help us make better decisions in real-world applications. Evaluation can be done in two ways:
Without needing to train a model (using mathematics)
Train multiple classifiers and then select the best one
Misclassification probability (the probability of a classification error, together with the cost of that error) is the performance measure considered here.
This approach is attractive because no model needs to be trained, and the result does not depend on the particular dataset you happen to have. The steps are given below.
The Bayes decision rule gives the optimal misclassification error; no classifier can achieve a lower misclassification error than this, so it acts as a benchmark. This document discusses it in detail.
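For reference, in the two-class case this optimal error can be written as the integral of the smaller of the two prior-weighted class-conditional densities (a standard textbook expression, with \omega_1 and \omega_2 simply labelling the two classes):

P(\text{error}) = \int \min\big( P(\omega_1)\, p(x \mid \omega_1),\; P(\omega_2)\, p(x \mid \omega_2) \big)\, dx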
For a problem domain, given the class-conditional probability density functions (PDFs) and the prior probabilities, we can compare classifier performance mathematically without actually training any classifier. The steps are below; a small numerical sketch follows the list.
Find the class prior probabilities and the class-conditional PDFs.
Calculate the optimal misclassification probability (the Bayes error given by the Bayes decision rule).
For each classifier:
Calculate its misclassification probability.
Compare this value with the optimal one and record the difference.
Select the classifier whose misclassification error is closest to the Bayes error.
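As an illustration of these steps, here is a minimal numerical sketch assuming two univariate Gaussian classes with known priors and PDFs; the particular distributions, priors, and threshold classifiers are made up purely for illustration.

# Sketch: compare simple threshold classifiers against the Bayes error,
# assuming two univariate Gaussian classes with known priors and PDFs.
import numpy as np
from scipy.stats import norm

# Step 1: class priors and class-conditional PDFs (assumed known).
prior1, prior2 = 0.6, 0.4
pdf1 = norm(loc=0.0, scale=1.0)   # p(x | class 1)
pdf2 = norm(loc=2.0, scale=1.0)   # p(x | class 2)

# Step 2: optimal (Bayes) misclassification probability, approximated by
# numerically integrating the smaller prior-weighted density over a grid.
x = np.linspace(-10.0, 12.0, 200_001)
dx = x[1] - x[0]
bayes_error = np.sum(np.minimum(prior1 * pdf1.pdf(x), prior2 * pdf2.pdf(x))) * dx

# Step 3: misclassification probability of each candidate classifier.
# Each candidate is a simple threshold t: predict class 1 when x < t.
def threshold_error(t):
    # class-1 points above t plus class-2 points below t are misclassified
    return prior1 * (1.0 - pdf1.cdf(t)) + prior2 * pdf2.cdf(t)

# Step 4: compare each candidate with the Bayes error and pick the closest.
candidates = {"t = 0.5": 0.5, "t = 1.0": 1.0, "t = 1.5": 1.5}
print(f"Bayes error: {bayes_error:.4f}")
for name, t in candidates.items():
    err = threshold_error(t)
    print(f"{name}: error = {err:.4f}, gap to Bayes = {err - bayes_error:.4f}")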
Note that the above approach requires knowledge of the class priors and class-conditional PDFs. If these are not known, the alternative approaches below can be used.
In this approach, all classifiers are trained on the same dataset and their performances are then compared.
Leave-one-out cross-validation is one such approach: it estimates the generalisation performance of a model by repeatedly training on n-1 samples and validating on the single sample left out. k-fold cross-validation is another such method.
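A short sketch of these two estimators using scikit-learn; the dataset (Iris) and the two candidate classifiers are placeholders chosen only to make the example runnable.

# Sketch: estimate generalisation error of candidate classifiers with
# leave-one-out and k-fold cross-validation (scikit-learn assumed installed).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import LeaveOneOut, KFold, cross_val_score

X, y = load_iris(return_X_y=True)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "knn_5": KNeighborsClassifier(n_neighbors=5),
}

for name, model in candidates.items():
    # Leave-one-out: train on n-1 samples, validate on the single left-out sample.
    loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())
    # 10-fold: split the data into 10 folds, train on 9, validate on the remaining one.
    kfold_scores = cross_val_score(model, X, y, cv=KFold(n_splits=10, shuffle=True, random_state=0))
    print(f"{name}: LOOCV error = {1 - loo_scores.mean():.3f}, "
          f"10-fold error = {1 - kfold_scores.mean():.3f}")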
These approaches have the following challenges:
The time needed to train the models can be high.
The result depends on the dataset, so changing the dataset may change which classifier comes out best.
For the reasons above, the mathematical approach to model comparison is valuable and worth exploring as well. Its results can also be shared with the community, so that people working on similar problems can benefit.
https://youtu.be/aeEv3-tSvjM
https://youtu.be/xhBBv8huHyI?t=1950
https://youtu.be/0SNSVjeTXao?t=3195
https://stats.stackexchange.com/questions/27454/how-does-leave-one-out-cross-validation-work-how-to-select-the-final-model-out
https://machinelearningmastery.com/loocv-for-evaluating-machine-learning-algorithms/
https://machinelearningmastery.com/k-fold-cross-validation/
https://images.app.goo.gl/AsaEZ6vBtjuyG1td7
https://images.app.goo.gl/ynGBVvKxpZtcVKPK9