Confusion Matrix
A Confusion Matrix (CM) is a table that summarizes the performance of a classification model. It shows how well the model predicts each class by comparing its predictions to the actual values. In essence, it tells you where the model is getting confused.
From a product-management perspective, the CM is vital for comparing different ML models and judging how well they align with business objectives, such as high-precision vs. high-recall use cases and the cost of incremental improvements.
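To make the precision-vs-recall trade-off concrete, here is a minimal sketch (the labels below are toy data assumed for illustration, not from any real model):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical ground truth and predictions for a binary classifier
y_true = np.array([0, 1, 0, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0, 1, 0])

TN, FP, FN, TP = confusion_matrix(y_true, y_pred).ravel()

precision = TP / (TP + FP)  # of all predicted positives, how many were correct
recall = TP / (TP + FN)     # of all actual positives, how many were caught

print(f"precision = {precision:.2f}, recall = {recall:.2f}")

# sklearn's built-in scorers give the same values
assert precision == precision_score(y_true, y_pred)
assert recall == recall_score(y_true, y_pred)
```

A high-precision use case (e.g. flagging content for automatic removal) penalizes false positives, while a high-recall use case (e.g. screening for a disease) penalizes false negatives; the same matrix supports both readings.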
Binary Classification
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import numpy as np
# Example data for a binary classification problem
y_true = np.array([0, 1, 0, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])
# 1. Create the confusion matrix
cm = confusion_matrix(y_true, y_pred)
# 2. Instantiate the ConfusionMatrixDisplay
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=['Class 0', 'Class 1'])
# 3. Calculate True Positives, True Negatives, False Positives and False Negatives
TN, FP, FN, TP = cm.ravel() # for binary classification
print(f' TP = {TP}, TN = {TN}, FP = {FP}, FN = {FN}')
# 4. Plot the confusion matrix
disp.plot(cmap=plt.cm.Blues)
plt.show()
Note: In binary classification, recall of the positive class is also known as "sensitivity"; recall of the negative class is known as "specificity".
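Using the same toy data as the snippet above, sensitivity and specificity fall straight out of the raveled matrix:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Same example data as the binary snippet above
y_true = np.array([0, 1, 0, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])

TN, FP, FN, TP = confusion_matrix(y_true, y_pred).ravel()

sensitivity = TP / (TP + FN)  # recall of the positive class
specificity = TN / (TN + FP)  # recall of the negative class

print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```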
Multi-Class Classification
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import numpy as np
# Example data for a multi-class classification problem
y_true1 = [0, 1, 2, 2, 2]
y_pred1 = [0, 0, 2, 2, 1]
# 1. Create the confusion matrix
cm1 = confusion_matrix(y_true1, y_pred1)
print(type(cm1))
print(cm1)
# 2. Instantiate the ConfusionMatrixDisplay
target_names = ['class 0', 'class 1', 'class 2']
disp1 = ConfusionMatrixDisplay(confusion_matrix=cm1, display_labels=target_names)
# 3. Calculate True Positives, True Negatives, False Positives and False Negatives
num_classes = cm1.shape[0] # Get the number of classes
TP = np.diag(cm1) # True Positives for each class are on the diagonal
FP = cm1.sum(axis=0) - TP # False Positives are the sum of the column minus the TP
FN = cm1.sum(axis=1) - TP # False Negatives are the sum of the row minus the TP
TN = cm1.sum() - (TP + FP + FN) # True Negatives for each class are all other correct predictions
print("True Positives per class:", TP)
print("True Negatives per class:", TN)
print("False Positives per class:", FP)
print("False Negatives per class:", FN)
# 4. Plot the confusion matrix
disp1.plot(cmap=plt.cm.Blues)
plt.show()
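The per-class TP/FP/FN arrays above lead directly to per-class precision and recall. A minimal sketch, reusing the same toy data and cross-checking against sklearn's own per-class scorers:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Same example data as the multi-class snippet above
y_true1 = [0, 1, 2, 2, 2]
y_pred1 = [0, 0, 2, 2, 1]

cm1 = confusion_matrix(y_true1, y_pred1)
TP = np.diag(cm1)
FP = cm1.sum(axis=0) - TP
FN = cm1.sum(axis=1) - TP

# Per-class precision and recall; guard against a class with no predictions
with np.errstate(divide="ignore", invalid="ignore"):
    precision = np.where(TP + FP > 0, TP / (TP + FP), 0.0)
    recall = np.where(TP + FN > 0, TP / (TP + FN), 0.0)

print("precision per class:", precision)
print("recall per class:", recall)

# Cross-check against sklearn's built-in per-class scores
assert np.allclose(precision, precision_score(y_true1, y_pred1, average=None, zero_division=0))
assert np.allclose(recall, recall_score(y_true1, y_pred1, average=None, zero_division=0))
```

Note that in the multi-class case these metrics are vectors, one entry per class; averaging them (macro, micro, or weighted) is a separate choice that should follow the business objective.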