Confusion Matrix
A Confusion Matrix (CM) is a table that summarizes the performance of a classification model. It shows how well the model predicts each class by comparing its predictions to the actual values. In essence, it tells you where the model is getting confused.
From a product-management perspective, the CM is vital for comparing different ML models and judging how well they align with business objectives, such as high-precision vs. high-recall use cases and the cost of incremental improvements.
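To make the precision-vs-recall trade-off concrete, here is a minimal sketch (the labels below are toy data assumed for illustration, not from any real model):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical ground truth and predictions for a binary classifier
y_true = np.array([0, 1, 0, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0, 1, 0])

TN, FP, FN, TP = confusion_matrix(y_true, y_pred).ravel()

precision = TP / (TP + FP)  # of all predicted positives, how many were correct
recall = TP / (TP + FN)     # of all actual positives, how many were caught

print(f"precision = {precision:.2f}, recall = {recall:.2f}")

# sklearn's built-in scorers give the same values
assert precision == precision_score(y_true, y_pred)
assert recall == recall_score(y_true, y_pred)
```

A high-precision use case (e.g. flagging content for automatic removal) penalizes false positives, while a high-recall use case (e.g. screening for a disease) penalizes false negatives; the same matrix supports both readings.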
Binary Classification
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import numpy as np
# Example data for a binary classification problem
y_true = np.array([0, 1, 0, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])
# 1. Create the confusion matrix
cm = confusion_matrix(y_true, y_pred)
# 2. Instantiate the ConfusionMatrixDisplay
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=['Class 0', 'Class 1'])
# 3. Calculate True Positives, True Negatives, False Positives and False Negatives
TN, FP, FN, TP = cm.ravel() # for binary classification
print(f' TP = {TP}, TN = {TN}, FP = {FP}, FN = {FN}')
# 4. Plot the confusion matrix
disp.plot(cmap=plt.cm.Blues)
plt.show()
Note: In binary classification, recall of the positive class is also known as "sensitivity"; recall of the negative class is known as "specificity".
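Using the same toy data as the snippet above, sensitivity and specificity fall straight out of the raveled matrix:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Same example data as the binary snippet above
y_true = np.array([0, 1, 0, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])

TN, FP, FN, TP = confusion_matrix(y_true, y_pred).ravel()

sensitivity = TP / (TP + FN)  # recall of the positive class
specificity = TN / (TN + FP)  # recall of the negative class

print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```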
Multi-Class Classification
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import numpy as np
# Example data for a multi-class classification problem
y_true1 = [0, 1, 2, 2, 2]
y_pred1 = [0, 0, 2, 2, 1]
# 1. Create the confusion matrix
cm1 = confusion_matrix(y_true1, y_pred1)
print(type(cm1))
print(cm1)
# 2. Instantiate the ConfusionMatrixDisplay
target_names = ['class 0', 'class 1', 'class 2']
disp1 = ConfusionMatrixDisplay(confusion_matrix=cm1, display_labels=target_names)
# 3. Calculate True Positives, True Negatives, False Positives and False Negatives
num_classes = cm1.shape[0] # Get the number of classes
TP = np.diag(cm1) # True Positives for each class are on the diagonal
FP = cm1.sum(axis=0) - TP # False Positives are the sum of the column minus the TP
FN = cm1.sum(axis=1) - TP # False Negatives are the sum of the row minus the TP
TN = cm1.sum() - (TP + FP + FN) # True Negatives for each class are all other correct predictions
print("True Positives per class:", TP)
print("True Negatives per class:", TN)
print("False Positives per class:", FP)
print("False Negatives per class:", FN)
# 4. Plot the confusion matrix
disp1.plot(cmap=plt.cm.Blues)
plt.show()
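The per-class TP/FP/FN arrays above lead directly to per-class precision and recall. A minimal sketch, reusing the same toy data and cross-checking against sklearn's own per-class scorers:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Same example data as the multi-class snippet above
y_true1 = [0, 1, 2, 2, 2]
y_pred1 = [0, 0, 2, 2, 1]

cm1 = confusion_matrix(y_true1, y_pred1)
TP = np.diag(cm1)
FP = cm1.sum(axis=0) - TP
FN = cm1.sum(axis=1) - TP

# Per-class precision and recall; guard against a class with no predictions
with np.errstate(divide="ignore", invalid="ignore"):
    precision = np.where(TP + FP > 0, TP / (TP + FP), 0.0)
    recall = np.where(TP + FN > 0, TP / (TP + FN), 0.0)

print("precision per class:", precision)
print("recall per class:", recall)

# Cross-check against sklearn's built-in per-class scores
assert np.allclose(precision, precision_score(y_true1, y_pred1, average=None, zero_division=0))
assert np.allclose(recall, recall_score(y_true1, y_pred1, average=None, zero_division=0))
```

Note that in the multi-class case these metrics are vectors, one entry per class; averaging them (macro, micro, or weighted) is a separate choice that should follow the business objective.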