Alpana A. Borse

Welcome to Foundation of Data Science Laboratory

Assignment no. 4. Hands-on Exercises with scikit-learn Library

1. Implement a Decision Tree Classifier:

o Train and evaluate a Decision Tree Classifier on the Iris dataset.

In this exercise we build a Decision Tree Classifier on the Iris dataset, train it, and evaluate its performance using three key metrics: accuracy, the confusion matrix, and the classification report.

Code to Train and Evaluate a Decision Tree Classifier on the Iris Dataset:

# Import necessary libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the data into training and testing sets (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize the Decision Tree Classifier
clf = DecisionTreeClassifier(random_state=42)

# Train the classifier
clf.fit(X_train, y_train)

# Predict the test set results
y_pred = clf.predict(X_test)

# Evaluate the performance

# 1. Accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# 2. Confusion Matrix
cm = confusion_matrix(y_test, y_pred)
print("\nConfusion Matrix:")
print(cm)

# 3. Classification Report (Precision, Recall, F1-score)
report = classification_report(y_test, y_pred, target_names=iris.target_names)
print("\nClassification Report:")
print(report)

Output Example:

The Iris dataset has 150 samples, so a 30% test split contains 45 samples, and the support column sums to 45.

Accuracy: 1.0

Confusion Matrix:
[[19  0  0]
 [ 0 13  0]
 [ 0  0 13]]

Classification Report:
              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        19
  versicolor       1.00      1.00      1.00        13
   virginica       1.00      1.00      1.00        13

    accuracy                           1.00        45
   macro avg       1.00      1.00      1.00        45
weighted avg       1.00      1.00      1.00        45

Explanation:

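Accuracy: The fraction of test samples predicted correctly (1.0 means every prediction was right).
Confusion Matrix: A table of true classes (rows) against predicted classes (columns). Diagonal entries count correct predictions; off-diagonal entries count samples of one class that were misclassified as another.
Classification Report: Provides the precision, recall, and F1-score for each class (setosa, versicolor, virginica), along with the support, which is the number of test samples in each class.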
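
To make these definitions concrete, the sketch below derives accuracy, precision, recall, and F1-score by hand from a confusion matrix. The matrix is a hypothetical example with a few misclassifications, chosen only for illustration; it is not output from the classifier above.

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows = true class, columns = predicted class).
# These counts are made up for illustration only.
cm = np.array([
    [10, 0, 0],
    [0, 8, 2],
    [0, 1, 9],
])

# Accuracy: correct predictions (the diagonal) divided by all predictions.
accuracy = np.trace(cm) / cm.sum()

# Precision per class: diagonal divided by column sums
# (of everything predicted as this class, how much was right).
precision = np.diag(cm) / cm.sum(axis=0)

# Recall per class: diagonal divided by row sums
# (of everything that truly is this class, how much was found).
recall = np.diag(cm) / cm.sum(axis=1)

# F1-score: harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print("Accuracy:", round(float(accuracy), 3))
print("Precision:", np.round(precision, 3))
print("Recall:", np.round(recall, 3))
print("F1-score:", np.round(f1, 3))
```

Under the row/column convention used by scikit-learn's confusion_matrix, these hand-computed values should agree with the per-class numbers that classification_report prints for the same predictions.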

In this case, with a Decision Tree classifier, we get an accuracy of 1.0 (perfect classification) on the test set, though performance may vary with different random states or dataset splits.
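
One way to see how much the result depends on the split is to repeat the experiment with several random seeds and to run k-fold cross-validation. The following is a minimal sketch, not part of the original assignment; the seed values and the choice of 5 folds are arbitrary assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Repeat the 70/30 split with several seeds (the seed values are arbitrary).
split_scores = []
for seed in range(5):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    clf = DecisionTreeClassifier(random_state=42)
    clf.fit(X_train, y_train)
    # score() returns the accuracy on the given test data.
    split_scores.append(clf.score(X_test, y_test))

print("Accuracy per split:", split_scores)

# 5-fold cross-validation: every sample is used for testing exactly once,
# which gives a steadier estimate than a single split.
cv_scores = cross_val_score(DecisionTreeClassifier(random_state=42), X, y, cv=5)
print("Cross-validation accuracy per fold:", cv_scores)
print("Mean cross-validation accuracy:", cv_scores.mean())
```

If the per-split accuracies differ noticeably, the mean cross-validation accuracy is usually the more trustworthy figure to report.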
