Welcome to Foundation of Data Science Laboratory
Assignment no. 4. Hands-on Exercises with the scikit-learn Library
1. Implement a Decision Tree Classifier:
o Train and evaluate a Decision Tree Classifier on the Iris dataset.
Load the Iris dataset, train a Decision Tree Classifier on it, and evaluate its performance using three key metrics: accuracy, the confusion matrix, and the classification report.
# Import necessary libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target
# Split the data into training and testing sets (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Initialize the Decision Tree Classifier
clf = DecisionTreeClassifier(random_state=42)
# Train the classifier
clf.fit(X_train, y_train)
# Predict the test set results
y_pred = clf.predict(X_test)
# Evaluate the performance
# 1. Accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
# 2. Confusion Matrix
cm = confusion_matrix(y_test, y_pred)
print("\nConfusion Matrix:")
print(cm)
# 3. Classification Report (Precision, Recall, F1-score)
report = classification_report(y_test, y_pred, target_names=iris.target_names)
print("\nClassification Report:")
print(report)
Output Example:
Accuracy: 1.0

Confusion Matrix:
[[19  0  0]
 [ 0 13  0]
 [ 0  0 13]]

Classification Report:
              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        19
  versicolor       1.00      1.00      1.00        13
   virginica       1.00      1.00      1.00        13

    accuracy                           1.00        45
   macro avg       1.00      1.00      1.00        45
weighted avg       1.00      1.00      1.00        45
Accuracy: The fraction of test samples that were predicted correctly (1.0 means every prediction was correct).
Confusion Matrix: Each row is a true class and each column is a predicted class. Diagonal entries count correct predictions; off-diagonal entries count misclassifications.
Classification Report: Provides the precision, recall, F1-score, and support (number of test samples) for each class (setosa, versicolor, virginica).
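To make the link between these metrics concrete, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be derived from a confusion matrix by hand. The matrix here is hypothetical (it is not the output of the exercise above) and deliberately contains a few misclassifications so that the per-class values differ; the variable names are illustrative only.

```python
import numpy as np

# Hypothetical confusion matrix, chosen for illustration only.
# Rows are true classes, columns are predicted classes.
class_names = ["setosa", "versicolor", "virginica"]
cm = np.array([
    [19,  0,  0],
    [ 0, 12,  1],
    [ 0,  2, 11],
])

# Correct predictions sit on the diagonal.
correct = np.diag(cm)

# Accuracy: correct predictions divided by all predictions.
accuracy = correct.sum() / cm.sum()

# Precision: of everything predicted as a class (column sum), how much was right.
precision = correct / cm.sum(axis=0)

# Recall: of everything that truly belongs to a class (row sum), how much was found.
recall = correct / cm.sum(axis=1)

# F1-score: harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"Accuracy: {accuracy:.3f}")
for name, p, r, f in zip(class_names, precision, recall, f1):
    print(f"{name:>10}  precision={p:.2f}  recall={r:.2f}  f1={f:.2f}")
```

The row sums of the matrix are the support values that the classification report lists for each class.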
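In this case, the Decision Tree classifier reaches an accuracy of 1.0 on the test set, meaning every test sample was classified correctly. The score may vary with a different random_state or a different train/test split, so a single perfect result should not be read as a guarantee of perfect performance.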
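One way to see how much the score depends on the split is to repeat the experiment with several seeds, or to use k-fold cross-validation. The sketch below is an optional extension rather than part of the assignment; the choice of 10 seeds and 5 folds is an arbitrary assumption made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Repeat the 70/30 split with different seeds (10 seeds is an arbitrary choice).
scores = []
for seed in range(10):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    clf = DecisionTreeClassifier(random_state=42)
    clf.fit(X_train, y_train)
    # score() returns the accuracy on the given test data.
    scores.append(clf.score(X_test, y_test))

print("Accuracy per split:", [round(s, 3) for s in scores])
print("Min / mean / max:", min(scores), sum(scores) / len(scores), max(scores))

# 5-fold cross-validation: every sample is used for testing exactly once.
cv_scores = cross_val_score(DecisionTreeClassifier(random_state=42), X, y, cv=5)
print("Cross-validation accuracy per fold:", cv_scores)
print("Mean cross-validation accuracy:", cv_scores.mean())
```

Reporting the mean and spread over several splits gives a more reliable picture of the classifier than a single train/test split.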