Supervised learning is a type of machine learning in which a model is trained on labeled data, meaning both the input and the desired output are known.
The model learns patterns from these examples and can then predict outputs for new, unseen data.
For example, if we train a model on emails labeled spam or not spam, it can classify new emails automatically.
Logistic Regression
Definition:
A statistical model used for binary classification problems — where the output is either 0 or 1.
Equation:
P(Y=1 | X) = 1 / (1 + e^(-(β0 + β1·X)))
Explanation:
It uses the sigmoid function to map any real-valued input to a probability between 0 and 1.
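The sigmoid mapping can be checked with a few lines of plain Python (a standalone sketch, no libraries beyond the standard math module):

```python
import math

def sigmoid(z):
    """Map any real number into the open interval (0, 1)."""
    return 1 / (1 + math.exp(-z))

# Large negative inputs approach 0, zero maps to exactly 0.5,
# and large positive inputs approach 1.
print(sigmoid(-5), sigmoid(0), sigmoid(5))
```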
Code Example:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

# Note: iris has three classes; scikit-learn extends logistic
# regression to multiclass problems automatically.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
print("Accuracy:", model.score(X_test, y_test))
Output:
Accuracy: 0.93 (approximate; the exact value varies with the random train/test split)
Diagram (Sigmoid Curve): [S-shaped curve rising from 0 toward 1; plot omitted]
Naive Bayes
Definition:
A probabilistic classifier based on Bayes’ Theorem, assuming all features are independent.
Formula:
P(A | B) = P(B | A) · P(A) / P(B)
Example:
Used in spam filtering and sentiment analysis.
Code Example:
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
model = GaussianNB()
model.fit(X, y)
# Predict the class of a single new flower measurement
print("Predicted Class:", model.predict([[5.1, 3.5, 1.4, 0.2]]))
Output:
Predicted Class: [0]
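Bayes' theorem itself can be sanity-checked with made-up spam-filter probabilities (the numbers here are illustrative, not measured from real data):

```python
# Hypothetical priors and likelihoods for a spam filter
p_spam = 0.2             # P(A): base rate of spam
p_word_given_spam = 0.6  # P(B|A): probability "free" appears in a spam email
p_word = 0.25            # P(B): probability "free" appears in any email

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(p_spam_given_word)  # 0.48
```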
Support Vector Machine (SVM)
Definition:
A classifier that finds the best decision boundary (hyperplane) to separate different classes.
Code Example:
from sklearn.svm import SVC
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
model = SVC(kernel='linear')
model.fit(X, y)
print("Accuracy:", model.score(X, y))  # training accuracy (no held-out split)
Output:
Accuracy: 0.98
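With kernel='linear' the fitted hyperplane can be inspected directly; coef_, intercept_, and support_vectors_ are standard scikit-learn attributes for a linear SVC:

```python
from sklearn.svm import SVC
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
model = SVC(kernel='linear')
model.fit(X, y)

# coef_ holds one weight row per one-vs-one class pair (3 pairs for
# iris's 3 classes), and support_vectors_ holds the boundary-defining points.
print(model.coef_.shape)             # (3, 4)
print(model.support_vectors_.shape)
```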
Decision Tree
Definition:
A tree-shaped model that splits data into branches based on feature values.
Example:
Used in customer churn prediction.
Code Example:
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier()
model.fit(X, y)
print("Accuracy:", model.score(X, y))  # training accuracy; an unpruned tree can fit the training set perfectly
Output:
Accuracy: 1.0
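The learned splits can be printed as human-readable rules with scikit-learn's export_text helper (depth limited to 2 here to keep the output short):

```python
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.datasets import load_iris

data = load_iris()
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(data.data, data.target)

# Each indented line is a feature threshold; leaves show the predicted class.
rules = export_text(model, feature_names=list(data.feature_names))
print(rules)
```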
Ensemble Methods
Definition:
Combines multiple models to improve performance (e.g., Bagging, Boosting, Random Forest).
Code Example:
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier()
model.fit(X, y)
print("Accuracy:", model.score(X, y))
Output:
Accuracy: 1.0
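The 1.0 above is accuracy on the same data the forest was trained on; cross-validation gives a more honest estimate of generalization (exact scores depend on the random seed):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0)

# 5-fold cross-validation: train on 4/5 of the data, score the held-out 1/5
scores = cross_val_score(model, X, y, cv=5)
print("CV accuracy:", scores.mean())
```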
Model Evaluation
Confusion Matrix:
A table comparing predicted labels against actual labels. For binary classification:
                  Predicted Positive    Predicted Negative
Actual Positive   TP (true positive)    FN (false negative)
Actual Negative   FP (false positive)   TN (true negative)
Metrics:
Accuracy = (TP + TN)/(TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1-score = 2 × (Precision × Recall)/(Precision + Recall)
AUC-ROC = Area under the Receiver Operating Characteristic curve.
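The formulas above can be verified on a small set of made-up predictions (labels invented purely for illustration):

```python
# Hypothetical true labels and predictions for a binary classifier
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# Count the four confusion-matrix cells
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # 3
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # 3
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # 1
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # 1

accuracy = (tp + tn) / (tp + tn + fp + fn)          # 6/8 = 0.75
precision = tp / (tp + fp)                          # 3/4 = 0.75
recall = tp / (tp + fn)                             # 3/4 = 0.75
f1 = 2 * precision * recall / (precision + recall)  # 0.75
print(accuracy, precision, recall, f1)
```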