Unsupervised learning deals with unlabeled data.
The model tries to discover patterns or structure in the data without being given the correct outputs.
K-Means Clustering
Definition:
A method that partitions data into K clusters by grouping similar points together.
Steps:
Choose K centroids.
Assign points to the nearest centroid.
Recompute centroids and repeat.
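The three steps above can be sketched directly in NumPy. This is a minimal illustration, not sklearn's implementation; the function name kmeans_sketch and the fixed seed are ours, and it omits the smarter k-means++ initialization that sklearn uses by default.

```python
import numpy as np

def kmeans_sketch(X, k, n_iter=100, seed=0):
    """Minimal K-Means: the three steps above, repeated until centroids settle."""
    rng = np.random.default_rng(seed)
    # Step 1: choose K centroids (here: K distinct random points from the data).
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Step 2: assign each point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recompute each centroid as the mean of its assigned points
        # (keeping the old centroid if a cluster happens to be empty).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, labels
```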
Code Example:
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
X, _ = load_iris(return_X_y=True)
kmeans = KMeans(n_clusters=3, n_init=10)
kmeans.fit(X)
print("Cluster Centers:\n", kmeans.cluster_centers_)
Output:
Cluster Centers:
[[5.9 2.7 4.2 1.3]
[6.8 3.0 5.6 2.1]
[5.0 3.4 1.4 0.2]]
Hierarchical Clustering
Definition:
Builds a hierarchy of nested clusters, either agglomeratively (bottom-up, merging clusters) or divisively (top-down, splitting them).
Diagram (Dendrogram): a tree whose leaves are individual samples and whose branches show the order in which clusters merge (produced by the code below).
Code Example:
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
X, _ = load_iris(return_X_y=True)
Z = linkage(X, method='ward')
dendrogram(Z)
plt.show()
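The dendrogram shows the full hierarchy; to get flat cluster labels from the same linkage matrix, you can cut the tree. A small sketch using SciPy's fcluster (choosing 3 clusters here simply mirrors the K-Means example):

```python
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)
Z = linkage(X, method='ward')
# Cut the dendrogram so that at most 3 flat clusters remain.
labels = fcluster(Z, t=3, criterion='maxclust')
print("Cluster sizes:", [list(labels).count(c) for c in sorted(set(labels))])
```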
Evaluation Metrics
Inertia:
The sum of squared distances from each point to its assigned centroid; lower values mean more internally coherent clusters.
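Inertia is exposed by sklearn as the fitted model's inertia_ attribute. A short sketch (random_state=0 is an arbitrary choice for reproducibility); since inertia always drops as K grows, it is typically used to look for an "elbow" rather than minimized outright:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)
# Inertia keeps shrinking as K grows, so compare several values of K
# and look for the point where the improvement levels off.
inertias = {}
for k in (2, 3, 4):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_
    print(f"K={k}: inertia={km.inertia_:.1f}")
```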
Silhouette Score:
Measures how similar a point is to its own cluster vs other clusters (higher = better).
Code Example:
from sklearn.metrics import silhouette_score
score = silhouette_score(X, kmeans.labels_)
print("Silhouette Score:", score)
Output:
Silhouette Score: 0.56
Davies-Bouldin Index
Definition:
Evaluates clustering by the ratio of intra-cluster scatter to inter-cluster separation.
(Lower value = better clustering.)
Code Example:
from sklearn.metrics import davies_bouldin_score
db_score = davies_bouldin_score(X, kmeans.labels_)
print("Davies-Bouldin Index:", db_score)
Output:
Davies-Bouldin Index: 0.65
Artificial Neural Network (ANN)
Definition:
A model inspired by the human brain, consisting of neurons (nodes) arranged in layers connected by weighted links.
Diagram:
Input --> [Hidden Layer] --> Output
x1 ----> O ----\
x2 ----> O -----> O ----> y
x3 ----> O ----/
Code Example:
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
ann = MLPClassifier(hidden_layer_sizes=(5,), max_iter=1000)
ann.fit(X, y)
print("Accuracy:", ann.score(X, y))
Output:
Accuracy: 0.97
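Note that the accuracy above is measured on the same data the network was trained on. A fairer sketch holds out a test set; the split ratio and random_state values here are arbitrary choices:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
# Hold out 30% of the samples so accuracy reflects unseen data.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)
ann = MLPClassifier(hidden_layer_sizes=(5,), max_iter=2000, random_state=42)
ann.fit(X_tr, y_tr)
acc = ann.score(X_te, y_te)
print("Test accuracy:", acc)
```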
Perceptron
Definition:
A simple neural network with a single neuron that performs binary classification.
Equation:
y = f(∑ wᵢxᵢ + b)
where xᵢ are the inputs, wᵢ the weights, b the bias, and f a step activation function.
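The equation y = f(∑ wᵢxᵢ + b), with f a step function, can be turned into a tiny from-scratch sketch of the perceptron learning rule. The function names are ours, and learning the AND function is just a minimal linearly separable example:

```python
import numpy as np

def step(z):
    # Activation f: outputs 1 if the weighted sum is non-negative, else 0.
    return (z >= 0).astype(int)

def perceptron_train(X, y, epochs=20, lr=1.0):
    """Learn weights w and bias b with the classic perceptron update rule."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = step(np.dot(w, xi) + b)   # y = f(sum(w_i * x_i) + b)
            w += lr * (yi - pred) * xi       # update only on a wrong prediction
            b += lr * (yi - pred)
    return w, b

# Usage: learn logical AND, which is linearly separable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = perceptron_train(X, y)
preds = [int(step(np.dot(w, xi) + b)) for xi in X]
print("Predictions for AND:", preds)  # → [0, 0, 0, 1]
```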
Code Example:
from sklearn.linear_model import Perceptron
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
# Iris has 3 classes; sklearn's Perceptron handles this via one-vs-rest.
p = Perceptron()
p.fit(X, y)
print("Accuracy:", p.score(X, y))
Output:
Accuracy: 0.90
Limitations of Unsupervised Learning:
No clear accuracy measure (no labeled data).
Hard to interpret results.
Sensitive to scaling and initialization.
Requires domain knowledge to validate clusters.
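The sensitivity to scaling, for instance, can be addressed by standardizing features before clustering. A sketch with StandardScaler; the silhouette score is one way to check the effect, and the parameter choices are arbitrary:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
# Standardize each feature to zero mean and unit variance so that no single
# feature dominates the Euclidean distances K-Means relies on.
X_scaled = StandardScaler().fit_transform(X)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_scaled)
score = silhouette_score(X_scaled, km.labels_)
print("Silhouette on scaled data:", score)
```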
Conclusion:
Supervised Learning helps when you have labeled data; it is great for prediction and classification.
Unsupervised Learning is useful when you want to explore and understand hidden patterns in unlabeled data.
Both are fundamental to understanding modern AI and data science applications.