Identify patient subgroups with similar metabolic profiles
Understand different risk pathways to diabetes
Complement supervised classification by adding clinical insight
Enable personalized healthcare recommendations
We evaluated candidate values of k using both the Elbow Method and silhouette scores. The elbow plot showed a sharp drop in inertia from k=2 to k=3, followed by a slower decrease, indicating diminishing returns beyond k=3 or k=4. The results below use k=3.
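The k-selection procedure described above can be sketched roughly as follows. This is a minimal illustration, not the original analysis code: it assumes scikit-learn, uses synthetic data from `make_blobs` as a stand-in for the standardized patient features, and the range of k values (2 to 8) is our own choice.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the patient feature matrix (assumption, not the real data).
X_raw, _ = make_blobs(n_samples=300, centers=3, n_features=8, random_state=0)
X = StandardScaler().fit_transform(X_raw)  # K-Means is distance-based, so scale first

k_values = list(range(2, 9))
inertias = []     # within-cluster sum of squares, used for the elbow plot
silhouettes = []  # mean silhouette coefficient per k (higher is better)

for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)
    silhouettes.append(silhouette_score(X, model.labels_))

for k, inertia, sil in zip(k_values, inertias, silhouettes):
    print(f"k={k}: inertia={inertia:.1f}, silhouette={sil:.3f}")
```

Plotting `inertias` against `k_values` gives the elbow curve, while the k with the highest silhouette score is the silhouette-preferred choice. When the two criteria disagree, the final k is a judgment call.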
Cluster 0: 321 patients → 8.10% with diabetes (Low Risk)
Cluster 1: 239 patients → 53.14% with diabetes (High Risk)
Cluster 2: 208 patients → 55.29% with diabetes (High Risk; highest rate)
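Per-cluster figures like these come from grouping the diabetes outcome by cluster label. The sketch below shows one way to compute them in plain Python. The helper `cluster_risk_summary` is a hypothetical name, and the positive counts (26, 127, 115) are back-calculated from the reported sizes and percentages rather than taken from the original data.

```python
from collections import defaultdict

def cluster_risk_summary(labels, outcomes):
    """Return {cluster: (size, percent_positive)} for binary outcomes (1 = diabetes)."""
    sizes = defaultdict(int)
    positives = defaultdict(int)
    for label, outcome in zip(labels, outcomes):
        sizes[label] += 1
        positives[label] += outcome
    return {
        c: (sizes[c], round(100 * positives[c] / sizes[c], 2))
        for c in sorted(sizes)
    }

# cluster -> (size, positives); positives are back-calculated (assumption).
reported = {0: (321, 26), 1: (239, 127), 2: (208, 115)}

labels, outcomes = [], []
for cluster, (size, n_pos) in reported.items():
    labels += [cluster] * size
    outcomes += [1] * n_pos + [0] * (size - n_pos)

summary = cluster_risk_summary(labels, outcomes)
for cluster, (size, pct) in summary.items():
    print(f"Cluster {cluster}: {size} patients, {pct:.2f}% with diabetes")
```

With these counts the three clusters total 768 patients, and the rates round to the 8.10%, 53.14%, and 55.29% reported above.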
Cluster 0 (Blue – Low Risk)
Forms a dense, well-separated group, especially in the t-SNE plot, indicating high internal similarity and a distinct metabolic profile. This is consistent with its low diabetes rate (8.10%) and supports labeling it the low-risk cluster.
Cluster 1 (Orange – High Risk)
Clearly separated from Cluster 0 in both PCA and t-SNE, showing a distinct set of characteristics likely linked to older age, higher glucose, and higher BMI.
Cluster 2 (Green – High Risk)
Shows some overlap with Cluster 1, particularly in PCA, but remains distinguishable. This suggests shared risk features (e.g., high glucose/insulin), but also enough unique traits (e.g., age or genetics) to form a separate high-risk subgroup.
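The PCA and t-SNE views discussed above can be produced along the following lines. This is a sketch under stated assumptions: scikit-learn is used, the data is a synthetic stand-in, and the t-SNE settings (`perplexity=30`, `init="pca"`) are illustrative choices rather than the settings behind the original plots.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the standardized patient features (assumption).
X_raw, _ = make_blobs(n_samples=200, centers=3, n_features=8, random_state=0)
X = StandardScaler().fit_transform(X_raw)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Linear projection: the first two principal components.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
print("Explained variance ratio:", pca.explained_variance_ratio_)

# Nonlinear embedding: t-SNE preserves local neighborhoods, not global distances.
X_tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0).fit_transform(X)
print("PCA shape:", X_pca.shape, "t-SNE shape:", X_tsne.shape)
```

Scatter-plotting `X_pca` and `X_tsne` colored by `labels` gives the two views compared here. Because t-SNE distorts global distances, overlap seen in PCA but not in t-SNE (or the reverse) should be interpreted cautiously.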
Hierarchical clustering (HC) produces a spatial pattern very similar to K-Means when plotted on the same principal components (explained variance: PC1 31.28%, PC2 18.37%).
HC Cluster 0 (Blue) aligns closely with K-Means Cluster 0 — the low-risk group — appearing distinctly on the left side of the PCA plot.
HC Cluster 1 (Orange) and HC Cluster 2 (Green) mirror the positions of K-Means Clusters 2 and 1, respectively.
As with K-Means, one cluster (the low-risk group) is clearly separated while the other two overlap somewhat, so both methods recover a similar subgroup structure.
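Because cluster numbers are arbitrary, the correspondence between HC and K-Means clusters (for example, HC Cluster 1 matching K-Means Cluster 2) can be checked with a contingency table rather than by eye. The sketch below assumes scikit-learn and SciPy, uses synthetic data, and assumes Ward linkage for the hierarchical model; the linkage used in the original analysis is not stated.

```python
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.cluster import contingency_matrix
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the standardized patient features (assumption).
X_raw, _ = make_blobs(n_samples=300, centers=3, n_features=8, random_state=0)
X = StandardScaler().fit_transform(X_raw)

km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
# Ward linkage is an assumption; other linkages may give different groupings.
hc_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

# Rows: HC clusters, columns: K-Means clusters; entries count shared samples.
table = contingency_matrix(hc_labels, km_labels)

# One-to-one matching that maximizes the total overlap between paired clusters.
hc_ids, km_ids = linear_sum_assignment(table, maximize=True)
mapping = dict(zip(hc_ids.tolist(), km_ids.tolist()))

ari = adjusted_rand_score(km_labels, hc_labels)
print("HC -> K-Means mapping:", mapping)
print(f"Adjusted Rand index: {ari:.3f}")
```

The mapping makes the label correspondence explicit, and the adjusted Rand index summarizes overall agreement in a way that does not depend on how the clusters happen to be numbered.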
Both methods agree on:
One clear low-risk group (Cluster 0 in both)
Two distinct high-risk clusters, reflecting clinical variation
K-Means: More effective at separating healthy individuals
Hierarchical Clustering: More effective at grouping diabetics into a focused risk cluster