Colorado vs Utah

KMeans Clustering
- Hospital Data -

Before Data Transformation Snapshot
This is the original hospital dataset. Its attributes include:

- Hospital Type, Hospital Ownership, ZIP Code, and various performance metrics. Many fields contain "Not Available" or missing values, so the data must be cleaned before modeling.
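The cleaning step described above could look roughly like the following. This is a minimal sketch, not the code used for this page: the sample rows and the "Score" column name are hypothetical, and dropping incomplete rows is only one option (imputing values would be another).

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; the column names are illustrative, not the real schema.
raw = pd.DataFrame({
    "ZIP Code": ["80012", "84101", "80203", "84604"],
    "Hospital Type": ["Acute Care", "Critical Access", "Acute Care", "Acute Care"],
    "Score": ["12.5", "Not Available", "9.1", None],
})

# Treat the literal string "Not Available" as a missing value.
cleaned = raw.replace("Not Available", np.nan)

# Convert the metric column to numbers; anything unparseable becomes NaN.
cleaned["Score"] = pd.to_numeric(cleaned["Score"], errors="coerce")

# Drop rows that still lack the metric (an assumption; imputation is an alternative).
cleaned = cleaned.dropna(subset=["Score"]).reset_index(drop=True)
```

After this step only rows with a usable numeric metric remain, which is what the scaler and KMeans need as input.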

After Data Transformation Snapshot
- This table shows the dataset after standardization with StandardScaler. All features (e.g., ZIP code, hospital count) are scaled to zero mean and unit variance. This matters for KMeans because it groups points by Euclidean distance, so an unscaled feature with a large numeric range would dominate the others.
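As a small illustration of what the standardization does, the sketch below scales a toy feature matrix. The numbers and the two feature meanings are made up for the example; only the use of scikit-learn's StandardScaler comes from this page.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical ZIP-level features: [hospital count, mean performance score].
X = np.array([
    [1.0, 10.0],
    [3.0, 14.0],
    [5.0, 18.0],
    [7.0, 22.0],
])

# Fit the scaler and transform in one step: each column is centered
# on its mean and divided by its (population) standard deviation.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Column-wise checks: means are ~0 and standard deviations are ~1.
col_means = X_scaled.mean(axis=0)
col_stds = X_scaled.std(axis=0)
```

Both columns end up on the same scale even though their original ranges differ.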

Results

The top plot shows Silhouette Scores for different cluster counts.
→ The best result is at k=5, with a high score (~0.904), indicating well-separated clusters.
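A silhouette sweep of this kind can be sketched as follows. The data here is synthetic (five artificial blobs standing in for the scaled hospital features), and the range of k values and the KMeans settings are assumptions, not the ones used for the plot.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in data: five tight, well-separated groups (an assumption).
centers = [[0, 0], [10, 0], [0, 10], [10, 10], [20, 5]]
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=0.5, random_state=0)

# Fit KMeans for each candidate k and record the mean silhouette score.
scores = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The k with the highest score is the candidate for the final model.
best_k = max(scores, key=scores.get)
```

Silhouette scores range from -1 to 1; values close to 1 mean points sit much nearer to their own cluster than to any other.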
The bottom plot shows a PCA-reduced 2D visualization of ZIP-level hospital data.
→ Clusters are distinct and partially aligned with state boundaries (CO vs. UT), suggesting regional patterns in hospital distribution.
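One way to produce the 2D coordinates and to check how clusters line up with states is sketched below. The feature values, the two-cluster setup, and the state labels are all synthetic placeholders chosen so the example runs on its own; they are not the page's data.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical standardized features for 40 ZIP codes (4 features each),
# generated so that the two states differ clearly.
co = rng.normal(loc=-2.0, scale=0.5, size=(20, 4))
ut = rng.normal(loc=2.0, scale=0.5, size=(20, 4))
X = np.vstack([co, ut])
states = np.array(["CO"] * 20 + ["UT"] * 20)

# Cluster in the full feature space.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Project to two principal components for plotting only.
coords = PCA(n_components=2).fit_transform(X)

# Count how many ZIP codes of each state fall into each cluster.
overlap = pd.crosstab(states, labels, rownames=["state"], colnames=["cluster"])
```

The `coords` array can be passed to any scatter plot, colored by `labels`. The `overlap` table gives a numeric view of the alignment with state boundaries that the plot suggests visually.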
