Before Data Transformation Snapshot
This snapshot shows the original hospital dataset, with attributes such as Hospital Type, Hospital Ownership, and ZIP Code alongside various performance metrics. Many fields contain the placeholder "Not Available" or are missing entirely, so the data must be cleaned before modeling.
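A minimal sketch of the kind of cleaning this implies, assuming the data is loaded into a pandas DataFrame. The helper name `clean_hospital_frame`, the `Score` column, and the sample rows are hypothetical and stand in for the real dataset:

```python
import numpy as np
import pandas as pd

def clean_hospital_frame(df, metric_columns):
    """Turn "Not Available" placeholders into NaN and make metrics numeric.

    Hypothetical helper for illustration; the column names passed in
    are assumptions, not the real dataset's schema.
    """
    # Replace the literal placeholder string with a proper missing value.
    cleaned = df.replace("Not Available", np.nan)
    for col in metric_columns:
        # Anything that cannot be parsed as a number becomes NaN.
        cleaned[col] = pd.to_numeric(cleaned[col], errors="coerce")
    return cleaned

# Made-up sample rows, not taken from the actual dataset.
sample = pd.DataFrame({
    "Hospital Type": ["Acute Care", "Critical Access", "Acute Care"],
    "Hospital Ownership": ["Government", "Not Available", "Proprietary"],
    "ZIP Code": ["80012", "84101", "80203"],
    "Score": ["12.5", "Not Available", "9.1"],
})

cleaned = clean_hospital_frame(sample, metric_columns=["Score"])
print(cleaned.isna().sum())
```

Whether rows with missing values are then dropped or imputed is a separate decision that this sketch leaves open.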
After Data Transformation Snapshot
This table shows the dataset after standardization with StandardScaler. All features (e.g., ZIP code, hospital count) are rescaled to mean 0 and unit variance. This matters for distance-based clustering models like KMeans: without scaling, features with large numeric ranges would dominate the distance calculation, so the model would not treat features equally.
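As a rough illustration of that step, the sketch below applies scikit-learn's StandardScaler to a small made-up feature matrix. The two columns (a numeric ZIP code and a hospital count) and their values are assumptions chosen only to show the effect:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up rows: [ZIP code as a number, hospital count]. Illustrative only.
features = np.array([
    [80012, 3],
    [80203, 1],
    [84101, 5],
    [84604, 2],
], dtype=float)

scaler = StandardScaler()
scaled = scaler.fit_transform(features)

# After scaling, each column has mean ~0 and standard deviation ~1,
# so neither column dominates Euclidean distances by magnitude alone.
print(scaled.mean(axis=0))
print(scaled.std(axis=0))
```

StandardScaler uses the population standard deviation (ddof=0), which is why `scaled.std(axis=0)` with NumPy's default comes out at 1.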
Results
The top plot shows the silhouette score for each candidate number of clusters (k).
→ The score peaks at k=5 (~0.904). Silhouette scores range from -1 to 1, so a value this close to 1 indicates compact, well-separated clusters.
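A sweep of this kind can be sketched as follows. The data here is synthetic (generated with `make_blobs`), and the helper name `silhouette_by_k` and the range of k values are assumptions, so the scores it prints will not match the ~0.904 reported above:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the scaled ZIP-level features.
X, _ = make_blobs(n_samples=300, centers=5, cluster_std=0.5, random_state=0)

def silhouette_by_k(X, k_values, random_state=0):
    """Fit KMeans for each k and return {k: silhouette score}."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return scores

# The silhouette score needs at least 2 clusters, so the sweep starts at k=2.
scores = silhouette_by_k(X, range(2, 9))
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("best k:", best_k)
```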
The bottom plot shows the ZIP-level hospital data projected onto two dimensions with PCA.
→ The clusters are distinct and partially align with state boundaries (CO vs. UT), suggesting regional patterns in hospital distribution.
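For reference, a projection like this could be produced along the following lines. This is a sketch on synthetic data with four made-up features; the number of features, the cluster count passed to KMeans, and all variable names are assumptions:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for ZIP-level features (4 columns, illustrative only).
X, _ = make_blobs(n_samples=200, n_features=4, centers=5, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Cluster in the full feature space, then project to 2D for plotting only.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X_scaled)

pca = PCA(n_components=2)
coords = pca.fit_transform(X_scaled)

# Share of total variance retained by the two plotted components.
print(pca.explained_variance_ratio_.sum())
```

Each row of `coords` can then be drawn as a point colored by its entry in `labels`. Because the 2D view discards some variance, clusters that overlap in the plot may still be separated in the original feature space.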