Before Data Transformation Snapshot
This is the original hospital dataset, containing attributes such as Hospital Type, Hospital Ownership, ZIP Code, and various performance metrics. Many fields contain "Not Available" or are missing entirely, so the data must be cleaned before modeling.
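As a rough illustration of that cleaning step, the sketch below converts "Not Available" and empty strings into None so that later code can treat them uniformly as missing. It uses only the Python standard library. The sample rows, the Rating column, and the clean_rows helper are made up for the example and are not taken from the actual dataset.

```python
import csv
import io

# Strings treated as missing. "Not Available" comes from the dataset
# description; the empty string is an assumption.
MISSING_MARKERS = {"", "Not Available"}


def clean_rows(csv_text):
    """Parse CSV text and replace missing-value markers with None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cleaned = []
    for row in reader:
        cleaned_row = {}
        for key, value in row.items():
            value = (value or "").strip()
            cleaned_row[key] = None if value in MISSING_MARKERS else value
        cleaned.append(cleaned_row)
    return cleaned


# Hypothetical sample; the Rating column is illustrative only.
sample = """Hospital Type,Hospital Ownership,ZIP Code,Rating
Acute Care Hospitals,Proprietary,84101,4
Critical Access Hospitals,Government - Local,80202,Not Available
"""

rows = clean_rows(sample)
for row in rows:
    print(row)
```

Once missing values are marked consistently, rows or columns with too many gaps can be dropped or imputed, depending on what the modeling step requires.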
After Data Transformation Snapshot
This table shows the dataset used for regression modeling after preprocessing. It includes:
ZIP_Code: Treated as a numerical feature.
State_UT: A boolean feature indicating whether the hospital is in Utah (True) or Colorado (False), derived via one-hot encoding.
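The encoding described above might look roughly like the following sketch. It assumes the raw records carry a "State" column holding either "UT" or "CO"; that column name and the encode_features helper are hypothetical. With only two states, a single boolean column is enough, because Colorado is implied whenever State_UT is False.

```python
def encode_features(records):
    """Return rows with a numeric ZIP_Code and a boolean State_UT column."""
    encoded = []
    for rec in records:
        encoded.append({
            # ZIP code cast to an integer so it can be used as a numeric input.
            "ZIP_Code": int(rec["ZIP Code"]),
            # True for Utah, False for Colorado (the only other state assumed).
            "State_UT": rec["State"] == "UT",
        })
    return encoded


# Hypothetical input records, not taken from the real dataset.
records = [
    {"ZIP Code": "84101", "State": "UT"},
    {"ZIP Code": "80202", "State": "CO"},
]

features = encode_features(records)
for row in features:
    print(row)
```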
Results
Linear Regression
RMSE: 0.304
R²: 0.036 → Very low explanatory power (about 3.6% of the variance in the target is explained)
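For reference, both metrics can be computed from the true and predicted values as in the minimal sketch below. The numbers are toy values, not the hospital data. The example also shows why an R² near zero is a weak result: a model that always predicts the mean of the target scores exactly 0.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: the typical size of a prediction error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Share of the target's variance explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy target values, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]

# Baseline that ignores all features and always predicts the mean.
baseline = [sum(y_true) / len(y_true)] * len(y_true)

print("baseline RMSE:", rmse(y_true, baseline))
print("baseline R^2:", r_squared(y_true, baseline))
```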
Ridge Regression (α = 1000)
RMSE: 0.306
R²: 0.024 → Slightly worse than Linear Regression (about 2.4% of the variance explained)
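One plausible reason Ridge does not improve on Linear Regression here is that a large α shrinks the coefficients toward zero, pulling the predictions toward the mean of the target. The sketch below shows this effect for a single feature using the closed-form ridge solution with an unpenalized intercept. The data are toy values and ridge_fit_1d is a hypothetical helper; how the original models were actually fitted (for example, whether the features were scaled) is not stated in the source.

```python
def ridge_fit_1d(x, y, alpha):
    """Fit y ~ w * x + b, penalizing w ** 2 with weight alpha (not b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    w = s_xy / (s_xx + alpha)   # alpha = 0 gives ordinary least squares
    b = mean_y - w * mean_x
    return w, b


# Toy data lying exactly on the line y = 2x + 1.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]

w_ols, b_ols = ridge_fit_1d(x, y, alpha=0.0)
w_ridge, b_ridge = ridge_fit_1d(x, y, alpha=1000.0)

print("alpha=0:    slope", w_ols, "intercept", b_ols)
print("alpha=1000: slope", w_ridge, "intercept", b_ridge)
```

With α = 1000 the slope is almost zero and the intercept is close to the mean of y, so the fitted model behaves much like the constant baseline.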
Interpretation:
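Both models perform poorly: their R² values are close to zero, so they predict little better than a constant equal to the mean of the target.
The features used (ZIP code and state) lack sufficient explanatory power to meaningfully predict hospital count. A ZIP code is essentially a categorical label, so treating it as a number gives a linear model little usable signal.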
More informative variables (e.g., population, urbanization, healthcare demand) are likely needed.