Colorado vs Utah

Regression (Linear & Ridge)
- Hospital Data -

Before Data Transformation Snapshot
This is the original hospital dataset. Its attributes include Hospital Type, Hospital Ownership, ZIP Code, and various performance metrics. Many fields contain "Not Available" or missing values, so the data must be cleaned before modeling.
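The cleaning step described above can be sketched roughly as follows. This is a minimal illustration, not the exact preprocessing used here: the rows, the "State" and "Hospital overall rating" columns, and the choice to drop incomplete rows are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical rows standing in for the raw hospital table; the column
# names and values are illustrative, not taken from the real dataset.
raw = pd.DataFrame({
    "Hospital Type": ["Acute Care Hospitals", "Critical Access Hospitals", "Acute Care Hospitals"],
    "State": ["CO", "UT", "UT"],
    "ZIP Code": ["80204", "84101", "Not Available"],
    "Hospital overall rating": ["4", "Not Available", "3"],
})

# Turn the "Not Available" placeholder into a real missing value.
cleaned = raw.replace("Not Available", np.nan)

# Convert columns that should be numeric; anything unparseable becomes NaN.
for col in ["ZIP Code", "Hospital overall rating"]:
    cleaned[col] = pd.to_numeric(cleaned[col], errors="coerce")

# Assumed policy: drop rows still missing a value needed for modeling.
model_ready = cleaned.dropna(subset=["ZIP Code", "Hospital overall rating"])

print(cleaned.isna().sum())
print(len(model_ready), "of", len(raw), "rows remain")
```

Dropping rows is only one option; imputing missing values would keep more data but adds its own assumptions.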

After Data Transformation Snapshot
This table shows the dataset used for regression modeling after preprocessing. It includes:

- ZIP_Code: Treated as a numerical feature.
- State_UT: A boolean feature indicating whether the hospital is in Utah (True) or Colorado (False), derived via one-hot encoding.
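One way to arrive at these two features with pandas is sketched below. The input rows and the "State" column name are assumptions for illustration; only ZIP_Code and State_UT come from the table above.

```python
import pandas as pd

# Hypothetical cleaned rows; the "State" column name is an assumption.
cleaned = pd.DataFrame({
    "ZIP_Code": ["80204", "84101", "84790", "81501"],
    "State": ["CO", "UT", "UT", "CO"],
})

# Treat the ZIP code as a plain number, as the modeling table does.
cleaned["ZIP_Code"] = cleaned["ZIP_Code"].astype(int)

# One-hot encode the state. With only CO and UT present, drop_first=True
# drops State_CO (first alphabetically) and keeps a single State_UT column.
features = pd.get_dummies(cleaned, columns=["State"], drop_first=True, dtype=bool)

print(features)
```

Keeping only State_UT avoids carrying two perfectly redundant columns, since a hospital that is not in Utah is in Colorado.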

Results

- Linear Regression
  - RMSE: 0.304
  - R²: 0.036 → Very low explanatory power
- Ridge Regression (α = 1000)
  - RMSE: 0.306
  - R²: 0.024 → Even lower performance than Linear Regression
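For reference, a comparison like the one above could be run with scikit-learn along these lines. This is a sketch under stated assumptions: the data is synthetic, the target is placeholder noise, and the 80/20 train/test split is a guess, so the printed numbers will not match the reported results.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: the real features and target are not reproduced here.
rng = np.random.default_rng(0)
n = 200
state_ut = rng.integers(0, 2, size=n).astype(bool)
# Colorado ZIP codes start with 80-81, Utah ZIP codes with 84.
zip_code = np.where(
    state_ut,
    rng.integers(84000, 84800, size=n),
    rng.integers(80000, 81700, size=n),
)
X = np.column_stack([zip_code, state_ut]).astype(float)
y = rng.normal(loc=1.0, scale=0.3, size=n)  # placeholder target, mostly noise

# Assumed evaluation setup: hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {"linear": LinearRegression(), "ridge": Ridge(alpha=1000)}
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    results[name] = {"rmse": rmse, "r2": float(r2_score(y_test, pred))}
    print(f"{name}: RMSE={rmse:.3f}, R2={results[name]['r2']:.3f}")
```

Note that the Ridge penalty depends on feature scale, so a raw ZIP code in the tens of thousands and a 0/1 state flag are shrunk very differently unless the features are standardized first.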

Interpretation:
- Both models perform poorly: with R² values of 0.036 and 0.024, each explains less than 4% of the variance in the target.
- The features used (ZIP code and state) lack sufficient explanatory power to meaningfully predict hospital count.
- More informative variables (e.g., population, urbanization, healthcare demand) are likely needed.
