Model Training :
Features and Target: The features ('First Dose Administered' and 'Second Dose Administered') and target ('Total Doses Administered') are defined.
Train-Test Split: The data is split into training and testing sets with a test size of 20% and a random seed for reproducibility.
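Feature Scaling: StandardScaler is used to scale the features to have a mean of 0 and a standard deviation of 1. This matters for regularized models like Ridge regression because the penalty acts on the size of the coefficients, and coefficient size depends on the scale of each feature; without scaling, features measured on larger scales are penalized less.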
Ridge Regression: A Ridge regression model is created and trained on the scaled training data with a regularization strength (alpha) of 0.5.
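The steps above can be sketched roughly as follows. This is a minimal illustration, not the original notebook code: the data is synthetic, the seed value 42 is an assumption (the text only says a seed was used), and fitting the scaler on the training split only is an assumed, though common, choice.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (made up for illustration; not the original dataset)
rng = np.random.default_rng(0)
first = rng.integers(1_000, 100_000, size=200).astype(float)
second = (first * rng.uniform(0.1, 0.6, size=200)).round()
df = pd.DataFrame({
    "First Dose Administered": first,
    "Second Dose Administered": second,
    "Total Doses Administered": first + second,
})

# Features and target, using the column names from the text
X = df[["First Dose Administered", "Second Dose Administered"]]
y = df["Total Doses Administered"]

# 80/20 split; random_state=42 is an assumed seed
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training split only, then reuse it for the test split
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Ridge regression with regularization strength alpha = 0.5
model = Ridge(alpha=0.5)
model.fit(X_train_scaled, y_train)
print(f"Test R^2: {model.score(X_test_scaled, y_test):.4f}")
```

Any new data passed to `model.predict` later has to go through the same fitted `scaler` first, since the model only ever saw standardized features.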
User Interaction :
A function (`predict_total_doses`) is defined to allow users to input the number of first and second doses administered and get a prediction for the total doses administered.
Model Evaluation :
Mean Squared Error (MSE) : The MSE is 3.14e-05, in squared units of the target. MSE is scale-dependent, so this value shows that errors are small only when compared with the spread of the target (see the last metric below).
R-squared Score (R^2) : The R-squared score is 0.9997, meaning the model explains about 99.97% of the variance in the total doses administered.
Root Mean Squared Error (RMSE) : The RMSE is 0.0056, the square root of the MSE. It is expressed in the same units as the target and represents the typical size of a prediction error.
RMSE as a Percentage of Standard Deviation : The RMSE is only 1.64% of the standard deviation of the target variable, so the typical error is small relative to the natural variability in the data.
Overall, the Ridge regression model shows strong predictive accuracy: it explains nearly all of the variance in the target, and its typical error is under 2% of the target's standard deviation. A score this high is expected if the total is close to the sum of the first and second doses, because the target is then almost a linear function of the two features.
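The four metrics can be computed from the true and predicted targets as in the sketch below. The helper name `regression_report` and the toy values are hypothetical, chosen only for illustration. When the standard deviation is taken over the same evaluated targets (with `ddof=0`), RMSE divided by the standard deviation equals sqrt(1 - R^2), which is why the reported 1.64% and R^2 of 0.9997 agree up to rounding.

```python
import numpy as np

def regression_report(y_true, y_pred):
    """Return MSE, RMSE, R^2, and RMSE as a percentage of the target's std."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Mean of squared residuals, and its square root
    mse = np.mean((y_true - y_pred) ** 2)
    rmse = np.sqrt(mse)

    # R^2 = 1 - SS_res / SS_tot
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    # Population standard deviation (ddof=0) of the evaluated targets
    rmse_pct_of_std = 100.0 * rmse / y_true.std()
    return {"mse": mse, "rmse": rmse, "r2": r2, "rmse_pct_of_std": rmse_pct_of_std}

# Toy values, made up for illustration
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
report = regression_report(y_true, y_pred)
print(report)
```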
Code for Prediction :
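import pandas as pd

def predict_total_doses():
    """Prompt for first- and second-dose counts and print the predicted total."""
    try:
        first_dose = float(input("Enter the number of First Dose Administered: "))
        second_dose = float(input("Enter the number of Second Dose Administered: "))
        # One-row frame with the same column names used in training
        new_data = pd.DataFrame({'First Dose Administered': [first_dose], 'Second Dose Administered': [second_dose]})
        # The model was trained on scaled features, so apply the same scaler here
        # (assumes the StandardScaler fitted during training is available as `scaler`)
        new_data_scaled = scaler.transform(new_data)
        predicted_doses = model.predict(new_data_scaled)
        print(f'Predicted Total Doses Administered: {predicted_doses}')
    except ValueError:
        # float() raises ValueError on non-numeric input
        print("Invalid input. Please enter numerical values.")

# Call the prediction function
predict_total_doses()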
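Output :
Enter the number of First Dose Administered: 100
Enter the number of Second Dose Administered: 50
Predicted Total Doses Administered: [31.24253365]
Note: this value is far from 100 + 50 = 150. That is consistent with raw inputs having been passed to a model trained on standardized features, so it should not be read as a realistic dose count. New inputs need to go through the same fitted scaler before prediction.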
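One way to make it harder to forget the scaling step at prediction time is to bundle the scaler and the model into a single scikit-learn pipeline, so raw dose counts can be passed directly. The sketch below is an alternative design, not the code used above: the training data is synthetic, and `FEATURES` and `predict_total` are hypothetical names introduced for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = ["First Dose Administered", "Second Dose Administered"]

# Synthetic training data (made up; stands in for the real dataset)
rng = np.random.default_rng(1)
X_train = pd.DataFrame(rng.uniform(0, 1_000, size=(300, 2)), columns=FEATURES)
y_train = X_train[FEATURES[0]] + X_train[FEATURES[1]]

# The pipeline applies the training-set scaling before every predict call
pipeline = make_pipeline(StandardScaler(), Ridge(alpha=0.5))
pipeline.fit(X_train, y_train)

def predict_total(first_dose, second_dose, fitted_pipeline=pipeline):
    """Predict the total from raw (unscaled) dose counts."""
    row = pd.DataFrame([[first_dose, second_dose]], columns=FEATURES)
    return float(fitted_pipeline.predict(row)[0])

print(f"Predicted total for (100, 50): {predict_total(100, 50):.1f}")
```

Because this function takes arguments instead of calling `input()`, it can also be checked in a test. On this synthetic data, where the total is exactly the sum of the two features, the prediction lands close to 150, with a small offset caused by the Ridge penalty.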
Learnings From This Activity :
Building this model gave practical insight into the machine learning workflow. It showed the importance of data preprocessing and feature selection, and how the choice of algorithm affects model performance. Iterative testing and validation are how hyperparameters get tuned and how overfitting and underfitting are detected. Cross-validation is needed to check that a model generalizes reliably. Domain knowledge is critical for interpreting results and for deciding how to improve accuracy. Overall, model building is a holistic exercise that develops problem-solving skills and technical proficiency in building robust predictive models.
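As a rough illustration of the hyperparameter tuning and cross-validation mentioned above, the sketch below searches over a few values of alpha with 5-fold cross-validation. It was not part of the original activity: the data is synthetic, and the candidate alphas and the number of folds are arbitrary choices made for the example.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (made up): two features and a noisy linear target
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
y = X[:, 0] + X[:, 1] + rng.normal(scale=0.1, size=200)

# Scaling sits inside the pipeline so it is refit on each training fold
pipeline = make_pipeline(StandardScaler(), Ridge())

# Candidate alphas are arbitrary illustrative values
param_grid = {"ridge__alpha": [0.01, 0.1, 0.5, 1.0, 10.0]}

search = GridSearchCV(
    pipeline, param_grid, cv=5, scoring="neg_root_mean_squared_error"
)
search.fit(X, y)

print("Best alpha:", search.best_params_["ridge__alpha"])
print("Cross-validated RMSE:", -search.best_score_)
```

Keeping the scaler inside the pipeline means each fold is scaled using only its own training portion, which avoids leaking information from the held-out fold.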