The assignment trained four machine learning models and evaluated how well each predicts purchase intentions. The four models chosen were a Decision Tree, a Random Forest, XGBoost and a Support Vector Machine (SVM), each with its own strengths in a classification task. Evaluation metrics and learning curves were used to compare performance and to identify potential overfitting or underfitting.
Key findings include:
The Random Forest model consistently outperformed the other models, achieving the highest score on every metric.
The Decision Tree and XGBoost models performed relatively well and followed much the same trend, falling slightly short of the Random Forest.
The SVM model consistently underperformed, indicating that it is less suitable for the task.
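The comparison behind these findings can be sketched roughly as follows. This is a minimal illustration and not the assignment's actual code: the data is synthetic (generated with make_classification), the four metrics shown are an assumed choice, and scikit-learn's GradientBoostingClassifier is swapped in for XGBoost so that the sketch needs only scikit-learn. The scores it prints say nothing about the results reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the survey data, with a binary
# purchase-intention label. The real features are not reproduced here.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Assumption: default hyperparameters; gradient boosting stands in for XGBoost.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Gradient Boosting (XGBoost stand-in)": GradientBoostingClassifier(random_state=42),
    "SVM": SVC(random_state=42),
}


def evaluate(model):
    """Fit on the training split and score predictions on the test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }


results = {name: evaluate(model) for name, model in models.items()}
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Scoring every model on the same held-out split keeps the comparison like for like. With a small sample, cross-validated scores would give a steadier ranking than a single split.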
All four models demonstrated a clear learning trend, as seen in the upward slope of the testing score and the narrowing gap between the two curves. However, because the sample size was limited, the curves did not fully converge. This suggests that the models were able to learn from the data but would need a larger sample to reach optimal performance. The Random Forest model was ultimately chosen for its superior performance on the evaluation metrics, and it was saved to a pickle file so that it can be used within an application.
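As a rough sketch of the two steps described above, the snippet below computes a learning curve for a Random Forest and then saves the fitted model with pickle. The data is synthetic, the hyperparameters are assumed, and the file name purchase_intention_rf.pkl is hypothetical; none of these come from the assignment itself.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import learning_curve

# Assumption: synthetic stand-in for the survey data.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
model = RandomForestClassifier(n_estimators=50, random_state=0)

# Score the model on growing subsets of the training data using 5-fold CV.
train_sizes, train_scores, test_scores = learning_curve(
    model, X, y, cv=5, train_sizes=np.linspace(0.2, 1.0, 5), scoring="accuracy"
)

# A gap that shrinks as the sample grows but never closes is the pattern
# described in the text: the model is learning, but more data would help.
gap = train_scores.mean(axis=1) - test_scores.mean(axis=1)
for n, g in zip(train_sizes, gap):
    print(f"train size {n}: train/test gap {g:.3f}")

# Fit on all the data, save to a pickle file, and reload it as an app would.
model.fit(X, y)
model_path = Path(tempfile.gettempdir()) / "purchase_intention_rf.pkl"  # hypothetical name
with open(model_path, "wb") as f:
    pickle.dump(model, f)
with open(model_path, "rb") as f:
    restored = pickle.load(f)

same_predictions = np.array_equal(restored.predict(X), model.predict(X))
print("restored model matches original:", same_predictions)
```

A pickle file should only be loaded from a trusted source and with the same library version it was saved with, since unpickling executes code and the format is not guaranteed to be stable across scikit-learn releases.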
Recommendations:
The Random Forest model can be used to predict customers' purchase intentions.
The store can use these predictions to target customers with tailored marketing strategies and to optimize its marketing efforts.
In Soweto, the store should focus on perceived value and product quality.
The store should offer competitive pricing to lower-income residents of Soweto.
The store should tailor its products to its customers' needs.
Please see the following link for the code for this assignment and a more extensive interpretation: