We now present the results of all the experiments we conducted for predictive data mining.
From the results below, the accuracy for Python (63.37%) is higher than for RapidMiner. The difference between Python and RapidMiner is small; however, Python still predicted more precisely for experiment 1 with a 60:40 ratio.
For the 80:20 ratio, RapidMiner (60.71%) has better accuracy than Python (60.08%). Comparing the 60:40 and 80:20 ratios, the 80:20 ratio performs slightly better. The class precision for both classes in RapidMiner is also better than the overall precision of the prediction in Python.
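The 60:40 and 80:20 accuracies on the Python side could be computed along the following lines. This is a minimal sketch: the report's dataset is not shown here, so synthetic placeholder data and scikit-learn's GaussianNB are assumptions, not the report's actual pipeline.

```python
# Sketch of evaluating a Naive Bayes classifier under 60:40 and 80:20
# train/test splits. The data and the choice of GaussianNB are assumptions;
# the report's health-level dataset is not reproduced here.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Placeholder data standing in for the health-level dataset.
X, y = make_classification(n_samples=500, n_features=8, random_state=42)

for test_size, label in [(0.40, "60:40"), (0.20, "80:20")]:
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=42
    )
    model = GaussianNB().fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{label} split accuracy: {acc:.2%}")
```

Note that `test_size` expresses the held-out fraction, so a 60:40 ratio corresponds to `test_size=0.40` and 80:20 to `test_size=0.20`.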
For the 60:40 ratio, the accuracy for Python (57.41%) is lower than for RapidMiner (59.67%). Based on this accuracy comparison, RapidMiner clearly performs better and was able to predict the health level more precisely than Python.
As we can see from the results below, for the 80:20 ratio Python is the better model: it has an accuracy of 59.26%, while RapidMiner has an accuracy of 58.68%. Based on this performance result, Python's precision in prediction is higher than RapidMiner's.
From the results below, the accuracy of RapidMiner in predicting the health level (59.88%) is higher than Python's (56.68%). Hence, for this experiment and ratio, RapidMiner is the best model.
From the results below, the accuracy for Python (61.73%) is higher than for RapidMiner (58.68%), so Python is the better model in this case.
From the results below, the model in RapidMiner has the best accuracy at 65.12%, while Python only reaches 58.85%. Both RapidMiner and Python are precise in predicting the Not Healthy class.
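Per-class precision for a label such as Not Healthy can be computed as follows on the Python side. The labels and predictions here are illustrative placeholders, not the report's outputs; `precision_score` from scikit-learn is an assumed choice.

```python
# Sketch of per-class precision: of all predictions of "Not Healthy",
# how many were actually "Not Healthy"? Data here is a toy placeholder.
from sklearn.metrics import precision_score

y_true = ["Healthy", "Not Healthy", "Not Healthy", "Healthy", "Not Healthy"]
y_pred = ["Healthy", "Not Healthy", "Healthy",     "Healthy", "Not Healthy"]

prec = precision_score(y_true, y_pred, pos_label="Not Healthy")
print(f"Not Healthy precision: {prec:.2%}")  # 2 correct of 2 predicted -> 100.00%
```

High precision on one class, as reported for Not Healthy above, does not by itself imply high overall accuracy, which is why both metrics are worth reporting.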
For the 80:20 ratio, the outcome is the same as for the 60:40 ratio: RapidMiner is the better model for predicting the health level using Naive Bayes. Compared with the results before tuning, RapidMiner's accuracy clearly increased by a significant margin, but for Python the accuracy became lower after tuning.
After tuning, the accuracies for Python and RapidMiner are not very different, and RapidMiner is still the best model. Overall, tuning did not meet our expectation of optimizing the models and increasing their accuracy.
For this ratio, the model accuracy still did not meet our expectation. We expected the accuracy to increase after hyperparameter tuning; however, it remained around 59.26% and 58.68% respectively. Based on the results, the best Naive Bayes model for this experiment is still RapidMiner.
After tuning, the accuracy of the Naive Bayes models in RapidMiner and Python did not change much. Once again, this did not meet our expectation. The Naive Bayes model in Python has an accuracy of 56.58%, while the model in RapidMiner has an accuracy of 59.88%. For this ratio, RapidMiner is the best model.
For the 80:20 ratio, after hyperparameter tuning, the model accuracy in RapidMiner did not change much. However, Python's performance dropped significantly, from 61.73% to 55.97%. Overall, the best model for this case is RapidMiner: with an accuracy of 58.68%, it was able to predict more precisely than Python.
After running several models, we found that the Naive Bayes results were not very high in any experiment, so we performed hyperparameter tuning to see whether the accuracy would improve. After tuning with Grid Search, we found that the accuracies mostly decreased rather than increased. We can therefore conclude that these Naive Bayes models are not well suited to hyperparameter tuning with Grid Search.
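The Grid Search step described above could be sketched as follows. The parameter grid and data are assumptions (the report does not show its grid); for scikit-learn's GaussianNB, `var_smoothing` is effectively the only tunable hyperparameter, which is one reason grid search often yields little or no gain for this model family.

```python
# Sketch of hyperparameter tuning a Gaussian Naive Bayes model with Grid
# Search. The dataset and the var_smoothing grid are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB

# Placeholder data standing in for the health-level dataset (80:20 split).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42
)

# Search var_smoothing over several orders of magnitude with 5-fold CV.
param_grid = {"var_smoothing": np.logspace(-12, 0, 13)}
search = GridSearchCV(GaussianNB(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print("Best var_smoothing:", search.best_params_["var_smoothing"])
print(f"Tuned test accuracy: {search.score(X_test, y_test):.2%}")
```

Because cross-validated grid search selects the parameter that maximizes validation accuracy, a drop on the held-out test set after tuning, as observed here, usually indicates the grid offered no setting meaningfully better than the default for this data.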