We now present the results of all the experiments we conducted for predictive data mining.
From the results below, the accuracy for Python (63.37%) is higher than for RapidMiner. The difference between Python and RapidMiner is small; however, Python still predicted more precisely for experiment 1 with a 60:40 ratio.
For the 80:20 ratio, RapidMiner (60.71%) has better accuracy than Python (60.08%). Comparing the 60:40 and 80:20 ratios, the 80:20 ratio performs slightly better. The class precision for both classes in RapidMiner is also better than the overall precision of the prediction in Python.
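The 60:40 and 80:20 accuracies on the Python side could be computed along the following lines. This is a minimal sketch: the report's dataset is not shown here, so synthetic placeholder data and scikit-learn's GaussianNB are assumptions, not the report's actual pipeline.

```python
# Sketch of evaluating a Naive Bayes classifier under 60:40 and 80:20
# train/test splits. The data and the choice of GaussianNB are assumptions;
# the report's health-level dataset is not reproduced here.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Placeholder data standing in for the health-level dataset.
X, y = make_classification(n_samples=500, n_features=8, random_state=42)

for test_size, label in [(0.40, "60:40"), (0.20, "80:20")]:
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=42
    )
    model = GaussianNB().fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{label} split accuracy: {acc:.2%}")
```

Note that `test_size` expresses the held-out fraction, so a 60:40 ratio corresponds to `test_size=0.40` and 80:20 to `test_size=0.20`.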
For the 60:40 ratio, the accuracy for Python (57.41%) is lower than for RapidMiner (59.67%). Based on this accuracy comparison, RapidMiner clearly performs better and was able to predict the health level more precisely than Python.
As we can see from the results below, for the 80:20 ratio Python is the better model: it has an accuracy of 59.26%, while RapidMiner has an accuracy of 58.68%. Based on this performance result, Python's precision in prediction is higher than RapidMiner's.
From the results below, the accuracy of RapidMiner in predicting the health level (59.88%) is higher than Python's (56.68%). Hence, for this experiment and ratio, RapidMiner is the best model.
From the results below, the accuracy for Python (61.73%) is higher than for RapidMiner (58.68%), so Python is the better model in this case.
From the results below, the model in RapidMiner has the best accuracy at 65.12%, while Python only reaches 58.85%. Both RapidMiner and Python are precise in predicting the Not Healthy class.
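Per-class precision for a label such as Not Healthy can be computed as follows on the Python side. The labels and predictions here are illustrative placeholders, not the report's outputs; `precision_score` from scikit-learn is an assumed choice.

```python
# Sketch of per-class precision: of all predictions of "Not Healthy",
# how many were actually "Not Healthy"? Data here is a toy placeholder.
from sklearn.metrics import precision_score

y_true = ["Healthy", "Not Healthy", "Not Healthy", "Healthy", "Not Healthy"]
y_pred = ["Healthy", "Not Healthy", "Healthy",     "Healthy", "Not Healthy"]

prec = precision_score(y_true, y_pred, pos_label="Not Healthy")
print(f"Not Healthy precision: {prec:.2%}")  # 2 correct of 2 predicted -> 100.00%
```

High precision on one class, as reported for Not Healthy above, does not by itself imply high overall accuracy, which is why both metrics are worth reporting.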
For the 80:20 ratio, the outcome is the same as for the 60:40 ratio: RapidMiner is the better model for predicting the health level using Naive Bayes. Compared with the results before tuning, RapidMiner's accuracy clearly increased by a significant margin, but for Python the accuracy became lower after tuning.
After tuning, the accuracies for Python and RapidMiner are not very different, and RapidMiner is still the best model. Overall, tuning did not meet our expectation of optimizing the models and increasing their accuracy.
For this ratio, the model accuracy still did not meet our expectation. We expected the accuracy to increase after hyperparameter tuning; however, it remained around 59.26% and 58.68% respectively. Based on the results, the best Naive Bayes model for this experiment is still RapidMiner.
After tuning, the accuracy of the Naive Bayes models in RapidMiner and Python did not change much. Once again, this did not meet our expectation. The Naive Bayes model in Python has an accuracy of 56.58%, while the model in RapidMiner has an accuracy of 59.88%. For this ratio, RapidMiner is the best model.
For the 80:20 ratio, after hyperparameter tuning, the model accuracy in RapidMiner did not change much. However, Python's performance dropped significantly, from 61.73% to 55.97%. Overall, the best model for this case is RapidMiner: with an accuracy of 58.68%, it was able to predict more precisely than Python.
After running several models, we found that the Naive Bayes results were not very high in any experiment, so we performed hyperparameter tuning to see whether the accuracy would improve. After tuning with Grid Search, we found that the accuracies mostly decreased rather than increased. We can therefore conclude that these Naive Bayes models are not well suited to hyperparameter tuning with Grid Search.
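The Grid Search step described above could be sketched as follows. The parameter grid and data are assumptions (the report does not show its grid); for scikit-learn's GaussianNB, `var_smoothing` is effectively the only tunable hyperparameter, which is one reason grid search often yields little or no gain for this model family.

```python
# Sketch of hyperparameter tuning a Gaussian Naive Bayes model with Grid
# Search. The dataset and the var_smoothing grid are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB

# Placeholder data standing in for the health-level dataset (80:20 split).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42
)

# Search var_smoothing over several orders of magnitude with 5-fold CV.
param_grid = {"var_smoothing": np.logspace(-12, 0, 13)}
search = GridSearchCV(GaussianNB(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print("Best var_smoothing:", search.best_params_["var_smoothing"])
print(f"Tuned test accuracy: {search.score(X_test, y_test):.2%}")
```

Because cross-validated grid search selects the parameter that maximizes validation accuracy, a drop on the held-out test set after tuning, as observed here, usually indicates the grid offered no setting meaningfully better than the default for this data.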