We used the same machine learning models, namely Decision Tree, Naive Bayes, and Random Forest, in all experiments. As shown in the figure above, model performance was evaluated in RapidMiner using three dataset split ratios. Two performance metrics, accuracy and classification error, were recorded for every model. These metrics were chosen because they are the most suitable for measuring how well the models solve our prediction problem.
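To make the idea of a training-to-testing split ratio concrete, the following is a minimal Python sketch of a shuffled split. It is not the RapidMiner process used in the experiments; the function name `split_dataset`, the fixed seed, and the placeholder rows are assumptions made only for illustration.

```python
import random


def split_dataset(rows, train_fraction, seed=42):
    """Shuffle rows and split them into (train, test) by train_fraction."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical stand-in for the student records: 100 placeholder row ids.
rows = list(range(100))
train_80, test_20 = split_dataset(rows, 0.8)  # 80:20 ratio
train_60, test_40 = split_dataset(rows, 0.6)  # 60:40 ratio
```

Lowering `train_fraction` leaves fewer rows for training, which is the change being varied between ratios in the experiments.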
Based on the results above, Random Forest has the highest accuracy, 100.00%, with a training-to-testing ratio of 80:20. Across all experiments and ratios, Random Forest maintained its performance, decreasing only slightly when the ratio was changed. Naive Bayes had the lowest accuracy, 23.81%, at the 80:20 ratio, but in Experiment 2 its accuracy rose sharply to 100%. This is because in Experiment 2 we discretized the dataset in order to improve the accuracy of the Naive Bayes model. Decision Tree also performed well across all ratios and experiments, with a slight decrease in Experiment 2 at the 60:40 ratio.
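The discretization step mentioned above can be illustrated with a small sketch. The report does not state which discretization method or number of bins was used in RapidMiner, so equal-width binning, the function name `discretize_equal_width`, and the sample values below are all assumptions chosen only to show the idea of turning a numeric attribute into a categorical one.

```python
def discretize_equal_width(values, n_bins):
    """Map each numeric value to a bin index in 0..n_bins-1 using equal-width bins."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical numeric attribute (placeholder values, not from the study's dataset).
scores = [0, 10, 25, 50, 75, 90, 100]
bins = discretize_equal_width(scores, 4)
```

After this step each value is represented by a bin index, so a classifier sees a small set of categories instead of a continuous range.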
In RapidMiner, the accuracy of all three models decreases as the training ratio decreases. Random Forest showed the most stable accuracy across the three experiments. Thus, we conclude that Random Forest is the best model for predicting students' grades using RapidMiner. The higher the accuracy of a classifier, the lower its classification error.
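The inverse relationship between the two metrics follows from their definitions: classification error is the fraction of misclassified examples, that is, one minus accuracy. The sketch below shows this in plain Python; the function names and the label lists are hypothetical and are not taken from the experiments.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def classification_error(actual, predicted):
    """Fraction of misclassified examples, i.e. 1 - accuracy."""
    return 1.0 - accuracy(actual, predicted)


# Hypothetical grade labels, used only to demonstrate the calculation.
actual = ["A", "B", "C", "A", "B"]
predicted = ["A", "B", "C", "B", "B"]

acc = accuracy(actual, predicted)              # 4 of 5 correct
err = classification_error(actual, predicted)  # the remaining 1 of 5
```

Because the two values always sum to one, reporting both gives the same ranking of models; the error is simply the complementary view of accuracy.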