Bulut, O., Wongvorachan, T., He, S., & Lee, S. (2024). Enhancing high-school dropout identification: A collaborative approach integrating human and machine insights. Discover Education, 3(109).
This study highlights the advantages of human–machine collaboration for improving the efficiency and accuracy of decision-making processes in educational settings.
We used machine learning models and explainable artificial intelligence techniques to identify factors that influence the prediction of high school dropout, such as students' prior GPA, sense of school belonging, perception of science utility, and interest in mathematics.
Wongvorachan, T., Srisuttiyakorn, S., & Sriklaub, K. (2024). Optimizing learning: Predicting research competency via statistical proficiency. Trends in Higher Education, 3(3), 540-559.
This study used a supervised machine learning approach to predict students’ research competency, represented by their performance in a research methods class, with their proficiency in statistical topics as predictors.
Results indicate that the three primary categories of statistical skills—namely, the understanding of statistical concepts, proficiency in selecting appropriate statistical methods, and statistics interpretation skills—can be used to predict students’ research competency.
Wongvorachan, T., Bulut, O., Liu, J. X., & Mazzullo, E. (2024). A comparison of bias mitigation techniques for educational classification tasks using supervised machine learning. Information, 15(6), 326.
This research evaluates the effectiveness of four bias mitigation techniques on an educational dataset used to predict student dropout.
The ROC pivot technique reduced predictive bias to the acceptable threshold while maintaining the original performance of the classifier, thereby emerging as the optimal method for the High School Longitudinal Study of 2009 dataset.
Poth, C. N., Wongvorachan, T., Bulut, O., & Otto, S. J. G. (2024). Adaptive Case Study-Mixed Methods Design Practices for Researchers Studying Complex Phenomena. Journal of Mixed Methods Research, 18(3), 292-303.
The methodological purpose of this article is to generate practical guidance for researchers studying complex phenomena through an adaptive case study-mixed methods (CS-MM) design.
The combination of text mining, descriptive statistics, document review, and thematic analysis under the CS-MM design allows us to capture the characteristics of public health communication during waves 1 to 4 of the COVID-19 pandemic, a complex and rapidly changing phenomenon.
Wongvorachan, T., He, S., & Bulut, O. (2023). A comparison of undersampling, oversampling, and SMOTE methods for dealing with imbalanced classification in educational data mining. Information, 14(1), 54.
This study compared several resampling techniques for handling different degrees of class imbalance (i.e., moderately or extremely imbalanced classifications) using the High School Longitudinal Study of 2009 dataset.
For our comparison, we used random oversampling (ROS), random undersampling (RUS), and a hybrid resampling technique that combines the synthetic minority oversampling technique for nominal and continuous features (SMOTE-NC) with RUS.
Our results show that random oversampling for moderately imbalanced data and hybrid resampling for extremely imbalanced data seem to work best.
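To illustrate the two simplest techniques named above, the following is a minimal sketch of ROS and RUS in plain Python. It is not the code used in the study: the function names, the toy data, and the choice to balance every class to the size of the largest (ROS) or smallest (RUS) class are assumptions made for illustration, and the SMOTE-NC step of the hybrid method is omitted.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - n):
            j = rng.choice(idx)  # sample with replacement
            X_out.append(X[j])
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class so every class has as many
    rows as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        keep.extend(rng.sample(idx, target))  # sample without replacement
    keep.sort()  # preserve the original row order
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (made up): 8 rows of class 0 and 2 rows of class 1.
X_toy = [[i] for i in range(10)]
y_toy = [0] * 8 + [1] * 2

X_ros, y_ros = random_oversample(X_toy, y_toy)
X_rus, y_rus = random_undersample(X_toy, y_toy)
print(Counter(y_ros))  # both classes now have 8 rows
print(Counter(y_rus))  # both classes now have 2 rows
```

In practice, resampling of this kind is usually applied to the training split only, so that the evaluation data keeps the original class distribution.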