EDM 2025 Half-Day Workshop
In recent years, Educational Data Mining (EDM) has advanced significantly in applying causal inference to improve learning technologies such as intelligent tutoring systems, educational games, and MOOCs. While traditional EDM has focused on predicting learner behaviors and outcomes, there's increasing recognition of the importance of understanding causality—identifying not just what happens, but why it happens.
Causal inference allows researchers to estimate the effects of interventions, uncover mechanisms behind learning outcomes, and design systems that more effectively support student learning. This is vital for both learning science and educational policy, where data-driven decisions rely on understanding underlying causes rather than surface patterns.
The field has seen growth in experimentation on digital platforms, with systems that support live, in-context testing. Methodological innovations have emerged, including ways to model nested data, identify effect heterogeneity, and combine quantitative and EDM techniques to uncover causal relationships. Researchers are also expanding causal work to include quasi- and non-experimental data sources.
This year’s workshop will reflect on these developments, showcase current projects, and open discussion on future challenges—fostering collaboration and generating new ideas for advancing causal inference within EDM.
FULL PAPERS
Improving Automatically Generated Fill-in-the-Blank Answer Selection with an LLM-Based Agreement Filter
Benny G. Johnson, Jeffrey S. Dittel and Rachel Van Campenhout
Stay the Course: Causal Insights from Self-Regulated Learning Patterns in Online Professional Learning
Sophia Soomin Lee, Jacob Dirghalli and Walter L. Leite
Enhancing Causal Inference in Educational Data Mining: Integrating Observational and Experimental Data for Greater Precision
Yanping Pei, Adam Sales and Johann Gagnon-Bartsch
LIGHTNING TALKS
Automating the Generation of Common Wrong Answer Feedback Using LLMs
Eamon Worden, Luca Dang, Ashish Gurung, Aaron Haim, Adam Sales and Neil Heffernan
Exploration of methods to analyze a human-AI tutoring intervention
Neha Gupta, Gonzalo Mena, Ashish Gurung, Danielle Thomas, Amelia Haviland and Lee Branstetter
Kirk Vanacore is an Assistant Research Professor of Information Sciences at Cornell University and the Research Director of the National Tutoring Observatory. He integrates statistics, machine learning, and artificial intelligence to uncover causal learning mechanisms in environments that combine human and AI instruction. His recent projects incorporate behavior detection model outputs into principal stratification frameworks to explore how specific student behaviors moderate learning effects. He also investigates the impact of gamification, generative AI feedback, and tutoring interventions on student persistence and self-regulated learning.
Anthony is an Assistant Professor of Educational Technology and computer science education in the College of Education at the University of Florida. His research seeks to impact learning through the blending of learning theory and quantitative methods. Anthony's primary lines of research include the study of student cognition, behavior, and affect, identifying effective learning interventions through causal inference, and developing human-in-the-loop systems and tools to support teachers.
Avery is an Assistant Professor of Emerging Technologies and Learning in the College of Education at the University of Florida. Her research aims to leverage cognitive theory to advance learning technologies and open materials for instructional practice. She specializes in experimental design in the context of learning technologies and explores best practices for methodologies related to this area of research.
Adam is an Assistant Professor of Mathematical Sciences and an affiliate of the Learning Sciences and Technologies and Data Science programs at WPI. His research in applied statistics focuses on methods for causal inference using large administrative datasets, primarily with applications in the learning sciences and social sciences. He has developed and worked on methods that combine machine learning with design-based analysis of randomized trials and matched observational studies, principal stratification and mediation analysis using log data from intelligent tutoring systems, and regression discontinuity designs.
Neil is the William Smith Dean's Professor of Computer Science at WPI, the creator of ASSISTments, and an active researcher in the fields of 1) artificial intelligence and education, 2) educational data mining, and 3) learning analytics. To support research in these fields, Dr. Heffernan created the E-TRIALS Testbed, a tool that allows ASSISTments to be used as a platform to do science and support evidence-based practice. He has dozens of papers in educational data mining and 20+ papers comparing different ways to optimize student learning.
Acknowledgment
The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305N230040 to the University of Florida. The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Department of Education.