Research

My research is primarily concerned with applying machine learning and probabilistic modeling techniques to questions in education and cognitive science. My past research has included work with the Stanford Natural Language Processing Group, and I'm happy to chat with students about their interests in NLP. I typically recruit students during the winter term to work with me in the summer, and at the end of some terms, I send out an email recruiting students to help with research in the following term. Carleton students have been involved in each of the projects below, often contributing to activities like running and analyzing simulations, programming new experiments, and developing variations of machine learning algorithms to explore their properties.

Educational assessment using machine learning

Every year, a significant amount of school time for elementary, middle, and high school students is spent on assessment - typically, taking standardized tests. Ideally, this time should be useful to students and teachers: if assessment results provided teachers with timely information about where their students were struggling, then teachers could target those areas during classroom instruction. But too often, assessment results fail to meet those goals. Results are often not very specific about where students are struggling, with each assessment giving only a little information about students' knowledge on a wide variety of topics. Further, scoring of assessments is often constrained to use only information about whether a student is correct or incorrect, ignoring factors like which incorrect answer a student chose and limiting the types of questions that can be asked. Machine learning (ML) approaches have been proposed to address some of these challenges. ML models can capture patterns across student responses, potentially providing finer-grained information about students' knowledge than traditional scoring approaches. ML models also have the potential to predict student knowledge based on the strategies students use to solve problems, opening up the possibility of more open-ended and authentic assessments.

In this project, we've focused on building machine learning models that are more flexible than traditional assessment models but still have the statistical properties needed for assessment, such as being able to report how much certainty they have in their predictions. I'm working with students and collaborators to investigate open questions about how to build and evaluate such models.
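
To make the idea of uncertainty-aware assessment concrete, here is a minimal illustrative sketch (not the LENS model below) of estimating a student's ability under a standard one-parameter IRT (Rasch) model, reporting a posterior standard deviation alongside the point estimate. The item difficulties and responses are made-up illustration data:

```python
import math

def posterior_ability(responses, difficulties, grid_step=0.01):
    """Grid-approximate the posterior over a student's ability under a
    simple Rasch (1PL) model with a standard-normal prior.

    responses:    list of 0/1 scores, one per item
    difficulties: list of item difficulty parameters (same length)
    Returns (posterior mean, posterior standard deviation)."""
    grid = [-4 + i * grid_step for i in range(int(8 / grid_step) + 1)]
    weights = []
    for theta in grid:
        log_w = -0.5 * theta * theta  # standard-normal prior (unnormalized)
        for x, b in zip(responses, difficulties):
            p = 1.0 / (1.0 + math.exp(-(theta - b)))  # P(correct | theta)
            log_w += math.log(p if x == 1 else 1.0 - p)
        weights.append(math.exp(log_w))
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(t * p for t, p in zip(grid, probs))
    var = sum((t - mean) ** 2 * p for t, p in zip(grid, probs))
    return mean, math.sqrt(var)

# A student who answers the easier items correctly but misses the harder ones:
mean, sd = posterior_ability([1, 1, 0, 0], [-1.0, -0.5, 1.0, 1.5])
```

The posterior standard deviation is what lets a model of this kind report how certain it is; with only four responses, it stays fairly wide.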

For more information about this work, take a look at the following paper:

S. Thomas Christie, Hayden Johnson, Carson Cook, Garron Gianopulos, and Anna N. Rafferty. 2023. LENS: Predictive Diagnostics for Flexible and Efficient Assessments. In Proceedings of the Tenth ACM Conference on Learning @ Scale (L@S '23). Association for Computing Machinery, New York, NY, USA, 14–24. https://doi.org/10.1145/3573051.3593392 [PDF] 

Diagnosing understanding from observed actions

Teachers can easily make inferences about what students know and how they misunderstand by observing how students solve problems and complete activities. In this project, I'm interested in how a computer can draw the same inferences automatically. I'm specifically interested in interpreting actions in complex digital environments like games and virtual labs, linking those actions to misunderstandings that a student may have. The algorithm I've been developing uses a variation of inverse reinforcement learning to infer people's beliefs or understanding from the actions they choose.
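
As a simplified illustration of the underlying inference (not the inverse reinforcement learning algorithm from the papers below), the sketch here treats each candidate (mis)understanding as a model that assigns probabilities to observed solution steps and then does a Bayesian update over the candidates. All hypothesis names, step names, and probabilities are invented for illustration:

```python
# Hypothetical action models: each candidate (mis)understanding assigns a
# probability to each observed solution step. All values are illustrative.
ACTION_MODELS = {
    "correct_understanding": {"subtract_3_both_sides": 0.8,
                              "subtract_3_left_only": 0.1,
                              "divide_by_2": 0.1},
    "one_side_only_error":   {"subtract_3_both_sides": 0.1,
                              "subtract_3_left_only": 0.8,
                              "divide_by_2": 0.1},
}

def posterior_over_understanding(observed_steps, models, prior=None):
    """Bayes' rule: P(understanding | steps) is proportional to
    P(understanding) * product of P(step | understanding)."""
    prior = prior or {h: 1.0 / len(models) for h in models}
    scores = {}
    for h, step_probs in models.items():
        likelihood = 1.0
        for step in observed_steps:
            likelihood *= step_probs.get(step, 1e-6)  # small floor for unmodeled steps
        scores[h] = prior[h] * likelihood
    total = sum(scores.values())
    return {h: s / total for h, s in scores.items()}

# Repeatedly subtracting from only one side points strongly at the error model:
post = posterior_over_understanding(
    ["subtract_3_left_only", "subtract_3_left_only"], ACTION_MODELS)
```

Inverse reinforcement learning replaces these hand-specified step probabilities with action likelihoods derived from (approximately) rational planning under each candidate belief, but the Bayesian update has the same shape.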

The current focus of this project is on algebraic equation solving: how can we diagnose specific algebra misunderstandings based on a student's process of solving an equation? I developed Emmy's Workshop, a website that collects step-by-step problem solving data from which we can diagnose understanding automatically and deliver customized feedback to students. Make an account at Emmy's Workshop to try it out! Future research will extend this diagnostic approach and the feedback it enables.

For more information about our inverse reinforcement learning work or to read about the algebra project, take a look at the following papers:

Rafferty, A. N., Jansen, R. A., & Griffiths, T. L. (2016) Using Inverse Planning for Personalized Feedback. Proceedings of the 9th International Conference on Educational Data Mining (pp. 472-477). [PDF]

Rafferty, A. N. and Griffiths, T. L. (2015). Interpreting freeform equation solving. Proceedings of the 17th International Conference on Artificial Intelligence in Education. [PDF]

Rafferty, A. N., LaMar, M. M., & Griffiths, T. L. (2015). Inferring learners' knowledge from their actions. Cognitive Science, 39, 584-618. [PDF]

Emmy's Workshop uses machine learning to examine how learners solve equations.

Optimizing games and activities for assessment

Games are motivating and engaging, providing exciting educational opportunities and new vehicles for behavioral experiments. However, designing games that are effective for assessment is frequently time-consuming, requiring considerable trial and error. How can we automatically choose a design for a game so that players' actions will be most diagnostic of their beliefs? The approach we take is to search through a space of possible game designs to find versions that will give the most information about a person's beliefs. The resulting game might be used as an educational assessment or as a tool for cognitive science research; in either case, the optimized design will result in more efficient diagnosis. Recently, we've also been applying this approach to designing assessments of equation solving that can more quickly recognize what someone misunderstands. To learn more, check out the following:

Rafferty, A. N., Zaharia, M., & Griffiths, T. L. (2014). Optimally designing games for behavioural research. Proceedings of the Royal Society Series A, 470. [PDF]
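
One common way to score a candidate design in this kind of search, sketched below as a simplified illustration (not the method from the paper), is the expected information gain: how much observing one action is expected to reduce our uncertainty about the player's beliefs. The belief and action probabilities here are invented:

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def expected_information_gain(prior, likelihood):
    """Expected entropy reduction about a player's belief from one action.

    prior:      prior[b] = P(belief b)
    likelihood: likelihood[b][a] = P(action a | belief b) under this design."""
    n_actions = len(likelihood[0])
    eig = entropy(prior)
    for a in range(n_actions):
        p_a = sum(prior[b] * likelihood[b][a] for b in range(len(prior)))
        if p_a == 0:
            continue
        posterior = [prior[b] * likelihood[b][a] / p_a for b in range(len(prior))]
        eig -= p_a * entropy(posterior)
    return eig

# Two candidate designs for a two-belief player model (made-up numbers):
prior = [0.5, 0.5]
diagnostic = [[0.9, 0.1], [0.1, 0.9]]     # different beliefs, different actions
uninformative = [[0.5, 0.5], [0.5, 0.5]]  # actions reveal nothing
eig_diag = expected_information_gain(prior, diagnostic)
eig_unin = expected_information_gain(prior, uninformative)
```

A design search then prefers the version with the higher expected information gain; here the diagnostic design scores well above the uninformative one.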

Formative guidance and interactive modules to promote students' science understanding

One of the benefits of building educational technologies is the potential to deliver customized guidance to students. I've worked on several projects related to building online science learning environments that provide students with hints and other support to help them learn chemistry. These projects have combined ideas from educational design, HCI, and AI. One part of this work was conducted with the WISE team at Berkeley, and another part was conducted with researchers at WestEd. To learn more, check out the following:

McCormick, S., Davenport, J. L., Rafferty, A. N., Raysor, S., Yani, J., & Yaron, D. (2023). ChemVLab+: Integrating Next Generation Science Standards Practices with Chemistry. Journal of Chemical Education, 100(6), 2116–2131. doi:10.1021/acs.jchemed.2c01106

Gerard, L. F., Ryoo, K., McElhaney, K. W., Liu, O. L., Rafferty, A. N., & Linn, M. C. (2016). Automated Guidance for Student Inquiry. Journal of Educational Psychology, 108(1), 60-81. doi:10.1037/edu0000052

Linn, M. C., Gerard, L. F., Ryoo, K., Liu, L., & Rafferty, A. N. (2014). Computer-guided inquiry to improve science learning. Science, 344: 155-156. [PDF]

Rafferty, A. N., Gerard, L., McElhaney, K., & Linn, M. C. (2014). Promoting Student Learning Through Automated Formative Guidance on Chemistry Drawings. Proceedings of the International Conference of the Learning Sciences (ICLS) 2014 (pp. 386-393). [PDF]

Automatically improving and personalizing online educational resources

There's an increasing number of online educational resources, but once they're created, they're often static - there's no continuing process to try to improve them. In this project, I'm interested in how we can use bandit algorithms, which are often used for targeting ads, to balance exploring different versions of educational content with exploiting what we learn, so that learners are more frequently shown the versions that are more effective. This research has involved both applied projects, in which we test whether our algorithms are effective and whether our interface can bring together educational content designers and researchers, and more theoretical work exploring the properties of bandit algorithms. This work is a joint project with researchers at the University of Toronto and other institutions. To learn more, check out the following:

Li, Z., Yee, L., Sauerberg, N., Sakson, I., Williams, J. J., & Rafferty, A. N. (2020). Getting too personal(ized): The importance of feature choice in online adaptive algorithms. Proceedings of the 13th International Conference on Educational Data Mining (pp. 159-170). [PDF] [Link to code repository]

Rafferty, A. N., Ying, H., & Williams, J. J. (2019). Statistical consequences of using multi-armed bandits to conduct adaptive educational experiments. Journal of Educational Data Mining, 11(1): 47-79. [PDF]

Williams, J. J., Kim, J., Rafferty, A. N., Maldonado, S., Gajos, K. Z., Lasecki, W. S., & Heffernan, N. (2016). AXIS: Generating Explanations at Scale with Learnersourcing and Machine Learning. Proceedings of the Third (2016) ACM Conference on Learning @ Scale. [PDF]
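
The explore/exploit balance described above can be sketched with Thompson sampling, one standard bandit algorithm (shown here as a generic illustration, not the specific algorithms studied in these papers). Each "arm" is a version of some educational content, and the hypothetical success rates are invented:

```python
import random

def thompson_step(successes, failures, rng):
    """Thompson sampling with Beta(1,1) priors: sample a plausible success
    rate for each content version and show the version with the highest draw."""
    samples = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)

def simulate(true_rates, n_learners, seed=0):
    """Run the bandit against hypothetical true helpfulness rates and
    return how often each version was shown."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    shown = [0] * len(true_rates)
    for _ in range(n_learners):
        arm = thompson_step(successes, failures, rng)
        shown[arm] += 1
        if rng.random() < true_rates[arm]:  # simulated learner outcome
            successes[arm] += 1
        else:
            failures[arm] += 1
    return shown

# Version 1 helps 60% of learners, version 0 only 30% (made-up numbers):
counts = simulate([0.3, 0.6], n_learners=500)
```

Over time the algorithm shifts most learners toward the more effective version while still occasionally exploring the other; that adaptive assignment is exactly what complicates the statistical analyses studied in the 2019 paper above.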


Example explanation and rating screen

Example explanation written by a student, and a rating interface for another student to indicate its helpfulness.