I am a Systems Scientist within the Human-Computer Interaction Institute at Carnegie Mellon University, where I serve as the Research Lead for PLUS—Personalized Learning Squared, a hybrid human-AI tutoring project led by Prof. Ken Koedinger at CMU in collaboration with Carnegie Learning, Inc. and Stanford University.
I also serve as the Director of Research to Practice at The National Tutoring Observatory, a research infrastructure led by Prof. Rene Kizilcec at Cornell University. The Observatory aims to enable researchers to unlock the secrets behind effective teaching by analyzing "a million tutor moves" of data.
My research centers on improving student learning outcomes through: 1) uncovering new insights into teaching and learning, 2) developing effective hybrid human-AI tutoring systems, and 3) leveraging AI to affordably scale learning interventions and enhance tutor and teacher training. Check out my CV for more on my experience, education, and publications.
As a former middle school teacher, principal, and teacher educator, my first-hand experiences fuel the mission to greatly improve teaching and learning by focusing on the students who need it the most—the chronic disengagers, the historical strugglers, and the kids that say, "I hate math."
Recent News
August 2025: Paper accepted to AIME-CON proposing multidimensional "ground truth".
July 2025: Appointed Director of Research to Practice at The National Tutoring Observatory.
July 2025: Presented at AIED25 in Palermo on automated assessment using LearnLM.
June 2025: Three papers accepted to ECTEL: Paper 1, Paper 2, and Paper 3.
May 2025: Featured on NBC News highlighting PLUS in schools. Now streaming on Snapchat and YouTube.
April 2025: Organized the Learning at Scale Workshop, taking place July 21 in Palermo, Italy.
March 2025: Interviewed by The Learning Agency: Five Questions with the Developers of PLUS.
March 2025: Presented at LAK25 in Dublin, Ireland: Paper 1 and Paper 2.
March 2025: Paper at the iRAISE Workshop at AAAI25 in Philadelphia.
March 2025: Guest lecturer in Dr. Erin Gatz's Learning About Learning course at Carnegie Mellon.
January 2025: Received research funding from Google DeepMind.
December 2024: Appointed Interim Research Director of The National Tutoring Observatory.
November 2024: Invited to Google's Learning in the AI Era convening in Mountain View, CA.
October 2024: Guest lecturer in Prof. Ken Holstein's Augmenting Intelligence course at Carnegie Mellon.
July 2024: Panelist on Generative AI and schools at AIED 2024 in Recife, Brazil.
July 2024: Presented at AIED 2024 on tutoring and students with disabilities.
Humans are biased and inconsistent, yet we keep trusting them to define “ground truth.” This paper questions the overreliance on inter-rater reliability in educational AI and proposes a multidimensional approach leveraging expert-based approaches and close-the-loop validity to build annotations that reflect impact, not just agreement. It's time we do better. [link]
Danielle R. Thomas, Conrad Borchers, & Kenneth R. Koedinger.
In Artificial Intelligence in Measurement and Education, Oct. 27-29, 2025. Pittsburgh, PA.
Feedback boosts learning—but can AI do the same? In over 2,600 lessons with delayed corrective feedback for all, learners who engaged with LLM-generated feedback saw modest gains without spending extra time, showing LLMs can support learning. [link]
Danielle R. Thomas, Conrad Borchers, Shambhavi Bhushan, Erin Gatz, Shivang Gupta, & Kenneth R. Koedinger.
In The 20th European Conference on Technology Enhanced Learning, Sept. 15-19, 2025. Durham and Newcastle, UK.
We fine-tuned GPT-4o to detect LLM-generated short answers in online lessons, outperforming GPTZero and a stylometric baseline. Learners flagged for misusing LLMs performed significantly better on posttests, suggesting AI-generated responses may assist learners in bypassing meaningful learning. [link]
Shambhavi Bhushan, Danielle R. Thomas, Conrad Borchers, Isha Raguvanshi, Ralph Abboud, Erin Gatz, Shivang Gupta, & Kenneth R. Koedinger.
In The 20th European Conference on Technology Enhanced Learning, Sept. 15-19, 2025. Durham and Newcastle, UK.
We evaluate the feasibility of using LLMs to assess tutor behaviors in real-world math tutoring transcripts. Multiple models reliably detected and evaluated key tutor moves—effective praise and error response—with high alignment to human judgments, suggesting promise for scalable, low-cost tutor assessment. [link]
Danielle R. Thomas, Conrad Borchers, Jionghao Lin, Sanjit Kakarla, Shambhavi Bhushan, Ralph Abboud, Erin Gatz, Shivang Gupta, & Kenneth R. Koedinger.
In The 20th European Conference on Technology Enhanced Learning, Sept. 15-19, 2025. Durham and Newcastle, UK.
LearnLM, a fine-tuned model trained on pedagogical data, outperforms general models like GPT-4o at grading tutor responses in realistic tutoring scenarios. Given the notorious ambiguity of using human scores as "ground truth," we introduce a clever predictive validity method for establishing truth and rethinking assessment, no red pens required! [link]
Danielle R. Thomas, Conrad Borchers, Sanjit Kakarla, Shambhavi Bhushan, Alex Houk, Shivang Gupta, Erin Gatz, & Kenneth R. Koedinger.
In The 26th International Conference on Artificial Intelligence in Education, July 21-25, 2025. Palermo, Italy (2025).
MCQs often get a bad rap, but is it justified? Here we find no differences in learning among those engaging with MCQs, open responses, or both--but MCQs are faster to complete. Despite using GenAI to automatically grade open responses, we don't plan on getting rid of MCQs just yet! Dataset and AI prompts included to try it yourself. Let's support open science! [link]
Danielle R. Thomas, Conrad Borchers, Sanjit Kakarla, Jionghao Lin, Shambhavi Bhushan, Boyuan Guo, Erin Gatz, & Kenneth R. Koedinger.
In The 15th International Learning Analytics and Knowledge Conference (LAK), March 3-8, 2025, Dublin, Ireland (2025)
Human tutors need training on how to support students in math, but what about helping tutors attend to equity? Here we use GenAI to assess tutors' equity skills, showing pre- to post-test gains. Dataset and support materials included. [link]
Danielle R. Thomas, Conrad Borchers, Sanjit Kakarla, Jionghao Lin, Shambhavi Bhushan, Boyuan Guo, Erin Gatz, & Kenneth R. Koedinger
In The 15th International Learning Analytics and Knowledge Conference, March 3-8, 2025, Dublin, Ireland (2025)
Prof. Koedinger and I received funding from Google DeepMind to use generative AI to assess tutors in training and tutoring
I attended Google's Learning in the AI Era event embracing the importance of curiosity, collaboration, and critical thinking in the AI age
As a panelist at AIED24 in Recife, Brazil, I discussed school perspectives and the future use of AI in classrooms
We conduct a two-study quasi-experiment to determine the impact of hybrid human-AI tutoring among students with and without disabilities in general education classrooms. We find positive effects among students. In particular, students with disabilities may benefit more from the motivational benefits of human tutor interaction. [link]
Danielle R. Thomas, Erin Gatz, Shivang Gupta, Vincent Aleven, & Kenneth R. Koedinger
In The 25th Artificial Intelligence in Education (AIED) Conference, July 7-13, 2024, Recife, Brazil (2024)
How do you respond to students saying, "I am dumb" or "I can't do this"? We (and generative AI!) assess the performance of 60 tutors within an online lesson on responding to students engaging in negative self-talk. We find evidence of tutor learning, with GPT-4 demonstrating high absolute performance. This LLM assessment system can easily scale from 60 to 600 tutors, making it a game-changer for evaluating human tutors at scale. [link]
Danielle R. Thomas, Jionghao Lin, Shambhavi Bhushan, Ralph Abboud, Erin Gatz, Shivang Gupta, & Kenneth R. Koedinger
In The 11th ACM Conference on Learning @ Scale (L@S), July 18-20, 2024, Atlanta, Georgia (2024)
This work overviews the progress of the PLUS project towards using generative AI for tutoring feedback and assessment. While using generative AI shows promise as a low-cost and efficient method for these uses, ethical considerations and practical implications are discussed to ensure fair and responsible use. [preprint] [presentation]
Danielle R. Thomas, Erin Gatz, Shivang Gupta, Jionghao Lin, Cindy Tipper, & Kenneth R. Koedinger
In The 17th Annual Learning Ideas Conference, June 12-14, 2024, New York, NY (2024)
We introduce hybrid human-AI tutoring and implement the model across three diverse schools. We find positive impacts on learning outcomes with evidence suggesting lower achieving students may benefit more from tutoring than higher achieving students—a promising finding. [link]
Danielle R. Thomas, Jionghao Lin, Erin Gatz, Ashish Gurung, Shivang Gupta, Kole Norberg, Stephen E. Fancsali, Vincent Aleven, Lee Branstetter, Emma Brunskill, Kenneth R. Koedinger
In The 14th Learning Analytics and Knowledge (LAK) Conference, March 18-22, 2024, Kyoto, Japan (2024)
In this systematic review, we determine the average STEM student outperforms ~70% of their peers. Most notably, underrepresented minority students benefit given one caveat—they must be given the opportunity. [Journal article link]
Danielle R. Thomas & Karen H. Larwin
International Journal of STEM Education (2023)
This workshop highlights the challenges and opportunities of AI-in-the-loop math tutoring and encourages discourse in the AIED community. Access papers and presentations here.
Vincent Aleven, Richard Baraniuk, Emma Brunskill, Scott Crossley, Dora Demszky, Stephen Fancsali, Shivang Gupta, Kenneth R. Koedinger, Chris Piech, Steve Ritter, Danielle R. Thomas, Simon Woodhead, Wanli Xing
In The 24th Artificial Intelligence in Education (AIED) Conference, July 3-7, 2023, Tokyo, Japan (2023)
We introduce Personalized Learning Squared (PLUS), a human-AI tutoring platform designed to improve tutoring efficiency. PLUS leverages student-facing AI-powered math software and a tutor-facing personalized dashboard to provide the right support, to the right student, and at the right time.
Danielle R. Thomas, Shivang Gupta, Erin Gatz, Cindy Tipper, Kenneth R. Koedinger
In The 16th Annual Learning Ideas Conference, New York, NY (2023)
We introduce a method of providing explanatory feedback to human tutors on their responses to open-ended questions leveraging LLMs using named entity recognition.
Jionghao Lin, Danielle R. Thomas, Feifei Han, Shivang Gupta, Wei Tan, Ngoc Dang Nguyen, Kenneth R. Koedinger
Workshop at 24th Artificial Intelligence in Education (AIED) Conference (2023): Towards the Future of AI-Augmented Human Tutoring in Math Learning
We compare the performance of humans and GPT-4 in identifying criteria of praise by tutors to students. GPT-4 performs moderately well in some areas but underperforms in recognizing sincerity and authenticity. This is not surprising, yet it paves the way for future work.
Dollaya Hirunyasiri, Danielle R. Thomas, Jionghao Lin, Kenneth R. Koedinger, Vincent Aleven
Workshop at 24th Artificial Intelligence in Education Conference (2023) Towards the Future of AI-Augmented Human Tutoring in Math Learning
We introduce an AI-based method of autograding online tutor lessons. We compare two methods of training set creation: using learnersourced tutor responses and prompting ChatGPT. Our findings show a constructive use of ChatGPT for pedagogical purposes that is not without limitations. [Video presentation]
Danielle R. Thomas, Shivang Gupta, Kenneth R. Koedinger
In The 24th Artificial Intelligence in Education Conference, July 3-7, 2023, Tokyo, Japan (2023)
We show tutors perform ~20% better from pretest to posttest on our short scenario-based lessons similar to situational judgment tests. How would you respond to a student who has just made a math error?
Danielle R. Thomas, Xinyu Yang, Shivang Gupta, Adetunji Adeniran, Elizabeth McLaughlin, Kenneth R. Koedinger
In The 13th International Learning Analytics & Knowledge Conference, Austin, TX (2023)
Comparing the achievement of 70 students participating in a hybrid tutoring program with that of a matched control group, we found the learning gain among participating students was nearly double that of students not participating.
Danielle R. Chine, Cassandra Brentley, Carmen Thomas-Browne, J. Elizabeth Richey, Abdulmenaf Gul,... Kenneth R. Koedinger
In The 23rd Artificial Intelligence in Education Conference, Durham, UK (2022)