Testing the Testing Effect

Taking the Testing Effect from the lab into the classroom:

a way to discover the benefits through field research with students.

The Testing Effect and Retrieval Practice

The testing effect is the finding that learners learn better when they practice by testing themselves rather than by rehearsing the material. This idea is backed by dozens of scientific studies, in the lab and in the classroom, using a variety of learning materials across settings, subjects and ages (e.g. 1). It has been translated into a strategy called Retrieval Practice, in which learners practice by effortfully recalling information rather than re-reading it.

Research in Classrooms

The experiment described here was designed for an online graduate class in education studies and was performed before starting a unit on research-informed effective practice. The first goal was to introduce the testing effect by letting the students experience the long-term consequences of learning the same material with different strategies. A second goal was to introduce behavioral experiments in the cognitive sciences, as a basis for further discussion of their benefits, limitations and possible implications for classroom practice.

The inspiration

The basic inspiration was Karpicke and Roediger's study of the testing effect (2), published in Science in 2008. In this classic study, participants learned words in Swahili and their English translations. Four different groups studied the 40 words in one session and came back for a memory test one week later. The learning session included alternating rounds of Study trials (word + translation) and Test trials (word presented, participants asked to type the translation). The four groups had different amounts of Study vs. Test trials. The results showed that more Test trials in the learning session led to much better memory performance after one week (see more here). Another finding was that when participants were asked to predict their memory performance one week later, they were not able to predict it accurately. Specifically, those who restudied overestimated their performance and those who re-tested underestimated their performance. This important issue was addressed in another study by the same authors from 2006, which explored the source of inaccurate performance estimation (3), and also in this presentation.

The Experiment

The study above was adjusted and modified to meet these goals. One important consideration was to allow all the participating students to experience both learning strategies. To do that, a within-subject design was chosen, in which each participant learns words with two different methods. This within-subject design may prove very useful in educational contexts, as explained here.

Fifteen students participated. Each of them learned 24 Swahili word-English translation pairs (selected from the original study). Twelve words were learned in a Study mode and twelve were learned in a Test mode throughout the learning phase. Both modes included typing the English translation and reviewing the correct translation immediately afterwards (see figure). The main difference was whether the translations were presented or participants were asked to recall them.


The colors in the figure correspond to the trial type (Study or Test), as depicted in the figure above.


The learning phase included 5 consecutive rounds, with all 24 words repeated in every round. The 12 Study words appeared in Study format in all rounds. The 12 Test words appeared in Study format in the first round and in Test format in rounds 2-5.
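The round structure described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the placeholder word labels, the `build_schedule` function and the random ordering within each round are hypothetical and are not taken from the actual experiment files.

```python
import random

N_ROUNDS = 5          # rounds in the learning phase
N_PER_CONDITION = 12  # words per condition (24 words in total)

def build_schedule(study_words, test_words, n_rounds=N_ROUNDS, seed=0):
    """Return a list of rounds; each round is a list of (word, trial_type)."""
    rng = random.Random(seed)
    schedule = []
    for round_no in range(1, n_rounds + 1):
        # Study words are always presented together with their translation.
        trials = [(w, "Study") for w in study_words]
        # Test words are studied in round 1 and recalled from round 2 onwards.
        test_type = "Study" if round_no == 1 else "Test"
        trials += [(w, test_type) for w in test_words]
        rng.shuffle(trials)  # assumption: order is randomized within a round
        schedule.append(trials)
    return schedule

# Placeholder labels, not the actual Swahili items.
study_words = [f"study_word_{i}" for i in range(1, N_PER_CONDITION + 1)]
test_words = [f"test_word_{i}" for i in range(1, N_PER_CONDITION + 1)]
schedule = build_schedule(study_words, test_words)

for round_no, trials in enumerate(schedule, start=1):
    n_test = sum(1 for _, kind in trials if kind == "Test")
    print(f"Round {round_no}: {len(trials)} trials, {n_test} Test trials")
```

With this layout every word is seen five times, and the only thing that differs between the two conditions is whether rounds 2-5 show the translation or ask for it.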

One week later, the participants returned for a test where they attempted to recall and type the translations for all words.

Results

What can be measured? At every point in the experiment where the participants were required to recall the translations (turquoise rectangles in the figure above), there is an opportunity to assess their memory performance. Hence, we can follow the average progress (for the Test words) from round to round in the learning phase (left) and also compare the average performance for Study vs. Test words after 1 week (right).


















N=15. Error bars represent the standard error of the mean. A paired t-test for Study vs. Test yielded p<0.001.
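For readers who want to see what goes into such a caption, here is a minimal sketch of the standard error of the mean and the paired t statistic, using only the Python standard library. The scores below are made up for five hypothetical participants and are not the data from this experiment; the helper names `sem` and `paired_t` are likewise illustrative. Turning the t statistic into a p-value needs the t distribution, which a statistics package can provide (for example `scipy.stats.ttest_rel`).

```python
import math
import statistics

def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    t = statistics.mean(diffs) / sem(diffs)
    return t, len(diffs) - 1

# Made-up proportions correct for five hypothetical participants
# (NOT the data from this experiment).
test_scores = [0.50, 0.75, 0.50, 0.75, 0.50]
study_scores = [0.25, 0.50, 0.50, 0.25, 0.25]

t_stat, df = paired_t(test_scores, study_scores)
print(f"Test:  mean {statistics.mean(test_scores):.2f}, SEM {sem(test_scores):.3f}")
print(f"Study: mean {statistics.mean(study_scores):.2f}, SEM {sem(study_scores):.3f}")
print(f"Paired t({df}) = {t_stat:.2f}")
```

The test is paired because each participant contributes both a Study score and a Test score, so the comparison is made on the per-participant differences rather than on two independent groups.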

We can see steady improvement across the learning phase, reaching an average of 80% of the words recalled in the last round.

After 1 week, it is clear that the participants remembered more Test words than Study words. (It is interesting to compare this with the study that inspired this experiment: click here.)


Do learners know how effective their practice was?

At the end of the learning phase, the participants were asked to predict their performance for both Test and Study words after 5 minutes and after 1 week. You can see the responses summarized in these figures:

  • Note that participants estimated on average 62% for Test words after 5 minutes, even though they had just scored an average of 80%.

  • For the long term, participants were quite accurate about their memory for the Study words (31% vs. 32%) but underestimated their memory for the Test words (40% vs. 56%).

This may suggest that learners cannot see the benefits of retrieval practice at the time of learning. This finding was also evident in the original studies (2,3).
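The size of these misjudgments can be made explicit with a little arithmetic. The sketch below uses only the group averages quoted above. It assumes that in each "A% vs. B%" pair the first number is the prediction and the second the actual score (the text states this direction explicitly only for the Test words), and it compares the 5-minute prediction with the last-round performance. The `calibration_gap` helper is a hypothetical name.

```python
# Group averages quoted in the text, in percent of words recalled.
# Assumption: in each "A% vs. B%" pair, A is the prediction and B the
# actual score (the direction is explicit only for the Test words).
results = {
    "Test words, 5 minutes (vs. last round)": {"predicted": 62, "actual": 80},
    "Study words, 1 week": {"predicted": 31, "actual": 32},
    "Test words, 1 week": {"predicted": 40, "actual": 56},
}

def calibration_gap(predicted, actual):
    """Positive = overestimate, negative = underestimate (percentage points)."""
    return predicted - actual

gaps = {label: calibration_gap(**r) for label, r in results.items()}
for label, gap in gaps.items():
    print(f"{label}: {gap:+d} percentage points")
```

Under these assumptions the predictions for Test words fall short by 16 to 18 percentage points, while the prediction for Study words is off by only 1 point.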



Conclusions - cognitive research in the classroom

Participating in this experiment allowed the students to experience the often surprising effects of testing for themselves. They could (and were asked to) reflect on their own experience and evaluate the contributing factors. In addition, they saw the data averaged across the entire group, and how the patterns closely replicated scientific findings. This connection between the individual and the collective experience supports the realization that some factors, like the testing effect, have an impact on learning that goes beyond individual differences and incidental factors. I believe this realization is powerful for both students and educators, and may be key to their willingness to adopt insights from the cognitive sciences.

Another benefit is that learners get a better first-hand idea of what a behavioral experiment is and how it is conducted. This helps them appreciate the value of other behavioral experiments on the one hand, and view them critically on the other.

Last, there are many ways to bring behavioral experiments into classrooms of all types. It can be just a short demo, a complete experiment as in this case, or even a full project in which the learners act as the experimenters and take part in the entire process, from designing and running the experiment to analyzing and presenting the results. The approach is chosen according to the teaching goals, and each has value in bridging cognitive science with teaching and learning in classrooms. In my experience, with both middle schoolers and higher-ed students, it is a good way to motivate a change in practices, as it combines the intellectual challenge with a personal and often emotional experience. I will always remember a group of middle school students, avid re-readers and highlighters who were skeptical at the beginning of a full-semester testing-effect research project, asking me enthusiastically upon getting the results of their own experiment: "If THEY know all this, why don't they tell US?"

Wish to try?

If you are interested in trying the same experiment in your own teaching context, I'm happy to share the files or discuss other possibilities (contact details). If you just want to try it for yourself, or let someone else (who doesn't know the results in advance) try, click here.

Another direction is to use platforms that provide ready-to-use cognitive experiments. Go Cognitive is a great one (it allows downloading the nonperformance data).

References

1. Agarwal, P. K., & Roediger III, H. L. (2018). Lessons for learning: How cognitive psychology informs classroom practice. Phi Delta Kappan, 100(4), 8-12. [PDF]

2. Karpicke, J. D., & Roediger III, H. L. (2008). The critical importance of retrieval for learning. Science, 319(5865), 966-968. [PDF]

3. Roediger III, H. L., & Karpicke, J. D. (2006). Test-enhanced learning: Taking memory tests improves long-term retention. Psychological Science, 17(3), 249-255. [PDF]

Published: December 2018