This page is based on my talk at ResearchED Home 2020 and is presented in a talk-like format.
Prediction is a basic cognitive function in our life and specifically in learning. When interacting with the environment, we, as well as other species, make predictions and act according to our expectations. Then, the outcome is either as expected and confirms our existing internal models, or it is unexpected and calls for an update. This step is crucial for learning and allows us to make better predictions in the future.
There is solid evidence that prediction plays a role in simpler types of learning, like (but not only) classical and operant conditioning. These types of learning are traditionally classified as non-declarative learning, as we don’t have to be conscious of the process of learning and we cannot 'declare' what we have learned. These ideas about prediction are extended today even to AI research and are quite a “hot topic” (1). However, we are going to ask a different (although not completely unrelated) question: what role does prediction play in declarative learning?
Declarative learning is the learning of facts and events, the type of learning that takes place in the classroom, when we construct knowledge explicitly and consciously.
We focus here on the process of prediction in declarative learning specifically. However, in the background we will ask what the differences and similarities are between the role of prediction in these apparently different systems.
Thinking about prediction in that context raises some interesting questions:
Is prediction a natural process that learners use intuitively?
Do some students predict more than others? Are these the more curious students? Or perhaps it is prediction that enhances curiosity?
Prediction and curiosity – chicken and egg?
And last -
Can we induce prediction and curiosity? Should we? Or perhaps we should “just tell them”?
Questions about prediction and curiosity are not new and have raised the curiosity of both scientists and educators, and yes, we do know that prediction-based activities can promote learning and that curiosity promotes exploration and learning. The goal here is not to cover the entire field of research, but to focus on an array of studies from cognitive neuroscience, across the levels of experimental investigation, and mostly findings from human behavior experiments that specifically target declarative learning. They paint an interesting and promising picture regarding the function of prediction in learning.
In what follows we discuss evidence and ideas regarding the following questions:
Why is prediction interesting, and how is it related to meaning making?
What do we know about the role of prediction in acquiring declarative knowledge?
What are the relations among prediction, curiosity and learning?
Why and how should we use prediction in the classroom?
I have a special interest in the following question: “what is meaning, and how can we define it for educational purposes?” I think meaning has a few qualities that make it worth this attention:
Meaning is essential for better and more substantial learning.
It is often difficult to achieve, and requires mental resources of attention, focus and processing.
Hence, we can say that meaning making is a desirable difficulty, a term coined by R. Bjork for effective practice methods like retrieval practice and distributed practice. However, it is important to note that at the core, it is the meaning that these strategies help us shape: we test which links survive the passage of time and change of context, and we FIX those links that didn’t. In a sense, meaning making is the queen of the desirable difficulties. Making meaning IS the thing that makes effective practice more difficult.
But we also struggle with meaning because it is elusive – we don’t really know what meaning is, we cannot see it - and I often hear people define meaning by what it is NOT: it’s not rote memorization, it is not isolated pieces of knowledge, and so on…
Hence it is useful to try to define meaning:
It is helpful to think about meaning using a concrete model of building a pyramid: the small, lighter triangular blocks represent pieces of knowledge that we already have. We can add a new (darker) piece of information in a very specific placement to build a pyramid. The pyramid is a structure of well-organized pieces of knowledge that serves a function by itself: it is a structure that we can use to build even bigger, more elaborate pyramids. Finally, the same concept may have different levels of meaning in different contexts (for example, the concept “equal” in math and in social justice).
We can further use this model to describe the entire learning process:
In order to construct a body of knowledge, we need to make sure that we have the relevant prior knowledge – the base level. Next, we should be able to focus on and process the new incoming piece of information, and very importantly, CONNECT them together in a very specific way, that creates a new functional structure: the meaning-making stage. With time and practice we consolidate this new structure and can use it easily for further building.
This model helps to highlight the prerequisites for learning and to think about the individual stages. It also helps us see where prediction comes into the picture:
As described above, meaning making is essential yet elusive, and the most “essential yet elusive” part is this one: how do we make sure the placement is accurate? After all, this happens within the “black box”, the learner's mind, where we have no access.
Prediction is using our prior knowledge in order to expect or predict the position of a newly introduced piece.
For example: how does a flower turn into a fruit? – the pieces are not necessarily new (flower and fruit), the connection is new, and it’s all about this connection!
To visualize this process: we retrieve the existing knowledge and then use it to predict a possible connection. Next, this prediction can be either confirmed (top), or we can be surprised by an unexpected explanation (bottom).
First, there is sweeping evidence that prior knowledge is absolutely essential for learning (e.g. 3).
Second, there is evidence for the existence of two pathways:
New knowledge that is consistent with existing schemas or mental models is more easily learned and better remembered (4). For example, it will be easy for you to learn that a tomato is part of my vegetable salad.
Novel information that is unexpected or violates existing models is also more easily learned and remembered. For example, we were all surprised at one point or another to learn that a tomato is actually a fruit, and you have probably remembered this fact ever since.
In both cases, events or items are remembered better than unrelated information, for example a tomato that is just another item on a random shopping list.
Interestingly, we also know that these two options are potentially processed via different complementary neural pathways, with one pathway responsible for matching to existing schemas, and another that integrates components in novel ways (this would be the hippocampus, which you may have heard of) (5).
In this tomato example, a “known” item may surprise us if it is connected to prior knowledge in an unexpected way. This is where prediction has value: the mismatch may promote learning. By exploring prediction, we are focusing on the most elusive part of making meaning. We are going to focus on this aspect of learning specifically: the role of prediction in updating memories, that is, creating new connections and reorganizing or updating our knowledge base.
It's noteworthy to consider the neurobiology of updating and reorganizing knowledge, because the story of prediction goes all the way from there: in order to update knowledge, existing connections in the brain are re-wired. This process at the biological level is called REconsolidation.
Consolidation is the initial biological process of forming connections after learning.
Reconsolidation is the process by which already consolidated memories become malleable upon retrieval and depend, again, on a biological process for rewiring.
This is a pretty exciting finding, right? It is probably one mechanism underlying the benefits of retrieval practice: we retrieve and reactivate, check relevance in the current situation, and this leads to updating by re-wiring.
At the same time, there is evidence that reconsolidation doesn’t happen every time a trace is activated.
In a fascinating review of findings from research at the level of molecules, neurons, and behaving animals, Lee (6) suggests that reconsolidation is a mechanism that mediates memory updating specifically, and as such, it is triggered by the following processes:
Activation – the old memories must be active prior to the update
And
Prediction error signal - when there is a mismatch between expected and actual events during reactivation.
Even more intriguing is that we now have some evidence from cognitive neuroscience, using human behavior experiments and focusing on declarative learning, that supports this direction.
What follows is a review of selected research evidence about updating memories in human declarative learning.
The research focuses on updating memories, hence, the experimental designs follow this general pattern: First, the participants learn something, then they have an opportunity to retrieve and update what they have learned, and finally, they are tested to find out if the memories were updated and under what experimental conditions. These sessions are commonly days apart, and hence we are definitely in the realm of long-term memory.
The first study (7) taps into memory activation: on day one, two groups learned a list of 20 real, concrete and visible items (laid out on a tray, represented by the grey rectangles). Two days later they learned another list of new items (red rectangles). On this occasion, the experimental group was reminded of the first learning experience (although not in detail) prior to learning the second list, while the other group was not. Finally, two additional days later, the two groups were asked to recall the items from the first list (grey items).
The researchers found no difference in the number of first-list items remembered (grey). Interestingly, however, the "reminder" group recalled more items from the second list (red) as if they were part of the first.
That is – their older memory trace of list 1 was (falsely) updated only when it was activated prior to the new learning.
This evidence highlights how activating what we already remember is required for updating our knowledge with novel information. The implication is that having the prior knowledge is not enough; it should also be reactivated at the right time.
We get a glimpse here into the mechanism itself, even when the stimuli are discrete and practically meaningless. Intriguingly, additional research (8) from the same group showed similar results in kids, and further research conducted by another group (9), using short meaningful video clips as stimuli, replicated the results and demonstrated the added value of surprise to the updating process: memories are updated when they are activated and when predictions are violated.
Another recent study (10) further explores the effect of surprise or, more specifically, prediction error. Here, participants studied “school compatible” material: a six-page text describing an unfamiliar historical event that took them 35-40 minutes to read and learn.
Two days later, they participated in a test consisting of 100 open-ended questions. They answered each question, then rated their confidence, and finally received feedback in the form of the correct answer. Nine days later they were tested again to find out if the feedback had led to updating their memories.
In order to test for the effect of prediction error, the researchers selected the incorrectly answered questions and classified them into "incorrect with high confidence ratings" to represent larger prediction error (top), and "incorrect with lower confidence ratings" to represent smaller prediction error (bottom).
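As a rough illustration of this classification step, the sketch below splits incorrectly answered questions into a larger and a smaller prediction-error group. The record fields, the numeric confidence scale, and the median-split rule are all assumptions made for illustration; the study's actual criteria are not described here.

```python
from statistics import median


def split_by_prediction_error(responses):
    """Split incorrect answers into (larger, smaller) prediction-error groups.

    Each response is a dict with hypothetical keys:
    'question', 'correct' (bool) and 'confidence' (number).
    The median split is an illustrative assumption, not the study's rule.
    """
    incorrect = [r for r in responses if not r["correct"]]
    if not incorrect:
        return [], []
    cutoff = median(r["confidence"] for r in incorrect)
    # Wrong answers given with high confidence stand for a larger prediction error.
    larger = [r for r in incorrect if r["confidence"] > cutoff]
    smaller = [r for r in incorrect if r["confidence"] <= cutoff]
    return larger, smaller


# Made-up example data, not results from the study.
responses = [
    {"question": "Q1", "correct": False, "confidence": 5},
    {"question": "Q2", "correct": False, "confidence": 2},
    {"question": "Q3", "correct": True, "confidence": 4},
    {"question": "Q4", "correct": False, "confidence": 1},
]
larger, smaller = split_by_prediction_error(responses)
```

Correct answers are dropped before the split, because only the incorrect ones can be updated by the feedback.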
Can you predict the results? Which condition led to more memory updating?
Perhaps you have guessed that a wrong prediction is actually preferable: items with a larger prediction error (incorrect with high confidence ratings) were more likely to be updated than those with a smaller prediction error.
Moreover, in a complementary functional imaging experiment, where participants underwent a similar procedure but the second session took place within the fMRI machine, the feedback (i.e., the correct answer) elicited an activation of the reward system, similar to non-declarative types of reinforcement learning. This indicates an involvement of the reward system in this type of learning.
To summarize, we see that existing declarative knowledge is updated with new information when:
a) The prior knowledge is activated before learning
And -
b) When the prior knowledge elicits a prediction error, or violation of expectation.
So along similar lines to the findings from animal models with simpler learning forms, when human declarative learning is involved: activation and prediction-error promote updating of existing knowledge with new information!
In the experiments we have discussed before, prediction was an implicit process, inherent to retrieving prior knowledge – we retrieve information and then compare it to “reality” to assess the gap between them.
This is an indication of why this mechanism may play a role in updating memories when using retrieval practice, for example. However, it is interesting to ask HOW prediction error promotes learning, and to introduce questions about the role of curiosity.
Does realizing the prediction error make people more curious, which triggers learning? Or perhaps innately curious individuals tend to make more predictions and benefit more from the new information? What comes first, curiosity or prediction?
Curiosity is an intriguing piece of this puzzle: it is known to be "the wick in the candle of learning", and indeed curiosity was found to be related to exploration and learning, and to the activation of the reward system.
It has been suggested that state curiosity is a motivational state that stimulates exploration and information seeking to reduce uncertainty, where the information itself may act as a reward (11).
Recently, several studies in cognitive neuroscience have investigated the mechanisms of curiosity (12), and I hope you are curious to find out one last bit about the role of curiosity in updating memories.
In this study (13), the researchers directly assessed the relation between making predictions and developing curiosity.
Participants were exposed to 90 numerical trivia facts with a missing piece of information. Then, they either predicted the missing number (top) or generated an example (bottom). Note that this is a tight control: the control group was still asked to think about the fact, but not to predict the target number. Then, they rated their curiosity and waited a few seconds in anticipation of the correct answer. The study also measured pupil dilation at the indicated time points as a physiological measure of curiosity.
They found that more facts in the prediction condition received high curiosity ratings than in the generate-example condition, and that pupil dilation was greater in this condition as well.
These results suggest that generating prediction stimulates curiosity.
[Interestingly, a recent study (14) using the same paradigm shows that prediction induces better (short-term) memory for the facts in children, but not in young adults, indicating that prediction has a special importance for younger minds.]
Findings of this kind, along with a series of studies in cognitive psychology and cognitive neuroscience, including physiological and neuroimaging findings, have led researchers to develop a compelling framework of the role of curiosity in learning, including the assumed role of the related brain regions (12).
They suggest that identifying a prediction error (a gap or a conflict with what is already known) leads to an appraisal process that estimates the subjective size of the gap. This can lead either to a state of curiosity or, if the gap is too big, to a state of anxiety, which exits the learning loop.
Curiosity then stimulates learning via the reward system, whose input enhances attention to, and the encoding and consolidation of, new information. This in turn may trigger another cycle.
This is a suggested model, based on some convincing data, but still a model. However it leads to very interesting predictions – and further research that is relevant to educators as well. One promising and practical question is whether by triggering prediction we can stimulate curiosity, which works through innate mechanisms of reward to enhance the encoding and consolidation of new information via the neural system that some of you know as the hippocampus.
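The appraisal step of this framework can be caricatured in a few lines of code. This is a toy sketch only: the 0-to-1 gap scale, the `too_big` threshold, and the function names are hypothetical, and the framework itself does not specify any numbers.

```python
def appraise(gap_size, too_big=0.8):
    """Return the learner state for a detected prediction error.

    gap_size: subjective size of the gap on a hypothetical 0-to-1 scale.
    too_big: hypothetical threshold above which the gap triggers anxiety.
    """
    return "anxiety" if gap_size > too_big else "curiosity"


def learning_cycle(gap_size):
    """One pass through the loop: appraisal, then enhanced encoding or exit."""
    state = appraise(gap_size)
    # Only curiosity keeps the learner in the loop; anxiety exits it.
    return {"state": state, "enhanced_encoding": state == "curiosity"}


moderate = learning_cycle(0.4)
overwhelming = learning_cycle(0.95)
```

In classroom terms, the threshold stands for the idea that a mismatch should be noticeable but still within reach of the learner's prior knowledge.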
This idea of inducing prediction is not alien to findings from real classrooms, and we conclude with one such study exploring prediction in the context of classroom demonstrations. This is a very interesting topic, because everybody likes demonstrations, but how can we be sure learners are actually learning, not just having fun “around learning”?
A study with college premedical students in an introductory physics course (15) makes just this point. The researchers compared the following four conditions of teaching with demonstrations:
No demonstration (baseline) – topic introduced with explanation only.
Observation - a demonstration was followed by an explanation.
Prediction - students record their prediction of the outcome, observe the demonstration, and then listen to an explanation.
Prediction & discussion - students record prediction, observe the demonstration, engage in a peer discussion, and finally listen to the explanation.
To measure learning - the researchers conducted a test, assessing the knowledge from the demonstrations specifically, asking for an answer regarding the outcome and asking for an explanation. Only perfectly correct answers were considered as correct.
It is clear that demonstration alone did not contribute much to explanation, but that investing two additional minutes in an individual prediction task increased both retention and explanation level. Whether or not it is worth investing extra time in peer discussion is also a matter of how much time you have.
Sometimes we may wonder if we should demonstrate and then explain to catch learners' attention, or rather explain first and then demonstrate to induce better understanding. Perhaps we can think of prediction as a guideline: explain whatever is needed to induce a prediction regarding the outcome, then demonstrate and explain, highlighting the expected mismatch.
In yet another study performed with classroom-compatible materials (17), we see that trying to come up with an answer before searching for information via Google is more effective than immediately turning to a Google search. The findings suggest (inconclusively) that the effect is more prominent for learners with more relevant prior knowledge. This finding is in line with the ideas around prediction presented above. And again, prediction may serve as a guideline for using methods like pre-testing: it works, or works better, when the pre-testing allows for meaningful prediction.
Another beautiful example that highlights this point comes from Michael Walsh (18) on the use of prediction in teaching English: how prediction helps students build a new structure of knowledge on the basis of their existing ones when studying a Shakespeare sonnet.
Let's come back to the pyramid model to summarize the conditions for learning, and the role of prediction as we explored it. We saw that prediction is especially relevant for cases where there is a mismatch between expectation and reality, and when memories should be updated. As we said in the beginning, prediction plays a role in several types of learning, and has been investigated especially in simpler types. However, we focused on declarative learning (facts and events), and hence on the possible role of prediction in making meaning.
So what are the elements of the process?
First, it is widely accepted that prior knowledge is required for learning.
However, as we saw, prior knowledge alone is insufficient: it must also be deliberately activated to have a chance of being updated.
The next step in a traditional model would be to introduce the new information – and especially a new way to connect the pieces together – to make meaning, to build a new pyramid. However, we have highlighted the possible benefit of breaking this essential, yet elusive, process into its elements and harnessing the benefit of prediction: by inducing all learners to predict the nature of the connection, we support a process in which they notice and appreciate a possible mismatch, which induces targeted curiosity that increases their attention to and the encoding of the new unexpected information, also enhancing the consolidation into long-term memory.
We can try to think about "inducing prediction" as "a targeted and structured discovery learning." Advocates of discovery learning emphasize the benefit of finding out for yourself. Opponents would say we cannot be sure what exactly students are thinking about, and since this is absolutely crucial (memory is the residue of thought, right?), teachers had better explain properly.
By asking everyone to predict, we can harness the good in both; call it targeted discovery: set the stage, point learners in the direction of a possible mismatch, and let the learners, all the learners, benefit from predicting, getting curious, and being rewarded with the new meaning.
Prediction is an intriguing topic because it seems to be a fundamental element of how we learn, and it is involved in basic as well as higher forms of learning, like those we use and develop in classrooms. This is one of the topics where I see future bridges built among cognitive neuroscience, cognitive psychology and education. Prediction is not a specific strategy, but perhaps, like retrieval, it is a cognitive process that is worth being aware of if you teach, so that you can identify the right spots where it can support students' learning.
While thinking about prediction, many classroom-related questions came to mind. Some of them are below; I'd be happy if you could share some of yours and keep this conversation going.
Is prediction practically guessing?
No. Guessing is not focused on a mismatch, and prediction is not about raising random options. Prediction is when we have enough prior knowledge and a missing link on which we focus. So it is definitely not the same thing. Based on what we learned, I predict that guessing will not have similar effects on curiosity and learning.
What is the value of deliberately inducing prediction? Isn’t it natural?
Yes, prediction is based on a natural process, but as always, the challenge with classroom-taught kind of knowledge is that it is not natural and links are not always obvious. Some children may use prediction implicitly and remember more. We may label them as innately curious. But as we saw, we can encourage everybody into a state of curiosity, by guiding their predictions.
Moreover, there are many other easier and “more natural” predictions to make in the classroom (for example: "I predict that the teacher won’t notice this paper ball flying toward the basket"), so like many other effective methods – by inducing focused predictions we are supporting everyone in building on their innate abilities and directing the effort to what we consider to be important.
How does prediction relate to desirable difficulties?
Desirable difficulties are effective practice methods like retrieval practice and distributed practice (2). The idea is that these strategies are effective when they are difficult. We can ask where the difficulty stems from, and one good answer is that it stems from the necessity to update meaningful connections. When we try to retrieve something that we have learned in the past, in a new context or using different kinds of cues, we may get stuck: our existing mental model doesn’t fully work and we find out we need to update it, that is, to create or reconstruct a link where one is missing. In that sense of updating, prediction is probably very relevant: we retrieve incorrectly or inaccurately, this is a prediction error, and by figuring out the right answer we gain from the new learning. I hope you can see how these lines of research may converge and bring us relevant insights for teaching.
References
1. Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36(3), 181-204.
2. Bjork, E. L., & Bjork, R. A. (2011). Making things hard on yourself, but in a good way: Creating desirable difficulties to enhance learning. Psychology and the real world: Essays illustrating fundamental contributions to society, 2(59-68).
3. Shing, Y. L., & Brod, G. (2016). Effects of prior knowledge on memory: Implications for education. Mind, Brain, and Education, 10(3), 153-161.
4. Gilboa, A., & Marlatte, H. (2017). Neurobiology of schemas and schema-mediated memory. Trends in Cognitive Sciences, 21(8), 618-631.
5. Van Kesteren, M. T. R., & Meeter, M. (2020). How to optimize knowledge construction in the brain. npj Science of Learning, 5(1), 1-7.
6. Lee, J. L. (2009). Reconsolidation: maintaining memory relevance. Trends in Neurosciences, 32(8), 413-420.
7. Hupbach, A., Gomez, R., Hardt, O., & Nadel, L. (2007). Reconsolidation of episodic memories: A subtle reminder triggers integration of new information. Learning & Memory, 14(1-2), 47-53.
8. Hupbach, A., Gomez, R., & Nadel, L. (2011). Episodic memory updating: The role of context familiarity. Psychonomic Bulletin & Review, 18(4), 787-797.
9. Sinclair, A. H., & Barense, M. D. (2018). Surprise and destabilize: prediction error influences episodic memory reconsolidation. Learning & Memory, 25(8), 369-381.
10. Pine, A., Sadeh, N., Ben-Yakov, A., Dudai, Y., & Mendelsohn, A. (2018). Knowledge acquisition is governed by striatal prediction errors. Nature Communications, 9(1), 1-14.
11. Kang, M. J., Hsu, M., Krajbich, I. M., Loewenstein, G., McClure, S. M., Wang, J. T. Y., & Camerer, C. F. (2009). The wick in the candle of learning: Epistemic curiosity activates reward circuitry and enhances memory. Psychological Science, 20(8), 963-973.
12. Gruber, M. J., & Ranganath, C. (2019). How curiosity enhances hippocampus-dependent memory: The prediction, appraisal, curiosity, and exploration (PACE) framework. Trends in Cognitive Sciences.
13. Brod, G., & Breitwieser, J. (2019). Lighting the wick in the candle of learning: generating a prediction stimulates curiosity. npj Science of Learning, 4(1), 1-7.
14. Breitwieser, J., & Brod, G. (2020). Cognitive prerequisites for generative learning: why some learning strategies are more effective than others. Child Development.
15. Crouch, C., Fagen, A. P., Callan, J. P., & Mazur, E. (2004). Classroom demonstrations: Learning tools or entertainment? American Journal of Physics, 72(6), 835-838.
16. Heath, C., & Heath, D. (2007). Made to stick: Why some ideas survive and others die. Random House.
17. Giebl, S., Mena, S., Storm, B. C., Bjork, E. L., & Bjork, R. A. (2020). Answer First or Google First? Using the Internet in ways that Enhance, not Impair, One’s Subsequent Retention of Needed Information. Psychology Learning & Teaching.
Blog:
18. Structural prior knowledge and the power of prediction by Michael Walsh