technically
learning

A Primer on Learning Theories

The following primer is intended to give a brief overview of learning theories that are influential in educational practice, education research, and educational technology. Existing resources appear to be either too long (e.g., education or psychology textbooks) or too short (e.g., blog posts on individual learning theories). There seems to be a dearth of pedagogical material in the middle, namely something that (1) can be read in a few hours, (2) covers the set of prominent learning theories discussed here in historical context, and (3) provides enough detail so that a student or newcomer to this field can begin to see the contours of the complex landscape of learning theories in education. This is my attempt to provide such a resource, first of all, for my own students, and second, for others who may find it useful.

I also found that other resources tend to oversimplify or misrepresent ideas. Here, I try to describe the nuanced views of the learning theorists as objectively as I can, without resorting to strawman arguments or misconstrued versions of the theories. But alas, with this much breadth, it is very easy for me to still misrepresent some ideas.

If you assign this primer in a course or otherwise find it helpful, I would appreciate it if you'd let me know! Please also let me know if you have any feedback or suggestions for improvement.

A PDF version (which can be cited) is available here.

What does it mean to learn? How does learning happen? How do we know when someone has learned something? And how do we use these insights to help people learn better, whether by improving teaching practices or by creating technologies that help people learn? While many people, including philosophers and laypeople, have pondered these questions for centuries, the questions became more prominent with the advent of psychology, a field that can give us scientific insights into how people learn. A variety of learning theories have been proposed from the late nineteenth century (when the field of experimental psychology first formed) to the present day.

A learning theory tries to explain how people learn, usually based on an accumulation of scientific evidence. (Technically, we could also develop theories about how animals or machines learn, and indeed such theories informed certain learning theories, as we discuss below.) Scientific experiments that study learning usually give us insights into how someone learns a specific thing in a specific time and place (e.g., how does a student learn when solving math problems in a classroom, or how does a person learn when given a puzzle to solve in a lab?). In reality, learning is a rich and complex phenomenon, and most, if not all, scientists would agree that there is no single kind of learning that can describe all the ways all people learn in all situations. Thus, in practice, what are often called learning theories are really worldviews for how to think about learning. These worldviews will often be influenced by some combination of scientific evidence, philosophical ideas about what it means to know and learn, values that determine what kinds of learning are valuable, and intuitions about how people learn. So when we discuss learning theories here, we are really talking about such worldviews, as reflected by the work of some of the key theorists and scientists who helped shape these theories. As you read the sections that follow, it may be useful to think about the extent to which various theories were informed by science, philosophy, and intuitions about how people learn.

In this primer, we will examine four broad learning theories that have been especially important in education and especially relevant to the design of educational technologies: behaviorism, cognitivism, constructivism, and socio-cultural theories (including situativism).

Behaviorism

Behaviorists generally see the mind as a black box. Behaviorists would claim either that (a) we don’t know what happens inside someone’s head, since all we know is what they perceive through their senses (stimuli) and how they behave (responses), or that (b) even if we do know what happens in someone’s head, changes in the head are ultimately reflected by changes in behavior, so we might as well study behavior; see Figure 1. Therefore, in order to scientifically study human action and learning, behaviorists committed themselves to studying human behavior. Moreover, in terms of behavior, humans and animals are similar in many ways, and so studying animals can give insights about humans as well. These claims might seem odd to us now that psychology, neuroscience, and other areas have developed sophisticated ways to study the mind and cognition. But when behaviorism first started forming in the late nineteenth century, it was reacting to dominant techniques in psychology at the time, like Freud’s psychoanalysis and parapsychology (i.e., studying “psychic” phenomena like telepathy and hypnosis), which were often seen as pseudoscientific.

Figure 1: Behaviorism sees the mind as a black box, and simply observes the relationship between stimuli and responses. In some forms of behaviorism, whatever happens in the mind is seen as irrelevant to human behavior, and impossible to study.

Two of the early pioneering researchers who had a great influence on what is now called behaviorism were Ivan Pavlov (1849-1936) and Edward Thorndike (1874-1949). Pavlov was a Russian physiologist who is best known in the field of psychology for his discovery of classical conditioning in experiments with dogs (“Pavlov’s dog”), which he published in 1897. When a dog sees food, a natural physiological response is to salivate. Salivation is an unconditioned response (UR) to food, an unconditioned stimulus (US). If the food is repeatedly accompanied by another stimulus, say the ringing of a bell, the dog will eventually begin to salivate upon hearing the bell alone. Thus, the dog now gives a conditioned response (CR; salivation) to a conditioned stimulus (CS; bell). But if the bell is repeatedly rung with no food, the CR will diminish over time (like The Boy Who Cried Wolf). Pavlov won the Nobel Prize in Physiology or Medicine in 1904. Although he saw himself as a physiologist and not a psychologist, his work became extremely influential in psychology and in establishing behaviorism. Indeed, Pavlov’s work forged a connection between the scientific study of physiology (namely, the instinctive behavior of animals) and (animal) psychology. As Pavlov (1904) noted in his Nobel Prize speech:


Since we used the studies of the lowly organized representatives of the animal kingdom as an example, and, naturally, wanted to remain physiologists instead of becoming psychologists, we decided to take an entirely objective point of view also towards the psychical phenomena in our experiments with animals.


In 1898, around the same time that Pavlov published his work introducing classical conditioning, Thorndike published his own work on animal experiments, introducing what he later called “the law of effect.” The law of effect states that the likelihood of a behavior will increase if it is associated with a satisfying result, and the likelihood of a behavior will decrease if it is associated with an unsatisfying result. While Pavlov was known for his dogs, Thorndike worked with a variety of animals to demonstrate the law of effect, perhaps most notably, cats. He showed that if a hungry cat is put into a cage with some escape mechanism, it will eventually figure out, by trial and error, how to escape and receive food. If the cat is repeatedly put into the same cage, it will learn to escape more quickly over time. Thorndike later showed similar results with humans that could be of educational value. For example, he showed that when subjects were given feedback on whether they were right or wrong on a series of tasks (e.g., estimating the lengths of a series of strips of paper), they improved over time. The right or wrong feedback in this case has a similar effect to the food given in the cat experiments. This was early scientific evidence for the importance of giving learners immediate feedback, which has become an important “best practice” in much educational technology. It is important to note that Thorndike did not self-identify as a behaviorist; indeed, he explained the law of effect in terms of changes happening among neurons. Nonetheless, his theory can be essentially understood as a behaviorist theory, and it laid the foundations of behaviorism.
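To make the law of effect a little more concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions, not Thorndike's formulation: the numeric "strength" scale, the fixed step size, the behavior names, and the function names `apply_law_of_effect` and `likelihoods` are all hypothetical.

```python
def apply_law_of_effect(strengths, behavior, satisfying, step=0.2):
    """Return updated behavior strengths after one trial.

    strengths: dict mapping a behavior name to a non-negative strength
               (a made-up scale used only for this illustration).
    satisfying: True if the behavior led to a satisfying result,
                False if it led to an unsatisfying one.
    """
    updated = dict(strengths)
    change = step if satisfying else -step
    # Strengths are clipped at zero so they can be read as relative weights.
    updated[behavior] = max(0.0, updated[behavior] + change)
    return updated


def likelihoods(strengths):
    """Normalize strengths into relative likelihoods that sum to 1."""
    total = sum(strengths.values())
    return {name: value / total for name, value in strengths.items()}


# A cat in a cage starts with three equally likely behaviors (hypothetical).
strengths = {"press_lever": 1.0, "scratch_bars": 1.0, "meow": 1.0}

for _ in range(5):
    # Assumption: pressing the lever opens the cage (satisfying result),
    # while scratching the bars achieves nothing (unsatisfying result).
    strengths = apply_law_of_effect(strengths, "press_lever", satisfying=True)
    strengths = apply_law_of_effect(strengths, "scratch_bars", satisfying=False)

print(likelihoods(strengths))
```

After a few trials the rewarded behavior dominates and the unrewarded one fades, which is the qualitative pattern the law of effect describes.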


Behaviorism became an established field in psychology with the work of John B. Watson (1878-1958). Watson’s influential variant of behaviorism was called methodological behaviorism, which claimed that since thoughts, feelings, and other “private events” of the mind cannot be objectively observed by scientists, psychology should focus only on behaviors (i.e., “public events”; Day, 1983; Graham, 2019). Watson is perhaps best known for his controversial “little Albert” experiment, in which he showed that fear could be classically conditioned in humans. When presented with loud noises (US), babies show fear (UR). After loud noises were repeatedly paired with a rat (CS), little Albert started showing fear (CR) when presented with the rat alone, which he had not previously feared. The poor kid also began to fear other furry creatures and even a Santa Claus beard.


Perhaps the best-known behaviorist, especially in the world of education, is B. F. Skinner (1904-1990). Beginning in the 1930s, Skinner advanced Thorndike’s “law of effect” into what is now called operant conditioning. Under Skinner’s formulation, one can “reinforce” a desired behavior in response to a stimulus by giving a “positive reinforcement” (e.g., food, reward, positive feedback) whenever the subject (whether human or animal) displays that behavior in response to the stimulus. Similarly, one can get the subject to stop exhibiting an undesirable behavior in response to a stimulus by giving a punishment whenever the subject displays that behavior in response to the stimulus.


Skinner did many of his experiments with rats and pigeons, and he showed that by reinforcing certain behaviors with food, the animals could be trained to exhibit highly complex behaviors. For example, Skinner showed that pigeons could play a version of “ping pong” (with pecks, not paddles) with one another, where they got food if the ball went past their opponent. During World War II, he also formed Project Pigeon, in which he demonstrated how pigeons could be used to guide bombs toward a target, but this project never came to fruition. Skinner further showed that one could chain a long series of stimulus-response pairs; once the subject gives the desired response, it acts as a stimulus for a new response, and so on, with only the last response being reinforced. Skinner (1965) gives a nice example of how to teach pigeons to do figure eight motions:


Suppose, for example, it is decided that the pigeon is to pace a figure eight. The demonstrator cannot simply wait for this response to occur and then reinforce it. Instead he reinforces any current response which may contribute to the final pattern—possibly simply turning the head or taking a step in, say, a clockwise direction. The reinforced response will quickly be repeated (one can actually see learning take place under these circumstances), and reinforcement is then withheld until a more marked movement in the same direction is made. Eventually only a complete turn is reinforced. Similar responses in a counterclockwise direction are then strengthened, the clockwise movement suffering partial extinction. When a complete counterclockwise movement has thus been shaped, the clockwise turn is reinstated, and eventually the pigeon makes both turns in succession and is reinforced. The whole pattern is then quickly repeated, QED. The process of shaping a response of this complexity should take no more than five or ten minutes. (p. 430-431)

Skinner’s variant of behaviorism was known as radical behaviorism. It differed from Watson’s methodological behaviorism in that Skinner did not think private events were completely irrelevant; rather, he thought of private events as behaviors themselves, which could be reinforced or which could act as stimuli for other private or public events (Day, 1983), as depicted in Figure 2. This might sound a bit abstract. To understand this more concretely we can look at an educationally-relevant example that Skinner (1965) provides:


What happens when a student memorizes a poem? Let us say that he begins by reading the poem from a text. His behavior is at that time under the control of the text, and it is to be accounted for by examining the process through which he has learned to read. When he eventually speaks the poem in the absence of a text, the same form of verbal behavior has come under the control of other stimuli. He may begin to recite when asked to do so—he is then under control of an external verbal stimulus—but, as he continues to recite, his behavior comes under the control of stimuli he himself is generating (not necessarily in a crude word-by-word chaining of responses). In the process of ‘memorizing’ the poem, control passes from one kind of stimulus to another. (p. 437)


When Skinner speaks of control “pass[ing] from one kind of stimulus to another,” he means that initially the written poem was the stimulus, and over time, the words of the poem (perhaps as imagined in the student’s mind) became the stimulus. The student “remembers” the poem by using words, rhymes, or the meaning of the poem as internal stimuli that cue the next word he needs to recite. Notice that Skinner does not speak of what internal representations the student is using to remember the poem (words? lines? images of the words? sounds of the words?); these are private events that might vary from student to student. The important thing is that the student can still be conditioned using an external stimulus to start (the whole poem) that eventually (through a series of private stimuli) evokes a public response (reciting the poem). The goal of radical behaviorism is not to speculate about what these private events are, but rather to manipulate them as behaviors.

Ivan Pavlov is known for classical conditioning.

Edward Thorndike is known for the law of effect.

Immediate feedback is motivated by the law of effect.

John B. Watson pioneered behaviorism as a field via methodological behaviorism.

B. F. Skinner is known for operant conditioning.

Radical behaviorism is Skinner's variant of behaviorism, which acknowledges the relevance of private events.

Figure 2: Radical behaviorism still treats the mind as a black box, but acknowledges the existence of private events and treats them as behaviors that can be manipulated by presenting stimuli, even though they cannot be observed.

This example also serves as a segue to Skinner’s educational thought. In the 1950s-1960s, Skinner developed ideas around how to teach students using his theory of operant conditioning. Indeed, after presenting the example above, Skinner (1965) gave an illustration of how a teacher can teach students to learn a poem by successively presenting the poem with more and more letters missing. The letters that remain can act as stimuli for the missing letters, which can act as stimuli for other missing letters, etc. Basically, Skinner believed practically anything could and should be taught using the appropriate sequence of stimuli and reinforcements. This turned into the concept of programmed instruction, where a student would be “programmed” by answering a series of successive questions with immediate feedback. Skinner developed mechanical teaching machines that could teach students using such programs.
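Skinner's poem illustration, in which the text is presented again and again with more and more letters missing, can be sketched in a few lines of Python. This is only my own minimal sketch: the function name `fade_text`, the underscore placeholder, the seeded shuffle that decides which letters disappear first, and the sample verse are all assumptions rather than anything taken from Skinner's text.

```python
import random


def fade_text(text, fraction_hidden, seed=0):
    """Return `text` with a given fraction of its letters replaced by "_".

    Reusing the same seed keeps the hiding order fixed, so every letter
    hidden at a smaller fraction stays hidden at a larger one.
    """
    letter_positions = [i for i, ch in enumerate(text) if ch.isalpha()]
    random.Random(seed).shuffle(letter_positions)
    hide_count = round(len(letter_positions) * fraction_hidden)
    hidden = set(letter_positions[:hide_count])
    return "".join("_" if i in hidden else ch for i, ch in enumerate(text))


# Placeholder verse (not from Skinner's text); each pass hides more letters.
line = "Twinkle, twinkle, little star"
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(fade_text(line, fraction))
```

In the spirit of programmed instruction, a student would recite the full line at each pass, with the remaining letters serving as stimuli for the missing ones.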


Behaviorism lost popularity after the 1960s, at least in part due to what has been called the “cognitive revolution.” Many now object to various aspects of behaviorism. First, behaviorists’ focus on animals makes it seem as though behaviorists view teaching human learners as identical to training animals. Second, the little Albert experiment was highly controversial, although Skinner himself also critiqued that experiment (despite its scientific significance). Third, many might find the use of punishment objectionable, but it is important to note that Skinner did not advocate for punishments in education; he saw reinforcement as being more effective in the long run (and perhaps more ethical). Finally, many critique behaviorism because of its disregard for the mind and consciousness. By reducing people to animals motivated by rewards and punishments, it did away with what many think education is all about: intellectual pursuit, critical thinking, discovery, and creativity. Interestingly, Skinner actually believed behaviorist principles could be used to teach these things. To Skinner, thinking was yet another behavior that should be reinforced.


The cognitive revolution was a way to move away from these unpopular aspects of behaviorism. But it is important to note that behaviorism was not entirely replaced by cognitivism. Indeed, behaviorism still exists today in certain subfields of psychology and psychotherapy (such as applied behavior analysis), even though it has lost explicit popularity in education. Nonetheless, many aspects of behaviorism still impact educational practice, and behaviorist principles can still be found in the design of educational technology. Indeed, cognitivists actually borrowed from and built on behaviorism.




Programmed instruction and teaching machines were developed by Skinner as applications of behaviorism to education.

Cognitivism

Cognitivists opened the black box that behaviorists preferred to leave closed. If behaviorists approached human intelligence by studying animal intelligence, cognitivists learned about human intelligence by studying artificial intelligence. The field of cognitive science emerged in the late 1950s as an alternative to behaviorism. The year 1956 is widely regarded as the birth of cognitive science, or the “cognitive revolution” (Gardner, 1987; Simon, 1980). Two seminal events took place that year:


  1. The “Dartmouth Summer Research Project on Artificial Intelligence” was held at Dartmouth College. In the proposal for this workshop, John McCarthy, who organized it, stated that “The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.” McCarthy actually coined the term “artificial intelligence” (AI) in the proposal. He and others invited a group of researchers who were interested in machine intelligence. Among the participants were Herbert Simon and Allen Newell, who presented the Logic Theorist, a computer program that could prove mathematical theorems, a hallmark of human intelligence. (Can Skinner’s pigeons do that?)

  2. The Symposium on Information Theory was held at the Massachusetts Institute of Technology in September 1956. George Miller (2003) claims that September 11, 1956, the second day of the three-day symposium, was the day cognitive science was born. Several papers presented that day approached cognitive science from different fields.

  • Simon and Newell presented their work on the Logic Theorist that was also presented at the Dartmouth workshop.

  • Noam Chomsky, the now famous linguist and political activist, presented his work on a theory of grammar, whereby a few rules in the human mind can be used to form complex grammatical sentences. Chomsky also used this to separate syntax (grammar) from semantics (meaning). His theory was published in a book called Syntactic Structures the following year. In 1959, Chomsky published a review of Skinner’s book Verbal Behavior that probably influenced the shift from behaviorism to cognitive science.

  • Miller himself presented his work published in a paper called “The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information.” This paper compiled results from previous studies to show that humans have a short-term memory capacity of about seven “chunks” at a time. This article defined the concept of chunking, whereby people can group related concepts to store them as single entities (or chunks) in memory.


What stands out about these two events is the role that computer science, artificial intelligence, and information theory played in the development of cognitive science. Indeed, the “mind as computer” became a powerful metaphor in cognitive science, and the central metaphor in a sub-discipline of cognitive science: information-processing psychology, or what is often simply called “cognitivism” in education. Figure 3 shows a simplified representation of how cognitivists represented the mind as a computer with different kinds of “cognitive architectures,” which consist of various interlocking components that can account for memory, problem solving, learning, etc.

Chunking is a central mechanism of memory storage in cognitive theories.

Figure 3: Cognitivism opens up the black box of the mind and treats it as a computer. Various cognitive architectures for the mind have been proposed that decompose the mind into different modules and processes that work together, which could include long-term memory, working memory, chunking, learning, and production systems.

Cognitivism or information-processing psychology grew most prominently out of the work of Herbert Simon (1916-2001), Allen Newell (1927-1992), and their colleagues at Carnegie Mellon University. As mentioned earlier, Simon and Newell were two of the pioneers of artificial intelligence, and created one of the first AI programs, the Logic Theorist. This work led them on decades of research that simultaneously investigated how people think and solve problems and how machines could think. They did this by studying people completing problem-solving tasks in lab studies, and then using the data collected from people to create computer programs that could solve the same problems. Thus, their work simultaneously informed the fields of cognitive psychology and AI. The psychological implications of their work naturally extended to the field of education.


While early work was focused on how people think, cognitivism later developed theories on how people learn. Learning involves the acquisition of knowledge, often described by a production system, a set of rules that a person must acquire in order to successfully complete a task (Klahr, Langley, & Neches, 1987; Newell, 1973). Productions apply when certain conditions on elements in working memory are satisfied (e.g., there are two numbers with a plus sign between them followed by an equals sign). When a production is applied, it results in specific changes to elements in working memory (e.g., add the numbers together and write the resulting number after the equals sign). Information-processing theories have proposed various mechanisms for creating new production rules, modifying or refining existing rules, and applying rules in the appropriate situations.
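The addition example above can be written as a tiny production system. What follows is a minimal sketch of the general idea, not a reconstruction of any particular cognitive architecture: representing working memory as a Python list, the names `matches_addition`, `write_sum`, `PRODUCTIONS`, and `run`, and the "first matching rule fires" strategy are all my own assumptions.

```python
def matches_addition(memory):
    """Condition: memory holds two numbers with '+' between them, then '='."""
    return (
        len(memory) == 4
        and isinstance(memory[0], int)
        and memory[1] == "+"
        and isinstance(memory[2], int)
        and memory[3] == "="
    )


def write_sum(memory):
    """Action: add the numbers and write the result after the equals sign."""
    return memory + [memory[0] + memory[2]]


# Each production pairs a condition on working memory with an action.
PRODUCTIONS = [
    ("add-two-numbers", matches_addition, write_sum),
]


def run(memory, productions, max_cycles=10):
    """Apply the first production whose condition holds until none applies."""
    for _ in range(max_cycles):
        for _name, condition, action in productions:
            if condition(memory):
                memory = action(memory)
                break
        else:
            return memory  # no production applied, so processing stops
    return memory


print(run([3, "+", 4, "="], PRODUCTIONS))  # → [3, '+', 4, '=', 7]
```

Learning, in this picture, would amount to adding new entries to the list of productions or refining the conditions of existing ones.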


Cognitivism has been highly influential in the design of educational technologies. Indeed, it inspired decades of work on intelligent tutoring systems, including the cognitive tutors developed by Simon and Newell’s colleagues at Carnegie Mellon. These tutoring systems are encoded with a set of rules (i.e., a production system) for how to solve a problem; since the system knows how to solve the problem, it can guide students in a step-by-step fashion to acquire the rules and apply them at the right time.
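As a rough illustration of step-by-step guidance, the sketch below checks a student's work on an equation of the form a*x + b = c against the steps the tutor's own rules produce. It is a deliberately simplified toy and not how the cognitive tutors are actually implemented; the two rules, the hint wording, and the names `expected_steps` and `tutor_feedback` are hypothetical.

```python
def expected_steps(a, b, c):
    """Steps the tutor's own rules produce for a*x + b = c (assumes a != 0)."""
    after_subtract = c - b         # rule 1: a*x = c - b
    solution = after_subtract / a  # rule 2: x = (c - b) / a
    return [
        ("subtract the constant from both sides", after_subtract),
        ("divide both sides by the coefficient", solution),
    ]


def tutor_feedback(a, b, c, student_values):
    """Compare each student step with the tutor's and give immediate feedback."""
    feedback = []
    for (hint, expected), given in zip(expected_steps(a, b, c), student_values):
        if given == expected:
            feedback.append("correct")
        else:
            feedback.append(f"try again: {hint}")
            break  # stop at the first error so the student can fix it
    return feedback


# For 2x + 3 = 11, a student writes 2x = 8 and then x = 4.
print(tutor_feedback(2, 3, 11, [8, 4]))
# Another student adds 3 instead of subtracting it.
print(tutor_feedback(2, 3, 11, [14, 7]))
```

Because the system can solve the problem itself, it can respond to each step as soon as the student takes it, which echoes the emphasis on immediate feedback discussed earlier.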


Cognitivists also developed theories about the importance of the cognitive load that a task places on a person’s working memory capacity (Sweller, 1988). According to cognitive load theory, one should design instructional materials to minimize extraneous cognitive load. As a very simple example, people are not very good at solving multiplication problems with big numbers in their heads, because they cannot store all the numbers involved in their working memory to execute the task; if they have some scratch paper to use, however, they can effectively solve the problem. Technologies that help reduce cognitive load can therefore benefit a learner’s ability to learn.

Herbert Simon and Allen Newell are two of the pioneers of information-processing psychology or cognitivism.

Production systems can be used to explain how we learn rules.

Intelligent tutoring systems are educational technologies often founded on cognitivist principles.

Cognitive load affects our ability to successfully complete tasks.

Cognitivism ≠ Cognitive Psychology ≠ Cognitive Science

So far, I have not been clear about the meaning of some important terms. While it is fair to refer to behaviorism as behaviorist psychology, cognitivism (at least in the sense used here) is not the same as cognitive psychology; rather, it is one branch of cognitive psychology. I am using the term cognitivism as a synonym for information-processing psychology. While the latter term might be clearer, it is a bit long and doesn’t fit the nice pattern of -isms (to contrast it with behaviorism, constructivism, and situativism). Moreover, in the field of education, the term cognitivism is commonly used to refer to information-processing psychology. Cognitive psychology is the branch of psychology that is primarily interested in studying the mind, but not all cognitive psychologists highlight the “mind as computer” metaphor. For example, many constructivists and situativists (discussed below) may also be viewed as cognitive psychologists, but they are not cognitivists as they view the mind differently or disagree with certain aspects of information-processing psychology.

Moreover, cognitive psychology is not the same as cognitive science; rather, it is one branch of cognitive science. Figure 4 shows the relationship between cognitivism, cognitive psychology, and cognitive science. Cognitive science is an interdisciplinary field that brings together researchers interested in studying the mind. It is commonly said that cognitive science draws on six different disciplines: psychology, philosophy, linguistics, anthropology, neuroscience, and artificial intelligence. So some cognitive scientists are predominantly psychologists, and would likely be identified as cognitive psychologists. Some cognitive scientists might be predominantly philosophers and identified as working in the branch of philosophy called the philosophy of mind. But cognitive science did not just give a name to a group of people from isolated fields who did not communicate with one another; some cognitive scientists worked at the interface of different fields. As mentioned earlier, cognitivists were working at the intersection of (cognitive) psychology and artificial intelligence.

Figure 4: The relationship between cognitivism, cognitive psychology, and cognitive science. Cognitive psychology is one of six fields that contribute to cognitive science; the other fields are also shown here, but note that these fields are not entirely contained by cognitive science. (For example, not all of anthropology or philosophy would be considered cognitive science!)

Constructivism

In the 1980s and 1990s, constructivism was gaining popularity as a theory within cognitive science that could act as an alternative to cognitivism. Constructivists believe that all knowledge is actively constructed by the learner. They took issue with cognitivists’ rigid view of the mind as a computer, a metaphor that could not fully account for the diverse experiences that learners have, for how learners develop over time (including the misconceptions that form along the way), or for learning in more open-ended tasks. But the roots of constructivism date back much further, so let us trace the origins of constructivist thought.


While the cognitive revolution happened in the 1950s and took off in the decades that followed, some researchers (especially outside the US) had been investigating the development of the mind before cognitive science formally emerged as a field. Perhaps the most prominent of these researchers is the Swiss psychologist Jean Piaget (1896-1980). Piaget is often regarded as a developmental psychologist, but he saw himself as a genetic epistemologist (Papert, 1999). The term genetic epistemology, which Piaget used to refer to his work, may sound a bit confusing at first (indeed, it did to me), so it is worth breaking it down. Epistemology literally means “theory of knowledge” and it is a branch of philosophy that tries to address questions such as “what does it mean to know something?” and “how do we come to know?” The adjective “genetic” has to do with the origins of something (like genetics in biology). By calling the field genetic epistemology, Piaget was drawing attention to the fact that he was interested in the developmental origins of knowledge in children, when they first come to attain knowledge about things.


Piaget is probably most well known for his theory of stages of cognitive development. Through in-depth studies he conducted, beginning with his own children, Piaget found several stages in which a child’s mind develops. But of more relevance to the present discussion, Piaget described how developmental changes happen gradually by discussing how children construct their own knowledge. He posited that when faced with a new piece of information, a person can either go through a process of assimilation or a process of accommodation. Assimilation is where new information is understood in such a way that it can be incorporated into existing knowledge structures, called schemes. In accommodation, the person must change their schemes in light of this new information. For example, when students learn about multiplication for the first time, they might assimilate it into their existing understanding of numbers and addition. At first, perhaps they will mistakenly perform addition instead of multiplication, but ultimately they may come to understand multiplication as repeated addition. On the other hand, when students are first introduced to imaginary numbers, they may need to accommodate by fundamentally changing their understanding of what a number is. Ultimately, the goal is to use assimilation and accommodation to adapt to the world. Constructivism therefore posits learning as a process of constructing knowledge.


According to Ernst von Glasersfeld (1990), Piaget’s constructivism can be distilled to the following points:

  1. Knowledge is not passively received either through the senses or by way of communication. Knowledge is actively built up by the cognizing subject.

  2. a. The function of cognition is adaptive, in the biological sense of the term, tending towards fit or viability;
    b. Cognition serves the subject's organization of the experiential world, not the discovery of an objective ontological reality.


Ernst von Glasersfeld (1917-2010) claimed that Piaget’s constructivism was not just a psychological learning theory, but also a philosophical theory. He called his interpretation of Piaget’s constructivism radical constructivism. This can be seen in points 2a and 2b above. According to von Glasersfeld, we cannot know the world as it is (in fact, we can never know if our knowledge is an accurate representation of an objective reality); rather, our goal is to create knowledge structures that let us meaningfully function in the world. In this sense, the theory is “radical,” because it dispenses with traditional epistemological views that our knowledge represents reality. As an epistemological theory, constructivism has had wide-ranging implications beyond just learning in children. For example, is the goal of the scientist to discover the laws of the natural world? To radical constructivists, scientists cannot know the world as it is, but can construct realities that are viable, that can help us make sense of the world around us (von Glasersfeld, 2001).


Constructivism can also help explain why students often construct misconceptions. Misconceptions are built based on our prior experiences. Some misconceptions can be hard for students to let go of, because they can find ways to assimilate new information into their schemes. However, when new information clashes with a student’s conception, they will be pushed to modify their schemes to accommodate the new information. Cognitivists might advocate for directly teaching proper understandings to replace misconceptions. On the other hand, constructivists believe that prior conceptions are a route towards more viable understandings. Indeed, misconceptions can be very useful in many situations (e.g., the idea that “multiplying numbers together makes bigger numbers” is often correct); students just have to refine their conceptions over time so that they become more widely applicable (Smith III, diSessa, and Roschelle, 1994).


While finishing up his second PhD in mathematics at the University of Cambridge, Seymour Papert (1928-2016) went to study in Geneva with Piaget from 1958 to 1963. In 1963, he moved to the Massachusetts Institute of Technology to work on the burgeoning field of artificial intelligence with Marvin Minsky, one of the original pioneers of AI who was present at the Dartmouth workshop along with Simon and Newell in 1956. Papert and Minsky had a different approach to studying artificial intelligence compared to Simon and Newell. Papert and Minsky were interested in creating machines that could take inspiration from how children learn and develop (à la Piaget) rather than how adults solve problems. In 1967, Papert created a programming language called Logo to teach children programming, mathematics, and other topics. This became a highly influential educational technology, which we will discuss later in the course. His work with Logo, his studies of children, and his work in artificial intelligence were interrelated, and led to the formulation of his own version of constructivism called constructionism. Constructionism agrees with constructivism that people learn by constructing their own knowledge. But what differentiates constructionism from constructivism according to Papert (1987) is that, “Constructionism reminds us that the best way to do that is to build something tangible – something outside your head – that is also personally meaningful.”


Constructivism existed in Piaget’s writings as early as the 1930s, but the philosophical ideas of constructivism have been proposed (and strongly rejected by critics) throughout history. von Glasersfeld (1990) cites the philosophers Giambattista Vico (1668-1744) and Immanuel Kant (1724-1804) as being constructivists. For example, Vico claimed that “The human mind can know only what the human mind has made” (Vico, 1710, cited in von Glasersfeld, 1990). However, these ideas were not just limited to Western philosophers. For example, in a saying attributed to Muhammad al-Baqir (677-733), an important religious leader in Shi’i Islam, “Everything you discern with your imagination, however precise in its meaning, is a created, fashioned thing like yourselves, and it will return back to you.” However, Piaget was possibly the first to study constructivism from a psychological and developmental perspective, making it a scientific theory and not just a philosophical one.


Nonetheless, as mentioned earlier, constructivism only gained in popularity in education beginning in the 1980s, as an alternative to cognitivism. This was through the efforts of von Glasersfeld, Papert, Jerome Bruner (another pioneer of the cognitive sciences) and others. Figure 5 shows how constructivism fits in the broader space of cognitive science. Through its strong focus on epistemology, constructivism links cognitive psychology with philosophy, and through Papert and Minsky’s work on artificial intelligence from a more constructivist perspective, it also links to AI.


Constructivism has been used to advocate for the use of discovery learning or inquiry-based learning in the classroom, whereby the students have to discover concepts and procedures for themselves (under the guidance of the teacher and/or peers), rather than being directly instructed on what they need to know.


It is important to note that while constructivism emphasizes the construction of knowledge, and the role of prior knowledge and experiences in constructing that knowledge, virtually all cognitivists would also agree that knowledge construction must be an active process on the part of the learner that builds on prior knowledge. However, the constructivist worldview (and epistemology) puts a lot more emphasis on the fact that knowledge is constructed, and because everyone constructs their own realities, it puts more emphasis on individual differences in prior knowledge and how learners learn and develop over time. Cognitivism, on the other hand, emphasizes how experts (and computer programs) solve tasks, and tries to understand how novices can learn to solve those tasks in the same way. In short, the two theories put emphasis on different aspects of cognition. Shortly after constructivism began gaining popularity among educators, another alternative to cognitivism was emerging that put emphasis on yet another aspect of cognition.








Jean Piaget is recognized as a pioneer of constructivism and genetic epistemology in psychology.






assimilation and accommodation are the two processes that make up constructivist learning










Ernst von Glasersfeld is known for radical constructivism.















Seymour Papert is known for constructionism.























discovery learning and inquiry-based learning are instructional strategies associated with constructivism


Figure 5: An updated version of Figure 4 showing the relationships between constructivism and situativism and the various fields contributing to cognitive science.

Situativism and Socio-Cultural Theories

Figure 6: The path from Baker Hall (Carnegie Mellon University) to the Learning Research and Development Center (University of Pittsburgh), homes to much of the intellectual activity that led to the development and advancement of learning theories, such as cognitivism (Baker Hall, LRDC) and situativism (LRDC).

Suppose you just stepped outside of Baker Hall, Carnegie Mellon University’s psychology building where Simon and Newell’s offices used to be. You decide to take a twenty minute stroll (see Figure 6). You walk down Frew Street passing by Schenley Park and take a right on Schenley Drive as you gaze at the conservatory to your left. As you walk down Schenley Drive, you realize at some point you have now entered the University of Pittsburgh’s urban campus. Being fond of learning, perhaps you take a moment to visit the Cathedral of Learning, the tallest educational building in the Western Hemisphere (and second tallest in the world). But eventually, you continue your walk, and after a couple more turns you end up at the University of Pittsburgh’s architecturally interesting Learning Research and Development Center (LRDC). While Carnegie Mellon was establishing itself as a pioneer in the newly developing areas of cognitive science and artificial intelligence in the 1960s, the neighboring campus’ LRDC was a pioneer in studying “the interaction between learning research in the behavioral sciences and instructional practice in the schools”.


Depending on the decade in which you visited the LRDC, a different ethos and pervading learning theory would be present. In the 1960s, behaviorism was the prominent learning theory. Beginning in the 1970s, cognitivism was spreading (perhaps from the efforts of Simon, Newell, and other neighbors down the street). In the late 1980s, things began to change again. Like many of her LRDC colleagues, Lauren Resnick began her career as a behaviorist and switched to cognitivism in the 1970s. In 1986, she became the President of the American Educational Research Association and in 1987, she gave her AERA Presidential Address, “Learning in School and Out.” In this talk, she laid out how learning that takes place in schools is often quite different from the learning that takes place outside of schools, drawing on recent research from fields such as anthropology. She discussed four key ways in which the kind of learning that takes place in everyday settings is different. In short, learning in school tends to focus on individual cognition and relies on abstract and general procedures (e.g., following a step-by-step procedure to solve a math equation in your head), while in the real world, learning and cognition are context-dependent, social, and can rely on the use of tools and objects in the environment. For example, Resnick (1987) cites Scribner’s (1984) anthropological study of how dairy workers go about doing their job in a dairy warehouse. One of the workers explained “how he filled an order for half a case” (Resnick, 1987):


I walked over and I visualized. I knew the case [of size 16] I was looking at had ten out of it, and I only wanted eight, so I just added two to it...I don’t never count when I’m making the order. I do it visual, a visual thing, you know. (Scribner, 1984, p. 26, cited in Resnick, 1987)


Resnick (1987) cites another study conducted by Olivia de la Rocha:


a person was observed solving the problem of measuring out three-fourths of two-thirds of a cup of cottage cheese. Instead of multiplying the fractions, he used a measuring cup to find ⅔ of a cup of cottage cheese. Then he patted the cheese into an approximately round pancake, divided it into quarters, and used three of the quarters. In this case, the cottage cheese itself served as part of the computation. (p. 14).
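For comparison, the school-style computation that the person sidestepped can be written out as a worked equation. This is only an illustration of the symbolic route; it is not part of Resnick’s or de la Rocha’s account.

```latex
\[
\frac{3}{4} \times \frac{2}{3} = \frac{3 \times 2}{4 \times 3} = \frac{6}{12} = \frac{1}{2}\ \text{cup}
\]
```

Both routes arrive at half a cup: three of the four quarters of the two-thirds-cup pancake is the same quantity, reached by manipulating the cheese rather than the symbols.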


Resnick was introducing a new learning theory to education researchers: situativism (often called situated cognition, situated learning, or situativity theory). Situativism arose as a reaction to cognitivism. Many cognitivists (like Resnick) began to realize the limitations of cognitivism. As many people see it, cognitivism assumes cognition takes place in an individual head, while situativism suggests that cognition and learning are inherently social. Cognitivists assume cognition and learning involve abstract symbol manipulation and they studied this largely by observing people doing problem-solving tasks in lab studies, while situativists pay heed to the fact that cognition and learning are situated in a particular context (e.g., measuring a certain amount of cottage cheese vs. solving a pen and paper fractions problem).


In reality, cognitivists generally do acknowledge that context matters and that learning can be social. They would claim that symbol manipulation can actually take the context into account, and thus modeling the mind as a symbol-manipulating computer can actually extend to situativist theories as well (Vera & Simon, 1993). In practice, however, most cognitivist theories have been developed in the context of well-defined tasks, often involving a single person. The inherent complexity of the real-world social contexts that situativists care about appears to be difficult to model computationally, and so situativists choose to study real-world cognition and learning using qualitative methods.


If the “cognitivist revolution” took place in 1956, we might say that the “situativist revolution” took place in 1987. In addition to Resnick’s talk, around the same time, several interdisciplinary books were published arguing that artificial intelligence was limited because it did not account for the situated nature of thinking (Suchman, 1987; Winograd and Flores, 1986). But perhaps the most significant event was the formation of the Institute for Research on Learning. James Greeno (1935-2020) was one of Resnick’s colleagues at LRDC who had also gone from being a prominent behaviorist researcher to being a leading cognitivist. He left the LRDC in 1984 to become a professor at the University of California, Berkeley. In 1987, along with John Seely Brown, he founded the Institute for Research on Learning (IRL) in Palo Alto, California, a West Coast version of the LRDC. Greeno would return to the University of Pittsburgh and the LRDC in 2003, and he remained there as a visiting professor until he passed away in 2020.


The IRL brought together researchers from different institutions and was one of the most important places for the development of situativism. Two researchers at the IRL, Jean Lave and Etienne Wenger, were working on their highly influential anthropological studies of learning. Lave began her career at the University of California, Irvine (UCI) in 1966 (the second year after UCI was founded) but later moved to the University of California, Berkeley and was affiliated with the IRL. Wenger was doing his PhD in computer science at UCI two decades later, studying the application of artificial intelligence to learning in the workplace. He came to realize that “artificial intelligence as it was conceived of was too narrow for such an enterprise” and that “the traditions of information-processing theories and cognitive psychology did address questions about learning but did so in a way that seemed too out of context to be useful” (Wenger, 1990, p. 3).


Lave and Wenger formulated a new conception of learning. Unlike behaviorists, who saw learning as changes in behavior, and cognitivists, who saw learning as the acquisition of knowledge, Lave and Wenger conceptualized learning as becoming a participant in a community of practice. According to Wenger (1990):


The basic argument is that knowledge does not exist by itself in the form of information, but that it is part of the practice of specific sociocultural communities, called here “communities of practice.” Learning then is a matter of gaining a form of membership in these communities: this is achieved by a process of increasing participation, which is called here “legitimate peripheral participation.” Learning thus is tantamount to becoming a certain kind of person. (p. xv)


A variety of other theories can also be placed under the umbrella of situativism. Two of these have been particularly influential in education: distributed cognition and embodied cognition. Distributed cognition claims that cognition does not need to take place in an individual’s head, but rather is often distributed across people, tools, and artifacts. A simple case of distributed cognition is off-loading certain aspects of cognition to external tools or objects in the environment. For example, a person using a calculator in the process of solving a complex math problem is utilizing the calculator’s ability to handle certain computations (including computations like taking a square root or logarithm, which would be very difficult, if not impossible, for most people to do on their own). Even when solving a math problem using pen and paper, the person is off-loading some of the cognitive load of problem solving to an external tool. Interestingly, in de la Rocha’s study cited above, the individual who was trying to measure a certain amount of cottage cheese actually used the cottage cheese itself to carry out the computation needed. Finally, people learning collaboratively might be able to reason through problems in ways that they could not if they were working alone. Pea (1993) claims that the cognition is still being carried out in the heads of people, but the intelligence is distributed across the entire system of people, tools, and environment. Others claim that the cognitive processes themselves are distributed: “Cognitive processes ain’t (all) in the head!” (Clark and Chalmers, 1998). Either way, distributed cognition has implications for teaching, instructional design, and educational technology; broadly speaking, the environment, tools, and opportunities for collaborative learning can be designed to take advantage of the ways in which learners might distribute their cognition in the process of learning.


Embodied cognition claims that the individual’s body also plays an important role in cognition (Johnson, 1989). Children use their fingers in learning counting and arithmetic; children and adults use gestures when speaking. This can be interpreted as a specific form of distributed cognition (i.e., cognition is distributed between the mind and body, possibly in addition to tools, other people’s minds, and other people’s bodies). However, some researchers interpret embodied cognition as being more inherent than other forms of distributed cognition, since the body is biologically connected to the mind, and as such the mind cannot be understood in isolation from the body. Such researchers reject the mind-body dualism that many cognitivists and constructivists hold on to. Embodied cognition has many implications for education. For example, teachers can encourage students to use their bodies as they learn, rejecting the idea that formalisms must be introduced prior to more applied and embodied ways of knowing (Nathan, 2012; Shapiro & Stolz, 2019). Moreover, students naturally use gestures during learning and cognition. These gestures may give clues as to a student’s understanding, including potential misunderstandings that a student might hold (Shapiro & Stolz, 2019).


The shift to the situated nature of cognition was a radical change, and one that many cognitivists could not accept. Such a theory could not account for the large amount of data that cognitivists had accumulated on how people solve formal problems and learn to do so. But many former cognitivists, such as Resnick, Greeno, Brown, and Wenger, accepted that situativism is the only way to account for the kinds of learning that happen all the time outside of formal school settings and psychology laboratories. Figure 5 shows how situativism fits into the broader scope of cognitive science by connecting cognitive psychology with anthropology and philosophy. In addition to its roots in anthropology, situativist research was often motivated by the phenomenological philosophy of Martin Heidegger, Maurice Merleau-Ponty, Hubert Dreyfus, and others (Gallagher, 2008; Shapiro & Spaulding, 2021). While many situativists originally came from artificial intelligence, AI does not play a prominent role in situativity theory (unlike cognitivism and constructivism).












Lauren Resnick became an advocate of situativism in the 1980s.
















situativism emphasizes the fact that all learning is inherently situated in a social and cultural context












James Greeno was an advocate of situativism and founder of the Institute for Research on Learning.



Jean Lave and Etienne Wenger are known for the concept of community of practice.














distributed cognition is a particular situativist theory that puts emphasis on the distributed nature of cognition








embodied cognition is a particular situativist theory that puts emphasis on the use of the body in cognition

Socio-Cultural Theories

Situativists were not the first people to realize that learning was inherently social. It is worth repeating that situativism arose as a reaction to the limitations of cognitivism, and in doing so, evoked earlier socio-cultural theories of learning and combined them with the growing field of cognitive science. So where did socio-cultural theories come from?


About a year before Pavlov published his work on classical conditioning and only a few months after Piaget was born, another highly influential Russian psychologist, Lev Vygotsky (1896-1934), was born. Although he had a short life (due to tuberculosis), Vygotsky is now one of the most well-known psychologists in the world of education. Vygotsky was a developmental psychologist, but he is especially known for introducing what is often called a sociocultural or cultural-historical approach to developmental psychology. However, his influence, especially in the US, came long after his death. To understand how his thought entered the US psychology scene, we must turn to Michael Cole (b. 1938).


After completing his PhD in 1962, Cole spent a year in the Soviet Union, where he studied with one of Vygotsky’s students and another important Russian psychologist, Alexander Luria. A few years later, Cole served as a faculty member at UCI from 1966-1969, overlapping with Jean Lave’s time there. (Cole and Lave would collaborate with one another many years later.) Cole and some colleagues formulated a socio-cultural theory called cultural-historical activity theory, which was basically their interpretation of the thought of Vygotsky, Luria, and other Russian psychologists. According to this theory, learning and activity are situated in particular cultural and historical contexts and are mediated by tools. In 1978, Cole and others published an edited volume of translations of some of Vygotsky’s writings called Mind in Society. This was a highly influential text that spread Vygotsky’s popularity in the US. Citations of Vygotsky (1978) became so popular that I think many people would have likely believed Vygotsky himself wrote this book in the 1970s. Recently, many researchers have noted that the ideas presented in Mind in Society are not completely accurate renditions of Vygotsky’s thought. As such, the version of Vygotsky that people know may not be very accurate.


Perhaps Vygotsky’s most famous idea is the zone of proximal development (ZPD). According to the translation in Mind in Society, the ZPD is defined as:


the distance between the actual developmental level as determined by independent problem solving and the level of potential development as determined through problem-solving under adult guidance, or in collaboration with more capable peers (Vygotsky, 1978, p. 86)


This definition taken out of context misses two important things about the ZPD. First, a prevailing idea is that the intelligence of a child can be measured by looking at the current developmental state of the child (e.g., using an IQ test or SAT). The ZPD stands in contrast to the current developmental state of the child, by claiming that different children who are at the same current state, might be capable of developing various mental functions more or less rapidly, and an educator should take advantage of that. After providing the above definition, Vygotsky (1978) elaborates:


The zone of proximal development defines those functions that have not yet matured but are in the process of maturation, functions that will mature tomorrow but are currently in an embryonic state. These functions could be termed the “buds” or “flowers” of development rather than the “fruits” of development. (p. 86).


Vygotsky is thus trying to create a shift in how we think of development, from thinking about it “retrospectively” (after the fact) to thinking about it “prospectively” (before it occurs).


Second, some researchers have noted that more accurate translations do not suggest that the ZPD can be developed solely via learning from peers (Gredler, 2012). This misconception has led many to claim Vygotsky as a proponent of group and peer learning activities. According to Gredler (2012), Vygotsky claimed that children need adults to foster their development and conversations among groups of children “cannot serve as a source of any significant development for them” (Vygotsky, 1934/1994 cited in Gredler, 2012). Gredler (2012) also discusses other ways in which the ZPD has been misunderstood. Nonetheless, it is important to acknowledge that regardless of how Vygotsky initially intended it, the ZPD as understood by educators and other researchers has been incredibly influential in how people understand learning and socio-cultural theories.


Many now claim that Vygotsky was a social constructivist, in contrast to Piaget, whose constructivism was still limited to how knowledge construction happens in an individual’s head. However, Vygotsky did not use this terminology himself. Nonetheless, social constructivism has become a popular learning theory that combines aspects of Piaget’s constructivism and socio-cultural theories, like those of Vygotsky and situativism. Indeed, Piaget’s student, Seymour Papert, the pioneer of constructionism, was adamant about the situated and embodied nature of cognition and learning. Even prior to the widespread emergence of these theories, Papert (1980) was arguing that objects like gears (Papert’s childhood obsession) or the Turtle in the Logo programming language can be used to learn mathematics, by allowing children to use their bodies in imagining themselves as those objects:


The gear can be used to illustrate many powerful “advanced” mathematical ideas, such as groups or relative motion. But it does more than this. As well as connecting with the formal knowledge of mathematics, it also connects with the “body knowledge,” the sensorimotor schemata of a child. You can be the gear, you can understand how it turns by projecting yourself into its place and turning with it. It is this double relationship—both abstract and sensory—that gives the gear the power to carry powerful mathematics into the mind (p. viii).


Many researchers and educators have accepted some mix of constructivism and socio-cultural theories, but might put more or less emphasis on one or the other depending on how important the social nature of learning or the impact of culture and context is to their work.








Lev Vygotsky is probably the most influential pioneer of socio-cultural theories in psychology.


Michael Cole helped introduce and popularize Vygotsky in the US through cultural-historical activity theory.




zone of proximal development (ZPD) is probably Vygotsky's most famous (but often misconstrued) concept in education

Case Studies

We have described four broad learning theories, which as mentioned at the outset amount to different worldviews on learning. However, to truly grasp these theories in a practically useful way, it is important to understand the predictions that they might make about learning in specific contexts. To do so, we will compare and contrast the predictions these theories might make in describing two phenomena. The first is a broad phenomenon: the acquisition of language. The second is the analysis of a specific type of arithmetic “misconception” or “bug” that children have.

Language Acquisition

How children come to acquire language has been a focal area of interest to some of the pioneers of the aforementioned theories. Take into consideration the following books that are considered hallmarks in the study of language acquisition, written by names that should look familiar:


  • Jean Piaget’s Language and Thought of the Child (1923)

  • Lev Vygotsky’s Thought and Language (1934)

  • B. F. Skinner’s Verbal Behavior (1957)

  • Noam Chomsky’s Syntactic Structures (1957), one of over 30 books Chomsky has written on linguistics


These authors’ theories collided and resulted in debates. Vygotsky argued against some aspects of Piaget’s account of language in Thought and Language. Interestingly enough, Piaget did not know about Vygotsky’s work until 1962, when Vygotsky’s book was first published in English, at which point he wrote a reply agreeing with many of Vygotsky’s points, but disagreeing with others (Piaget, 1962/1995). As mentioned above, Chomsky wrote an influential critique of Skinner’s Verbal Behavior, which resulted in a decline in behaviorism. In 1975 (over 50 years after Piaget’s first book on language!), Chomsky engaged in a public debate with Piaget on the nature of “language and learning” (Marras, 1983; Piattelli-Palmarini, 1980). Finally, Roger Schank, a constructivist-leaning AI researcher who later founded the field of the learning sciences, attacked Chomsky’s position on language in academic articles, where he suggested language acquisition should focus on semantics, the meaning of words, rather than syntax, the grammatical structure, which was Chomsky’s focus (Brockman, 1996).


Here, I will only give the “headlines” of how these theories differed in their views on language acquisition; as such, I am going to gloss over the details of these debates, which might matter a lot to the theorists, but do not matter so much for our purposes.


To a behaviorist like Skinner, language acquisition is the process of developing appropriate verbal responses to stimuli offered by a “verbal community” (Maria de Lourdes, 2012). Speaking and writing are examples of verbal behaviors, which have been reinforced probably through some long, complex chain of stimulus-response pairs (possibly dating back to childhood). Verbal behaviors do not necessarily arise naturally, but rather arise due to reinforcement by members of the verbal community (e.g., parents, peers, teachers). For example, Skinner (1965) describes how parents reinforce certain verbal behaviors by providing positive reinforcement before those behaviors actually mean anything to babies:


Parents teach a baby to talk by reinforcing its first efforts with approval and affection, but these are not natural consequences of speech. The baby learns to say “mama,” “dada,” “spoon,” or “cup” months before he ever calls to his father or mother or identifies them to a passing stranger or asks for a spoon or cup or reports their presence to someone who cannot see them. The contrived reinforcement shapes the topography of verbal behavior long before that behavior can produce its normal consequences in a verbal community.


Similarly, babies or kids might repeat certain words simply because it evokes a certain reaction in adults (whether laughter or anger). If someone (whether child or adult) makes a grammatical or linguistic mistake, behaviorists would likely explain it as not having had the proper contingencies of reinforcement in their environment to evoke the proper use of language in that instance.


Recall that cognitivists view the mind as a computer that has a certain “pre-built” cognitive architecture. Learning does not result in drastic changes in the structure of the mind, but rather the addition of various pieces of information stored in memory or production systems. Thus, it makes sense that one component of this cognitive architecture should explain how people acquire and manipulate language. The cognitivist position on language is probably best represented in the work of Chomsky. He argued that language was essentially innate, whereby all children are born with a certain “language acquisition device (LAD),” which makes it easy to pick up language. While different languages have different grammatical rules, there are certain universal meta-rules that govern all languages, which Chomsky termed “universal grammar.” Thus, all people (regardless of cultural context) can quickly learn the grammatical rules of their native language simply by fine-tuning the rules of this universal grammar. In this sense, much of language is acquired before a child is even born! Just as a computer can be pre-built to understand various programming languages, so too is the human. Chomsky argued against Skinner by suggesting there is a “poverty of the stimulus,” meaning there are simply not enough verbal stimuli in a baby’s environment to reinforce language as quickly as it happens. More recently, researchers have provided evidence challenging the poverty of the stimulus argument (Pullum & Scholz, 2002; Lappin & Shieber, 2007).


The constructivist position on language is naturally that language (as with everything else) is constructed over time. According to Piaget, children are not born with innate cognitive structures, but rather they have innate cognitive functions (like assimilation and accommodation) that enable them to construct cognitive structures (e.g., schemes) over time (Marras, 1983). Constructivists, like behaviorists, would deny that there is a poverty of the stimulus. There are enough stimuli in the environment to construct the schemes needed to generalize language beyond the specific utterances that a child explicitly heard or read. If a person makes a linguistic mistake, it may be because the linguistic schemes that they developed were not correct or applicable in that situation, but they might still be useful in many ways and simply need to be modified.


Socio-cultural theorists naturally view language as an essentially social and cultural activity. Language does not exist in a vacuum; in fact, it is determined by the cultural and historical context of a given community. As such, when learning a language, we are becoming enculturated into the practices of a given community. In a sense, this is remarkably similar to Skinner’s account of language (whereby a person learns to respond in a way that is acceptable to a verbal community). However, while behaviorists put emphasis on the reinforcement of behavior, socio-cultural theorists put emphasis on the cultural negotiation of language. If a person uses language in a seemingly “incorrect” way, it may not be incorrect at all, just not appropriate in that given socio-cultural context or to an outside observer. For example, one can come up with a seemingly incorrect, but highly efficient and personalized language to communicate with one’s siblings. Someone who grew up speaking African American Vernacular English may need to learn to speak standard English in academic and professional settings, as well as learn code-switching to move from one dialect to another depending on who they are talking to. An English novelist who has just opened a Twitter account will need to learn the norms of how to communicate effectively in that community.

Arithmetic

One more example will hopefully illustrate the different ways these theories might look at the same phenomenon. This example is borrowed from Cobb (1990), who contrasts a cognitive and a constructivist explanation of the same common “mistake” that young children apparently make when faced with addition problems. Several researchers noted that when faced with addition problems, young children will often simply answer with the last number plus one (e.g., 3 + 3 = 4 or 2 + 5 = 6). Siegler and Shrager (1984) “interpreted answers of this kind by first noting that most children know the counting-string up to ten long before they begin to add and subtract” (Cobb, 1990, p. 78). Thus, they supposed that the 4- to 5-year-old children they studied apply a counting-like rule where they respond with the last addend plus one. This is a cognitivist explanation of the phenomenon, namely that children learn rules for solving arithmetic problems, and they simply acquired the wrong rule, perhaps by mixing it up with a more familiar rule for counting. Notice that a behaviorist explanation would likely be similar: these children must not have been properly reinforced on addition problems, so they are probably giving the response to the closest stimulus that they have been reinforced on (e.g., based on counting, 3 should evoke 4, 4 should evoke 5, etc.).
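
To make the cognitivist reading concrete, the "wrong rule" can be written out as a tiny procedure and compared with ordinary addition. This is only an illustrative sketch: the function names `counting_like_rule` and `addition_rule` are hypothetical and do not come from Siegler and Shrager (1984) or Cobb (1990), and the code is not a model of how children actually compute answers.

```python
def counting_like_rule(first_addend: int, second_addend: int) -> int:
    """Hypothetical 'wrong rule': ignore the first addend and
    answer with the number that follows the last addend when counting."""
    return second_addend + 1


def addition_rule(first_addend: int, second_addend: int) -> int:
    """The intended rule: combine both addends."""
    return first_addend + second_addend


# The two problems mentioned in the text above.
problems = [(3, 3), (2, 5)]

for a, b in problems:
    print(
        f"{a} + {b}: counting-like rule gives {counting_like_rule(a, b)}, "
        f"addition gives {addition_rule(a, b)}"
    )
```

On this account, the child's answers (4 and 6) are exactly what the counting-like rule produces, which is why the cognitivist explanation treats the mistake as a correctly executed but misapplied rule.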


Neuman (1987) offered a constructivist explanation of this phenomenon that is markedly different. She found that many 7-year-old children tend to give “names” to various mathematical objects. So when saying a number, they may not be referring to the actual numerical value, but rather using it as a name (similar to how we use variables in algebra). So perhaps if asked “apple + orange = ?”, the child may choose the name “pineapple,” not because they necessarily think that a pineapple is a hybrid of an apple and an orange, but because it’s a familiar name that they can also use to refer to this new object (i.e., apple + orange). Notice that this is an entirely different account of how children think about such problems. As Cobb (1990) states, the constructivist interpretation focuses on the mathematical experience of the child (in which they make meaning of the symbols around them), while the cognitivist view focuses on procedural rules that children use when faced with certain stimuli.


Cobb (1990) did not offer a socio-cultural interpretation of this event, but we might imagine that a socio-cultural theorist or situativist would emphasize the possibility that the child actually does know how to do addition, but only displays this ability in contexts where such a skill is needed. This would have to be tested, of course, and may not be true, but a situativist might highlight that just because the child cannot add symbols together does not mean they do not understand the concept behind addition. This does not explain why the child answered the way they did, but it suggests that the child’s answer to this question may not reflect what the child actually understands.


Notice that in some cases two different learning theories might actually result in similar explanations for the same phenomenon, but even when they do, they might emphasize different aspects of the explanation. Interestingly, which learning theories coincide will depend on what is being explained. In the case of language acquisition, Chomsky’s cognitivist position stood in stark contrast to the behaviorist position. The behaviorist, constructivist, and socio-cultural views of language acquisition are all similar in the sense that they hold that the child can gradually learn language from encountering enough examples of spoken utterances. However, in explaining the arithmetic example, we suggested that cognitivists and behaviorists would basically make the same predictions, possibly in stark contrast to the constructivist and socio-cultural interpretations.


Descriptive vs. Prescriptive Theories

Phew, that was a lot of information, but you should now have some handle on the variety of ways that researchers have studied learning for over a century! At this point, another clarification is in order. We must distinguish between descriptive theories and prescriptive theories. A descriptive theory describes things the way they are. A prescriptive theory prescribes how things should be. A learning theory is descriptive, because it explains how people actually learn. A theory of instruction, on the other hand, is a prescriptive theory because it suggests how people should teach. Our primary focus above has been on learning theories. However, the importance of learning theories in the context of education is for the most part how much guidance they can offer on how to teach effectively (or how learners should best learn on their own). Many learning theorists were cognizant of the instructional implications of their work and contributed to developing prescriptive theories as well. Indeed, in each of the sections above, some statements were actually prescriptive suggestions. A few sentences from the sections above that point out the prescriptive implications of learning theories have been repeated below, with phrases that indicated the prescriptive nature of the statements shown in red:


This was early scientific evidence for the importance of giving learners immediate feedback, which has become an important “best practice” used in a lot of educational technology.


Skinner believed practically anything could and should be taught using the appropriate sequence of stimuli and reinforcements. This turned into the concept of programmed instruction, where a student would be “programmed” by answering a series of successive questions with immediate feedback.


According to cognitive load theory, one should design instructional materials to minimize extraneous cognitive load.


Constructivism has been used to advocate for the use of discovery learning or inquiry-based learning in the classroom, whereby the students have to discover concepts and procedures for themselves (under the guidance of the teacher and/or peers), rather than being directly instructed on what they need to know.


Either way, distributed cognition has implications for teaching, instructional design, and educational technology; broadly speaking, the environment, tools, and opportunities for collaborative learning can be designed in a way to take advantage of the way in which learners might distribute their cognition in the process of learning.


Many researchers and educators conflate learning theories with theories of instruction. For example, when people talk about constructivism, they are often actually talking about constructivist instruction, such as discovery learning. While different learning theories could have substantially different implications for instruction, it is important to note two things. First, proponents of the same theory might come up with different instructional implications from that theory. This is especially the case when a theory is not a very precise account of how people learn. For example, socio-cultural theories and constructivism could be interpreted as making a lot of different suggestions about how to teach, although they will generally result in learner-centered pedagogies that might involve learning from peers or mentors. Second, two learning theories that might seem very different can at times agree on pedagogical considerations. For example, behaviorism and cognitivism might seem quite different from one another, but advocates of both often agree that teaching should consist of breaking down a subject into component parts and directly teaching each of those in sequence to a student. To a behaviorist, teaching each of those parts might mean producing a behavioral change in a student, while to a cognitivist it might mean helping the student acquire the correct rules and store them in long-term memory. But at the end of the day, the methods of instruction might be quite similar, if not exactly the same.

Conclusion

At this point, it might be worth summarizing how each of the theories discussed above views learning in terms of simple “catchphrases.” By themselves, these catchphrases are oversimplifications, but hopefully they are useful when considered in the context of the discussion above, which outlines the nuances and historical evolution of these theories:


  • Behaviorism: Learning is change in behavior.

  • Cognitivism: Learning is the acquisition of knowledge.

  • Constructivism: Learning is the active construction of knowledge.

  • Situativism: Learning is becoming a participant in a community of practice.


Whole courses could be taught on learning theories. Each of these theories has a lot of nuance and is supported by many empirical studies. Nonetheless, I have tried to outline the key contours of each of these theories, the main variants of each theory, and what sets them apart from one another. As a reminder, we are looking at these theories as worldviews rather than precise psychological theories of how people learn. Educators will often adopt one or more of these worldviews without knowing the particularities of all the theories, or necessarily knowing the key researchers and terminologies behind them. Educational technologies are often created with one or more of these worldviews in mind. At times, a learning theory is very explicitly utilized in the creation of a technology; indeed, many of the researchers discussed in the past few pages were actively involved in creating technologies that incorporate their ideas on how people learn. At other times, a technologist might create an educational technology without knowing what learning theory they are espousing. But one cannot teach, study how people learn, or create a technology without having some notion of how people learn. By more concretely understanding how pioneering researchers have studied learning for over a century, the hope is that you will be better equipped to be a member of the educational ecosystem (whether as a teacher, a researcher, a technology developer, or even just as a lifelong learner) than if you were to create your own theories from scratch.


Acknowledgements

I would like to thank the many individuals who gave useful feedback at various stages of the formation of this primer. I chose not to list their names here in case they do not wish to be associated with the document and for fear that I would miss some names.

References

Brockman, J. (1996). Third culture: Beyond the scientific revolution. Simon and Schuster.


Clark, A., & Chalmers, D. (1998). The extended mind. Analysis, 58(1), 7-19.


Day, W. (1983). On the difference between radical and methodological behaviorism. Behaviorism, 11(1), 89-102.


Gallagher, S. (2008). Philosophical antecedents to situated cognition. In P. Robbins & M. Aydede (Eds.), Cambridge Handbook of Situated Cognition. Cambridge: Cambridge University Press.


Gardner, H. (1987). The mind's new science: A history of the cognitive revolution. Basic Books.


Graham, G. (2019). Behaviorism. In Edward N. Zalta (Ed.). The Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/archives/spr2019/entries/behaviorism/.


Gredler, M. E. (2012). Understanding Vygotsky for the classroom: Is it too late? Educational Psychology Review, 24(1), 113-131.


Johnson, M. (1989). Embodied knowledge. Curriculum Inquiry, 19(4), 361-377.


Klahr, D., Langley, P., & Neches, R. (Eds.) (1987). Production system models of learning and development. Cambridge, MA: MIT Press.


Lappin, S., & Shieber, S. M. (2007). Machine learning theory and practice as a source of insight into universal grammar. Journal of Linguistics, 43(2), 393-427.


Maria de Lourdes, R. D. F. (2012). BF Skinner: the writer and his definition of verbal behavior. The Behavior Analyst, 35(1), 115-126.


Marras, A. (1983). Reviewed work(s): Language and learning: The debate between Jean Piaget and Noam Chomsky. Canadian Journal of Philosophy, 13(2), 277-291.


Miller, G. A. (2003). The cognitive revolution: a historical perspective. Trends in Cognitive Sciences, 7(3), 141-144.


Nathan, M. J. (2012). Rethinking formalisms in formal education. Educational Psychologist, 47(2), 125-148.


Newell, A. (1973). Production systems: Models of control structures. In Visual Information Processing (pp. 463-526). Academic Press.


Papert, S. (1988). A critique of technocentrism in thinking about the school of the future. In Children in the Information Age: Opportunities for Creativity, Innovation and New Activities. (pp. 3-18). Pergamon.


Papert, S. (1999). Papert on Piaget. Time Magazine’s special issue on The Century’s Greatest Minds, 105.


Pavlov, I. (1904). Nobel Lecture. Nobel Media AB 2020. https://www.nobelprize.org/prizes/medicine/1904/pavlov/lecture/


Pea, R. D. (1993). Practices of distributed intelligence and designs for education. Distributed cognitions: Psychological and educational considerations, 11, 47-87.


Piaget, J. (1962/1995). Commentary on Vygotsky's criticisms of Language and Thought of the Child and Judgment and Reasoning in the Child (L. Smith, Trans.). New Ideas in Psychology, 13, 325-340.


Piattelli-Palmarini, M. (Ed.). (1980). Language and learning: the debate between Jean Piaget and Noam Chomsky. Harvard University Press.


Pullum, G. K., & Scholz, B. C. (2002). Empirical assessment of stimulus poverty arguments. The Linguistic Review, 19(1-2), 9-50.


Resnick, L. B. (1987). The 1987 presidential address: Learning in school and out. Educational Researcher, 16(9), 13-54.


Scribner, S. (1984). Studying working intelligence. Everyday cognition: Its development in social context, 9-40.


Shapiro, L., & Spaulding, S. (2021). Embodied cognition. In Edward N. Zalta (Ed.). The Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/archives/fall2021/entries/embodied-cognition/.


Shapiro, L., & Stolz, S. A. (2019). Embodied cognition and its significance for education. Theory and Research in Education, 17(1), 19-39.


Simon, H. A. (1980). Cognitive science: The newest science of the artificial. Cognitive Science, 4(1), 33-46.


Skinner, B. F. (1962). Two “synthetic social relations”. Journal of the Experimental Analysis of Behavior, 5(4), 531.


Skinner, B. F. (1963). Behaviorism at fifty. Science, 140(3570), 951-958.


Smith III, J. P., diSessa, A. A., & Roschelle, J. (1994). Misconceptions reconceived: A constructivist analysis of knowledge in transition. The Journal of the Learning Sciences, 3(2), 115-163.


Suchman, L. A. (1987). Plans and situated actions: The problem of human-machine communication. Cambridge University Press.


Skinner, B. F. (1965). Review lecture: The technology of teaching. Proceedings of the Royal Society of London. Series B. Biological Sciences, 162(989), 427-443.


Thorndike, E. L. (1898). Some experiments on animal intelligence. Science, 7(181), 818-824.


Thorndike, E. L. (1907). The elements of psychology. AG Seiler.


Thorndike, E. L. (1927). The law of effect. The American Journal of Psychology, 39(1/4), 212-222.


Vera, A. H., & Simon, H. A. (1993). Situated action: A symbolic interpretation. Cognitive Science, 17(1), 7-48.


von Glasersfeld, E. (1990). Chapter 2: An Exposition of Constructivism: Why Some Like It Radical. Journal for Research in Mathematics Education. Monograph, 4, 19-210.


von Glasersfeld, E. (1995). Radical constructivism: A way of knowing and learning. The Falmer Press.


von Glasersfeld, E. (2001). The radical constructivist view of science. Foundations of Science, 6(1), 31-43.


Vygotsky, L. S. (1980). Mind in society: The development of higher psychological processes. Harvard University Press.


Vygotsky, L. S. (1994). The development of academic concepts in school-aged children. In R. van der Veer & J. Valsiner (Eds.), The Vygotsky Reader (pp. 355–370). Cambridge: Blackwell.

Watson, J. B. (1913). Psychology as the behaviorist views it. Psychological Review, 20(2), 158.

Wenger, E. (1990). Toward a theory of cultural transparency: Elements of a social discourse of the visible and the invisible.


Winograd, T., & Flores, F. (1986). Understanding computers and cognition: A new foundation for design. Intellect Books.