The main features of operant conditioning, including types of reinforcement (positive and negative) and punishment (positive and negative) and properties of reinforcement (primary, secondary and schedules) including researching Skinner (1948) Superstition in the pigeon.
Operant conditioning is a type of learning in which a new voluntary behaviour is associated with a consequence - reinforcement makes the behaviour more likely to occur, while punishment makes it less likely to occur. Voluntary behaviours are actions that can be controlled by the organism, such as running, writing an essay or skydiving.
The term “operant conditioning” was coined by BF Skinner, but follows the “law of effect” that was first stated by Edward Thorndike:
"Responses that produce a satisfying effect in a particular situation become more likely to occur again in that situation, and responses that produce a discomforting effect become less likely to occur again in that situation."
To study operant conditioning in as scientific a way as possible, Skinner created an experimental tool called the Skinner box that allowed complete control of the organism’s environment, the behaviours that were available to it and the reinforcement or punishment it would receive. Skinner investigated how the type of reinforcement or punishment given and the rate of reinforcement or punishment affected the rate of learning.
In a typical experiment, a rat or pigeon would be put into the Skinner box, in which temperature, light and noise could be kept constant. On one wall of the box, there would be a lever and a hopper that could deliver a food pellet to the animal when the lever was pressed. Initially, the rat would be likely to wander around the box aimlessly until it accidentally pressed the lever and received a food pellet. Skinner would leave the animal in the box and measure how frequently it pressed the lever over time. The frequency should indicate the strength of the conditioning of the behaviour. This would then be repeated with other animals.
To explain the process of operant conditioning, you need to be aware of several terms:
Reinforcement: A consequence that makes a behaviour more likely to occur.
Positive reinforcement rewards the desired behaviour by adding something pleasant – food, affection, a compliment, money.
Negative reinforcement rewards the desired behaviour by removing something unpleasant – taking away pain or distress, stopping criticism, cancelling a fine.
There’s also primary reinforcement, which is when the reward is something we want naturally – a basic need such as food, warmth or affection. Secondary reinforcement is a reward we have learned to value – like money.
Punishment: A consequence that makes a behaviour less likely to occur. Punishment is when undesirable behaviour produces unpleasant consequences. Again, there is positive punishment, which punishes the undesirable behaviour by adding something unpleasant (a shock, a criticism, copying out lines), and negative punishment, which punishes by removing something pleasant (being 'grounded', deducting money, removing the Xbox).
Often, punishment combines both types: a detention involves adding something unpleasant (work) and taking away something pleasant (your break time).
It is also important to be aware of the difference between positive and negative consequences. Positive consequences involve giving something and negative involve taking something away.
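The two distinctions (something given or taken away, behaviour made more or less likely) combine into four types of consequence. As a rough sketch, the lookup below names the type from those two facts. The function name and the input labels are made up for illustration and are not standard terminology.

```python
def classify_consequence(stimulus_change, behaviour_change):
    """Name the type of consequence.

    stimulus_change: "added" or "removed" (hypothetical labels)
    behaviour_change: "more likely" or "less likely" (hypothetical labels)
    """
    # Positive means something is given; negative means something is taken away.
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement strengthens a behaviour; punishment weakens it.
    kind = {"more likely": "reinforcement", "less likely": "punishment"}[behaviour_change]
    return f"{sign} {kind}"


# A food pellet is added and lever pressing becomes more likely.
print(classify_consequence("added", "more likely"))
# An electric shock is removed and lever pressing becomes more likely.
print(classify_consequence("removed", "more likely"))
# Heat is removed and lever pressing becomes less likely.
print(classify_consequence("removed", "less likely"))
```

Note that "positive" and "negative" here say nothing about whether the consequence is pleasant: they only record whether something was given or taken away.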
A rat in a Skinner box that was given positive reinforcement might receive a food pellet every time it pressed a lever and should learn to press the lever more often. A rat in a Skinner box that was given negative reinforcement might have an electric shock turned off when it pressed a lever, and should also learn to press the lever more often. A rat in a Skinner box that had its heat turned off when it pressed the lever would be receiving negative punishment, and should learn to avoid the lever.
Your teachers or lecturers might be using positive reinforcement if they give you rewards for work. They would be using negative reinforcement if they say, “Unless you do your homework, you are in detention,” because doing the homework takes away the threat of detention. Operant conditioning is also present in computer games. The positive reinforcement you get from completing a level of a game can drive you on to try the next level, and might explain some addictions.
Operant conditioning also plays a secondary role in explaining phobias. Avoidance learning occurs when moving away from the source of the phobia provides negative reinforcement for the behaviour by reducing anxiety. This can help to maintain phobias once they have been learned.
A lot of Skinner’s research was about how often a reward needs to happen before behaviour is learned. He discovered four “schedules” of reinforcement that work.
Fixed interval: The reward turns up at a regular time. Desirable behaviour increases in the run-up to the reward. This happened with Skinner’s pigeons. It might happen with humans at work if there is a regular tea break or “casual Friday”. Learning is medium and extinction (learned behaviour fading) is medium.
Variable interval: The reward turns up but you can’t be sure exactly when. An example might be the audience applauding a performer or cheering an athlete. Desirable behaviour increases more slowly but stays at a steady rate. Learning is fast but extinction is slow.
Fixed ratio: The reward turns up once the desired behaviour has been carried out a set number of times. Skinner’s rats got a reward every time they pressed the lever (a ratio of one reward to one press). A human might get paid for every 100 products they build. If you don’t do the behaviour, you get nothing; if you work fast, you get a lot. Learning is fast and extinction is moderate.
Variable ratio: The reward is dispensed unpredictably, after a changing number of behaviours, such as feeding the rat after one lever-press, then after 5, then after 3. For humans, this might be like a slot machine because you don’t know how many times you’ll have to pay in before it pays out. Learning is fast and extinction is slow.
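The four schedules differ only in the rule that decides when a reward is released. The sketch below is a simplified illustration, not a model of Skinner's apparatus. The function names are invented, the variable schedules use a short repeating list instead of truly random values, and the interval schedule assumes the usual laboratory rule that the first response after the wait has passed earns the reward.

```python
from itertools import cycle


def ratio_schedule(required_counts, total_responses):
    """Return which responses (numbered from 1) earn a reward.

    required_counts is repeated in order: [100] behaves as a fixed ratio
    of 100, while [1, 5, 3] stands in for a variable ratio.
    """
    rewarded = []
    targets = cycle(required_counts)
    target = next(targets)
    since_last_reward = 0
    for response in range(1, total_responses + 1):
        since_last_reward += 1
        if since_last_reward == target:
            rewarded.append(response)
            since_last_reward = 0
            target = next(targets)
    return rewarded


def interval_schedule(waits, response_times):
    """Return the times of the responses that earn a reward.

    waits is repeated in order: [15] behaves as a fixed interval of 15
    seconds, while [10, 30, 20] stands in for a variable interval.
    """
    rewarded = []
    remaining_waits = cycle(waits)
    available_at = next(remaining_waits)
    for time in sorted(response_times):
        if time >= available_at:
            rewarded.append(time)
            available_at = time + next(remaining_waits)
    return rewarded


# Variable ratio from the text: reward after 1 press, then 5, then 3.
print(ratio_schedule([1, 5, 3], 9))   # [1, 6, 9]
# Fixed ratio: reward after every 3rd press.
print(ratio_schedule([3], 10))        # [3, 6, 9]
# Fixed interval of 15 seconds; responses before the wait is up earn nothing.
print(interval_schedule([15], [5, 16, 20, 31, 40, 47]))  # [16, 31, 47]
```

On the ratio schedules, responding faster earns rewards sooner. On the interval schedules, extra responses during the wait earn nothing.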
Skinner (1948) carried out a famous experiment called “Superstition in the Pigeon”. Eight pigeons were starved to make them hungry then put in a cage. Every 15 seconds, a food dispenser would swing into the cage for 5 seconds then swing out again. When the food was due to appear, the pigeons started showing strange behaviours, such as turning anticlockwise or making swaying motions.
Skinner concluded the pigeons were repeating whatever behaviour they had been in the middle of doing when the reinforcement was first offered to them. Because the food kept reappearing, this senseless behaviour was strengthened. This is like a human “superstition”, where people imagine that, by doing something senseless (knocking on wood, crossing their fingers), they can make something pleasant happen.
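Skinner's explanation can be illustrated with a toy simulation. This is only a sketch under invented assumptions, not Skinner's own analysis. It assumes each behaviour has a numerical "strength", that the pigeon picks a behaviour each second in proportion to its strength, and that whatever it happens to be doing when the food arrives is strengthened. The behaviour names, the strength values and the function name are all hypothetical.

```python
import random


def simulate_superstition(behaviours, seconds=600, food_every=15, boost=1.0, seed=0):
    """Toy model: food arrives on a timer, whatever the pigeon is doing."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    strength = {name: 1.0 for name in behaviours}  # all equal to begin with
    for second in range(1, seconds + 1):
        names = list(strength)
        weights = [strength[name] for name in names]
        # Stronger behaviours are more likely to be chosen.
        current = rng.choices(names, weights=weights)[0]
        if second % food_every == 0:
            # The food was not caused by the behaviour, but the behaviour
            # that was happening at that moment is reinforced anyway.
            strength[current] += boost
    return strength


result = simulate_superstition(["turning", "swaying", "pecking", "hopping"])
for name, value in sorted(result.items(), key=lambda item: -item[1]):
    print(name, value)
```

Because a behaviour that has been strengthened is more likely to be happening the next time food arrives, an early accidental pairing tends to snowball, which is the pattern Skinner described.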
Operant conditioning is a better explanation than classical conditioning as it focuses on learning new voluntary behaviours rather than just reflexive behaviour.
Ramesh et al. (2011) found that when positive and negative reinforcement were used with staff in a neonatal intensive care unit, the staff reduced noise levels, showing that operant conditioning does work on humans.
Researchers found that those who received more negative down votes on social media posted more comments in the future compared to commenters who received positive up votes. Operant conditioning cannot explain this, because the down votes should have acted as a punishment and made commenting less likely.
Vaughan et al. (2014) found that calves could learn to associate a stall with urinating through the use of rewards, showing that operant conditioning can teach calves a specific behaviour.
Vaughan et al. (2014) carried out their study on calves, so operant conditioning may not be as effective on humans as there are problems generalising the results.
Operant conditioning may not be a complete explanation of human behaviour as it ignores the influence of hormones on our behaviour, such as the link between testosterone and aggression.
Identify the secondary reinforcer used by the teacher. (1) January 2020
Describe the schedule of reinforcement the teacher is using with Georgia. (2) January 2020
Georgia’s mother decides to use positive punishment to make Georgia do her homework. Describe how positive punishment could be used to teach Georgia to do her homework. (2) January 2020
Andrija decides to use operant conditioning to encourage his son, Alexi, to make his bed. Explain one weakness of Andrija using operant conditioning to encourage Alexi to make his bed. (2) January 2018
Ore also likes to be outside playing cricket with his friends. His mother gives Ore his favourite food when he comes back home after playing cricket. Describe, using positive reinforcement, one reason why Ore likes to be outside playing cricket with his friends. (2) June 2018
Explain one weakness of Skinner's (1948) Superstition in the Pigeon study. (2) June 2019
Describe the following features of operant conditioning using an example from the context above. (a) Positive reinforcement (2) (b) Negative reinforcement. (2) June 2019
Describe the results and/or conclusion of Skinner's (1948) Superstition in the Pigeon study. (3) June 2019
Rina wants to encourage her three-year-old daughter Sangita to clean her teeth. Describe how Rina could encourage Sangita to clean her teeth using principles from operant conditioning. (4) January 2017
Andrija wants to train his cat to come into the house when he calls its name. Andrija decides to use operant conditioning to train his cat. Describe how Andrija could use operant conditioning to train his cat to come into the house when he calls its name. (4) January 2018
Compare classical conditioning and operant conditioning. (6) June 2017
Peter is 3 years old and his father is concerned about his safety near roads. He is worried Peter will be in an accident as he often runs into the road without looking to see if any vehicles are coming towards him. His father is trying to teach Peter to be more careful near roads. Discuss, using operant conditioning, how Peter’s father could teach him to be more careful near roads. You must refer to the context in your answer. (8) October 2017
To what extent does operant conditioning explain human behaviour? (12) January 2019