Psy 120.3 Lecture Nine November 22 2023
Learning
Learning is defined as the acquisition, from experience, of new knowledge,
skills or responses that results in a relatively permanent change in the state of
the learner.
All learning begins with association: sensory experience connects to emotions,
which lead to cognitive decisions (sometimes conscious, sometimes not).
Habituation: a general process in which repeated or prolonged exposure to a stimulus results in a gradual reduction in responding.
Kandel, 2006: Aplysia exhibits another form of learning, sensitization, which
occurs when presentation of a noxious stimulus leads to increased response to a later stimulus.
This leads us to Pavlov and classical conditioning. Classical conditioning occurs when a neutral stimulus produces a response after being paired with a stimulus that naturally produces a response.
US: unconditioned stimulus -produces a reliable, naturally occurring
reaction for an organism.
UR: unconditioned response -a reflexive reaction that is reliably
produced by an unconditioned stimulus. Puppies already drool at the sight of food.
CS: conditioned stimulus -a previously neutral stimulus that produces a
reliable response in an organism after being paired with a US.
CR: conditioned response -a reaction that resembles an unconditioned response but is produced by a CS.
Elements of Classical Conditioning
or, Memorize Fig. 7.2 'The Elements of Classical Conditioning' for the next exam.
Acquisition: the phase of classical conditioning when the CS and the US are presented together. After learning is established, the CS by itself will reliably elicit the CR.
Second-order conditioning: although money is not directly associated
with the thrill of a new sports car, it is directly associated with the CS that
results in gratifying outcomes. Such repeated exposure means that eventually money is desirable for its own sake.
Extinction : the gradual elimination of a learned response that occurs
when the CS is repeatedly presented without the US. Based on Long Term Potentiation.
Spontaneous Recovery is typically defined as the reemergence of conditioned responding to an extinguished conditioned stimulus (CS) with the passage of time since extinction. Based on Long Term Potentiation.
Generalization: the CR is elicited even though the stimulus is slightly different than the CS used during acquisition.
Discrimination: the capacity to distinguish between similar but distinct stimuli.
Inset: The Real World. Siegel's 2016 work with drug overdoses: why do experienced drug users die from overdoses in novel environments? How does Pavlovian conditioning apply to this situation?
Cognitive & Neural Elements of Classical Conditioning
Pavlov's dogs were sensitive to the fact that he was not a reliable indicator of the arrival of food.
Rescorla and Wagner (1972) were the first to theorize that classical conditioning occurs when the animal has learned to set up an expectation. This in turn leads to an array of behaviours associated with the presence of the CS.
Their model predicted that conditioning would be easier when the CS was an unfamiliar event. Classical conditioning incorporates a significant cognitive element.
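The expectation idea can be made concrete with a small simulation. The sketch below uses the standard Rescorla-Wagner update, in which the change in associative strength on each trial is proportional to the prediction error (the gap between what the US supports and what the CS already predicts). The lecture does not give parameter values, so the learning rate, the number of trials, and the names used here are illustrative assumptions only.

```python
def rescorla_wagner(trials, learning_rate=0.3, v_start=0.0):
    """Return the associative strength V of the CS after each trial.

    trials: sequence of floats giving the maximum strength the US supports
            on each trial (1.0 when the US follows the CS, 0.0 when the CS
            is presented alone).
    learning_rate: stands in for the product of the CS and US salience
            terms; 0.3 is an assumed, illustrative value.
    """
    v = v_start
    history = []
    for lam in trials:
        prediction_error = lam - v           # surprise: outcome minus expectation
        v += learning_rate * prediction_error
        history.append(v)
    return history


# Ten CS+US pairings (acquisition) followed by ten CS-alone trials (extinction).
acquisition = [1.0] * 10
extinction = [0.0] * 10
curve = rescorla_wagner(acquisition + extinction)

for trial, v in enumerate(curve, start=1):
    print(f"trial {trial:2d}: V = {v:.3f}")
```

In this sketch learning is fastest when the US is most surprising (early acquisition trials) and slows as the CS comes to predict the US; presenting the CS alone then drives the expectation back toward zero.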
Studies of classical conditioning in humans indicate that conditioning
can occur without conscious awareness of the relationship between the CS and the US.
Thompson (2005) demonstrated that the cerebellum is critical for the occurrence of eyeblink (classical) conditioning.
The central nucleus of the amygdala is critical for fear
conditioning, such as biological freezing (LeDoux et al., 1988).
Evolution & Classical Conditioning
Behaviours that are adaptive allow an organism to survive and thrive in its environment.
Garcia & Koelling, 1966 used a variety of CSs, paired with a US that caused nausea and vomiting in rats hours later. They found weak or no conditioning when the CS was visual, auditory, or tactile, but strong food aversion when the CS was a taste or smell. There is, however, an error in the text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3622671/ explains why rats cannot vomit.
Evolution has provided species with biological preparedness, a propensity for learning particular kinds of associations but not others. For example, birds depend primarily on visual cues for finding food and are relatively insensitive to taste and smell.
It is relatively easy to produce a food aversion in birds using an unfamiliar stimulus as the CS, such as brightly coloured food. This difference (smell for mammals, vision for birds) has its roots in the
Triassic period, about 220 million years ago.
Mammals: https://news.stanford.edu/2017/04/20/genetic-evidence-points-nocturnal-early-mammals/
Birds: https://www.earthsciencefrontiers.net.cn/EN/Y2020/V27/I4/294
Operant Conditioning
Operant conditioning: a type of learning in which the consequences of an organism's behaviour determine the likelihood of that behaviour being repeated in the future.
Fig. 7.6 Thorndike's puzzle box yields the Law of Effect and is an example of such instrumental behaviours.
At first, the cat enacts any number of likely but unsuccessful behaviours, but only one leads to food and freedom. Over time, the unsuccessful behaviours become less frequent, and the one instrumental behaviour becomes more so. Thorndike's work resonated with most behaviourists at the time: it was still observable, quantifiable, and free from explanations involving the mind.
B.F. Skinner redefined this as operant behaviour: behaviour that an organism
produces that has some impact on the environment. Most organisms actively engage the environment to reap rewards.
Reward OR Punishment
Reinforcer: any stimulus or event that functions to increase the likelihood of the behaviour that led to it.
Punishment: any stimulus or event that functions to decrease the likelihood of the behaviour that led to it.
Reinforcement increases the likelihood of behaviour
Punishment decreases the likelihood of behaviour
Stimulus is presented
Positive reinforcement: Parents buy a teen a new car as a reward for safe driving.
Positive punishment: Parents assign difficult new chores after teen is stopped for speeding.
Stimulus is removed
Negative reinforcement: Parents reduce restrictions on where a teen can drive, as a reward for safe driving.
Negative punishment: Parents suspend driving privileges after teen is stopped for speeding.
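The two-by-two grid above depends on only two questions: was a stimulus presented or removed, and did the behaviour become more or less likely? A minimal sketch of that decision rule follows; the function name and the shortened example descriptions are illustrative, not from the text.

```python
def classify_consequence(stimulus_presented: bool, behaviour_increases: bool) -> str:
    """Name the operant procedure from the two defining questions."""
    # "Positive" means a stimulus is added; "negative" means one is taken away.
    kind = "positive" if stimulus_presented else "negative"
    # Reinforcement raises the likelihood of the behaviour; punishment lowers it.
    effect = "reinforcement" if behaviour_increases else "punishment"
    return f"{kind} {effect}"


examples = [
    ("new car after safe driving", True, True),
    ("difficult new chores after speeding", True, False),
    ("fewer driving restrictions after safe driving", False, True),
    ("driving privileges suspended after speeding", False, False),
]
for description, presented, increases in examples:
    print(f"{description}: {classify_consequence(presented, increases)}")
```

Note that "positive" and "negative" here describe adding or removing a stimulus, not whether the outcome is pleasant.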
Reinforcement is generally more effective than punishment in promoting learning. Why?
Punishment signals that an unacceptable behaviour has occurred, but does not specify what should be done instead. For study purposes, think R vs. P; they are essentially two separate systems of operant conditioning.
Reinforcers & Punishers
Primary reinforcers satisfy biological needs.
Secondary reinforcers derive their effectiveness from associations with primary reinforcers. E.g., bitcoin is a neutral CS, until paired with food or shelter.
A key determinant in the effectiveness of a reinforcer is the amount of time between the occurrence of the behaviour and the reinforcer. The more time that elapses, the less effective.
The greater potency of immediate versus delayed reinforcement makes it difficult to quit smoking or to lose weight. Consider the opposite: the longer the delay between a behaviour and the administration of a punishment, the less effective the punishment is in suppressing the target behaviour. Finally, punishment can be turned into reward: Vor v Zakone. From Wikipedia: A "thief in law" (or thief with code, Russian: вор в зако́не, romanized: vor v zakone) in the Soviet Union, the post-Soviet states, and their respective diasporas is a formal and special status of "criminal authority", a professional criminal who follows certain criminal traditions and enjoys an elite position.
Fitness people often speak proudly of 'punishing training' in the gym, usually when preparing for an event.
That is an example of positive punishment.
Extinction occurs in both classical and operant conditioning. The response rate drops off fairly rapidly, and if a rest period is provided, spontaneous recovery is typically seen. Again, this points to long term potentiation, and neural structures.
There is, however, an important difference. In classical conditioning the US occurs on every trial, whereas in operant conditioning, the reinforcements only occur when the proper response is made, and not always even then.
Reinforcement Schedules
Fixed Interval : Christmas
Variable Interval : Start Up / Semester (best indicator of life success)
Fixed Ratio : Piecework
Variable Ratio : Gambling
Variable interval schedules typically produce steady, consistent responding because the time until the next reinforcement is less predictable. Although a semester system seems like a fixed interval, it is not, as the problems encountered during a semester have less predictable reinforcements. So, I don't agree with the text here. This schedule is what produces billionaires.
Intermittent reinforcement : only some of the responses made are followed by reinforcement. The more irregular and intermittent a schedule is, the more difficult it becomes for an organism to detect when the schedule is actually about to become extinct.
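The four schedules differ only in the rule that decides whether a given response is reinforced. The sketch below is one way to express those rules; the schedule sizes (5 responses, 10 time units), the one-response-per-time-unit learner, and the uniform random waits are all illustrative assumptions rather than anything specified in the text.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response (e.g. piecework)."""
    count = 0

    def respond(_time):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce each response with probability 1/mean_n (e.g. gambling)."""

    def respond(_time):
        return rng.random() < 1.0 / mean_n

    return respond


def fixed_interval(interval):
    """Reinforce the first response made at least `interval` after the last reward."""
    last = 0.0

    def respond(time):
        nonlocal last
        if time - last >= interval:
            last = time
            return True
        return False

    return respond


def variable_interval(mean_interval, rng):
    """Like fixed_interval, but the required wait is redrawn after each reward."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_interval)

    def respond(time):
        nonlocal last, wait
        if time - last >= wait:
            last = time
            wait = rng.uniform(0, 2 * mean_interval)
            return True
        return False

    return respond


rng = random.Random(0)  # fixed seed so repeated runs give the same result
schedules = {
    "fixed ratio (every 5th response)": fixed_ratio(5),
    "variable ratio (mean 5 responses)": variable_ratio(5, rng),
    "fixed interval (every 10 time units)": fixed_interval(10),
    "variable interval (mean 10 time units)": variable_interval(10, rng),
}
for name, schedule in schedules.items():
    # The simulated learner responds once per time unit for 100 time units.
    rewards = sum(schedule(t) for t in range(1, 101))
    print(f"{name}: {rewards} reinforcements in 100 responses")
```

On the ratio schedules, responding more often earns more reinforcement; on the interval schedules, extra responses before the wait has elapsed earn nothing, and under the variable versions the learner cannot tell from any single unrewarded response whether reinforcement has stopped.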
Shaping by Successive Approximations
Shaping: the reinforcement of successive steps to a final desired
behaviour. The outcome of one set of behaviours shapes the next set of behaviours.
A small reward is given for each behaviour that is approximating the
final goal. Note that there is no element of punishment.
Any behaviour that is accidentally but successfully reinforced will be
repeated, and this can result in idiosyncratic, superstitious behaviours.
Accidental relationships can therefore appear to have a cause-and-effect
chain.
Bloom et al., 2007 reported that reinforcing human adults or children using schedules in which reinforcement is not contingent on their responses can produce seemingly superstitious behaviour. Most human superstitions can be attributed to this kind of scheduling. In the philosophy of science, it is called a 'post hoc' fallacy (from the Latin: 'post hoc, ergo propter hoc'), or 'after this, therefore because of this'.
Cognitive Elements of Operant Conditioning
Tolman proposed that animals established a means-end relationship. Conditioning experience produced knowledge: a specific reward (end-state) will appear if a specific response (the means) is made. The stimulus does not directly evoke a response; rather, it establishes an internal cognitive state which then produces the behaviour.
He gave three groups of rats access to a complex maze every day over a span of 17 days. The control group never received any reinforcement for navigating the maze. The second group received regular reinforcements of food. This group showed clear learning. The third group was treated like the control group for the first ten days, and like the second group for the last seven days. For the first ten days, they acted like the control group, but for the last seven days, like the second group. Even though they had received no reinforcement during the first ten days, they had been learning the maze all along: they exhibited latent learning. Latent learning: something is learned, but is not manifested in a behaviour change until sometime in the future.
Cognitive map: a mental representation of the physical features of the environment. Fig. 7.11
The rats had formed a cognitive map of their environment and knew where they needed to end up spatially, compared to where they began. Strict behaviourism theorized that the rats would have backtracked, and tried the next entrance on the other side of their first attempt.
Neural Elements of Operant Conditioning
Olds and Milner (1956) discovered that when placing electrodes in a rat's brain, and allowing the rat to stimulate itself, it would do so, ignoring food and water. The likely cause was the dopamine pathway to the nucleus accumbens.
During recent years, several competing hypotheses about the precise role of dopamine have emerged: (1) it is more closely linked with the expectation of reward (rather than reward itself); (2) it is more closely associated with wanting or craving something than simply liking it.
Drugs such as cocaine, amphetamines and opiates activate the dopamine pathway; but dopamine-blocking drugs dramatically diminish their reinforcing effects.
fMRI studies show activity in the nucleus accumbens in heterosexual men when looking at pictures of attractive women, and in individuals who believe they are about to receive money. These biological structures evolved to ensure that a species engages in activities that aid survival and reproduction.
Evolutionary Elements of Operant Conditioning
A complex T-maze simulates the natural environment. Like many other foraging species, rats placed in a complex T-maze show evidence of their evolutionary preparedness. These rats will systematically travel from arm to arm in search of food, never returning to the arms they have already visited.
Breland & Breland, 1961 reported that pigs are biologically predisposed to root out their food, just as raccoons are predisposed to wash their food. Trying to train either species to pick up a coin and drop it in a box had ironic consequences.
Observational Learning in Humans
Observational learning: a condition in which learning takes place by watching the actions of others. Even complex motor skills, such as surgery, are learned in part through extensive observation and imitation of models.
'Beating Up Bobo': (Bandura, 1961, 1963, 1977). The adult model purposely used novel behaviours so that the researchers could distinguish aggressive acts that were clearly the result of observational learning. When they saw adult models being punished for behaving aggressively, the children showed considerably less aggression. When they saw adult models being rewarded and praised (secondary reinforcement) for aggressive behaviour, they displayed an increase in aggression.
Diffusion chain: a process in which individuals initially learn a behaviour by observing another individual perform that behaviour, and then serve as a model from which other individuals learn the behaviour. Studies have shown that observational learning sometimes results in just as much learning as practising the task itself (Heyes & Foster, 2002).
Neural Elements in Observational Learning
Mirror Neuron System: neurons that fire when a primate performs an action, and that also fire when the primate watches another of its own or a similar species perform the same action. (Monkey see, monkey do.)
Fig. 7.17 Mirror neurons are thought to be represented in specific
sub-regions in the frontal and parietal lobes.
If appropriate neurons fire when another organism is performing an
action, it could indicate the awareness of intentionality, or that the animal is anticipating a likely course of future actions.
Observational learning also relies on the motor cortex. To examine whether observational learning depends on this area, TMS was applied just after participants observed performance of a reaching movement, causing a temporary disruption in that brain region.
Applying TMS to the motor cortex greatly reduced the amount of observational learning, whereas applying TMS to a control area outside the motor cortex had no effect on observational learning.
Implicit Learning: Cognitive & Neural
Implicit learning takes place largely independent of awareness of both the process and products of information acquisition. We are so attuned to linguistic, social, emotional, or sensorimotor events in our environment that internal representations are gradually built up without explicit awareness. Explicit learning becomes implicit over time.
Reber, 1967: Artificial grammar experiments: Participants gradually developed a vague, intuitive sense of the 'correctness' of particular letter groupings. They became quite good at this task, but were unable to demonstrate much in the way of explicit awareness of the rules and regularities they were using.
Implicit learning differs very little from person to person; is unrelated to I.Q.; infants are just as good at it as university students; it changes little across the lifespan. Implicit learning is remarkably resistant to various disorders that affect explicit learning.
Remember this section of the text for your next long essay: amnesic
patients show normal implicit memories, and display normal implicit learning (Knowlton, 1992). Sammy Jankis should have learned that the pyramid would give a shock.
Implicit & Explicit Learning: Neural Pathways
'Implicit & Explicit Learning Activate Different Brain Areas'. The hippocampus and nearby structures in the medial temporal lobe do not seem to be necessary for implicit learning.
Reber et al., 2002: Array of Stars. Participants who were given explicit instructions as to how to determine the underlying prototypical dot pattern showed increased activity in the prefrontal cortex, parietal cortex, and hippocampus, all associated with the processing of explicit memories. Those given the implicit instructions showed decreased activity in the occipital region (which is involved in visual processing). Distinct brain structures were recruited in different ways depending on explicit or implicit instructions.
Forkstam et al., 2006: Broca's area is turned on during artificial grammar learning. Activating Broca's area by applying electrical stimulation to the nearby scalp area enhances implicit learning.