Research Projects

According to the computational view, the main purpose of the human nervous system is to process sensory information (input) in a way that allows it to choose a motor response (output). The optimal response is the one that maximizes the organism's expected reward or utility, and thus its fitness in the environment.

My research examines the computational properties of this process by using ideas from machine learning to build optimal models of inference (Bayesian) and decision making (reinforcement learning). We test these models using psychophysics, fMRI and pharmacology.

Bayesian inference

Perception requires converting noisy and uncertain sensory information into a form that allows the organism to interact with its environment. Given this uncertainty, Bayesian statistics provides an observer with the optimal way to estimate the properties of perceptual stimuli.
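In its simplest form (a textbook statement rather than anything specific to these studies), an observer holding a noisy measurement x of a stimulus s combines the likelihood of that measurement with prior knowledge via Bayes' rule,

    p(s \mid x) = \frac{p(x \mid s)\, p(s)}{p(x)},

and bases its estimate on the resulting posterior, e.g. its mean or its mode.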

In a realistic environment we experience a large number of stimuli from different modalities, without knowing the causal relationships among the sources giving rise to them. If we knew the causal structure, we could use it to increase the accuracy of our estimates, e.g. by combining information from stimuli known to originate from the same source.
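In the textbook case of a visual measurement x_V and an auditory measurement x_A known to come from a single source, each corrupted by independent Gaussian noise with variances \sigma_V^2 and \sigma_A^2, the optimal estimate of the source location is the reliability-weighted average

    \hat{s} = \frac{x_V / \sigma_V^2 + x_A / \sigma_A^2}{1 / \sigma_V^2 + 1 / \sigma_A^2},

which is more reliable than either measurement on its own.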

We have previously developed a Bayesian model that infers the underlying causal structure of a set of stimuli, providing an optimal basis for forming perceptual decisions. We tested whether subjects, when judging the locations of audio-visual stimuli, were implicitly inferring the causal structure giving rise to those stimuli, and found their performance to be very close to that of such an optimal model (Beierholm et al. 2008; Koerding, Beierholm et al. 2007).
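The core computation of such a model can be sketched in a few lines of Python. The fragment below evaluates, for a single audio-visual trial, the posterior probability that the two measurements share a common cause, assuming Gaussian sensory noise and a zero-mean Gaussian spatial prior; all parameter values are illustrative stand-ins, not the values fitted in the papers.

    import numpy as np

    def posterior_common_cause(x_v, x_a, sigma_v=2.0, sigma_a=8.0,
                               sigma_p=15.0, p_common=0.5):
        # Likelihood of both measurements under one shared source,
        # with the source location integrated out under the prior.
        var_c1 = (sigma_v**2 * sigma_a**2 + sigma_v**2 * sigma_p**2
                  + sigma_a**2 * sigma_p**2)
        like_c1 = (np.exp(-0.5 * ((x_v - x_a)**2 * sigma_p**2
                                  + x_v**2 * sigma_a**2
                                  + x_a**2 * sigma_v**2) / var_c1)
                   / (2 * np.pi * np.sqrt(var_c1)))
        # Likelihood under two independent sources.
        var_v = sigma_v**2 + sigma_p**2
        var_a = sigma_a**2 + sigma_p**2
        like_c2 = (np.exp(-0.5 * x_v**2 / var_v) / np.sqrt(2 * np.pi * var_v)
                   * np.exp(-0.5 * x_a**2 / var_a) / np.sqrt(2 * np.pi * var_a))
        # Bayes' rule over the two candidate causal structures.
        joint_c1 = like_c1 * p_common
        return joint_c1 / (joint_c1 + like_c2 * (1 - p_common))

    print(posterior_common_cause(0.0, 2.0))   # nearby cues: common cause likely
    print(posterior_common_cause(0.0, 20.0))  # distant cues: probably two sources

A full model then weights its 'integrate' and 'segregate' location estimates by this posterior when producing a response.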

This surprising result implies that humans, even at a perceptual level, are constantly inferring the causal structure of their surroundings, a task previously thought to be a ‘high-level’ cognitive one. In further experiments we extended these ideas to cases of up to three stimuli (Wozny, Beierholm & Shams 2008), studied the utility function involved (Wozny, Beierholm & Shams 2010), and showed that the internal model that gives rise to subjects’ causal inference is robust and independent of the details of the stimuli (Beierholm et al. 2009).

Central to these studies is the idea that perception implicitly relies on an internal model of the world, i.e. a set of assumptions (e.g. priors) that can be used to infer the properties of the outside world (for a review, see Shams & Beierholm 2010).

Reinforcement learning

Optimal behaviour requires not only inferring the properties of the environment, but also deciding on an optimal policy for how to act given those properties. These ideas are encapsulated by reinforcement learning, a family of models that, depending on their complexity, can specify the optimal choice in a given situation. In a set of experiments we examined the use of such optimal (or near-optimal) models in human decision making.
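As a minimal illustration of the simplest member of this family (a generic sketch with made-up parameters, not a model fitted to our data), a prediction-error learner choosing between two options can be written as:

    import numpy as np

    rng = np.random.default_rng(0)
    alpha, beta = 0.1, 3.0                 # learning rate, choice sharpness
    reward_probs = np.array([0.8, 0.2])    # hypothetical two-armed bandit
    q = np.zeros(2)                        # learned value estimates

    for trial in range(200):
        # Softmax policy: pick actions in proportion to exponentiated value.
        p = np.exp(beta * q) / np.exp(beta * q).sum()
        a = rng.choice(2, p=p)
        r = float(rng.random() < reward_probs[a])
        # Prediction-error update: move the value toward the observed reward.
        q[a] += alpha * (r - q[a])

    print(q)  # value estimates approach the underlying reward probabilities

More complex members of the family add, for example, state representations or explicit models of the task's structure.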

One study looked at the hypothesized dual-systems model of decision making, in which multiple controllers (reinforcement learning models) jointly drive behaviour. We modeled a habitual and a cognitive system as, respectively, an associative reinforcement learner and a statistically efficient Bayesian inference algorithm. Human subjects were scanned in a 3T MRI scanner while performing a task explicitly designed to engage the two systems simultaneously.
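The contrast between the two controllers can be caricatured on a simple stream of rewards (purely illustrative; the actual task and models were more elaborate): the habitual system as a slow, error-driven delta rule, and the cognitive system as an exact Bayesian learner.

    import numpy as np

    rng = np.random.default_rng(1)
    rewards = (rng.random(100) < 0.7).astype(float)  # hypothetical 70% reward rate

    v_habit, alpha = 0.0, 0.1     # associative learner: delta-rule value
    a_beta, b_beta = 1.0, 1.0     # Bayesian learner: Beta(1, 1) prior

    for r in rewards:
        v_habit += alpha * (r - v_habit)   # incremental, error-driven update
        a_beta += r                        # exact posterior update:
        b_beta += 1.0 - r                  # Beta(a + hits, b + misses)

    print(v_habit, a_beta / (a_beta + b_beta))  # both track the reward rate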

Analysis of the fMRI data showed that BOLD activity in parts of ventromedial prefrontal cortex encoded the expected value from both of these systems, indicating that the systems were indeed active simultaneously. This provides clear neurobiological evidence for the dual-systems theory of decision making (Beierholm et al. 2011).

Another example of our work on reinforcement learning and decision making is the development of a ‘vigor theory’, which specifies that the optimal work rate of an organism is set by the reward rate. If the reward rate is high, sloth is costly in terms of lost opportunity, whereas a low reward rate warrants less motivation and vigor, as the opportunity cost is lower. Because of the links with the reinforcement learning prediction error, it has furthermore been suggested that such a ‘vigor signal’ could be encoded in the brain by tonic dopamine. We have provided strong evidence for these ideas through psychophysics (Guitart-Masip, Beierholm et al. 2011) and direct pharmacological manipulation of the dopaminergic system (Beierholm et al. 2013).
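To make the trade-off concrete (a standard average-reward formalization, not quoted from the papers above): if responding with latency \tau incurs an effort cost C_v / \tau while every second of delay forfeits the average reward rate \bar{R}, the optimal latency minimizes the sum of the two costs,

    \tau^{*} = \arg\min_{\tau} \left( \frac{C_v}{\tau} + \bar{R}\,\tau \right) = \sqrt{C_v / \bar{R}},

so a higher reward rate directly prescribes faster, more vigorous responding.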

In a different experiment we developed a task that required optimal inference over complex stimulus features and optimal choices combining the inferred properties of the stimuli. We found that humans were able to integrate over the uncertain features, and, using fMRI, that activity in ventromedial prefrontal cortex correlated with the expected reward from an optimal model. Furthermore, the better the model explained the variance in a subject's behavioral data, the better it explained the variance in the BOLD activity in this same area, linking behavioral variability across subjects to brain activation (Wunderlich, Beierholm et al. 2011). This shows that humans are indeed capable of optimal integration in a reward learning task, combining concepts from both Bayesian inference and reinforcement learning.
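Schematically, and assuming a discrete feature s for simplicity, the two frameworks meet in a single computation: the observer forms a posterior p(s \mid x) over the uncertain features and then chooses the action with the highest posterior-expected reward,

    \mathbb{E}[R \mid x, a] = \sum_{s} p(s \mid x)\, R(s, a), \qquad a^{*} = \arg\max_{a} \mathbb{E}[R \mid x, a].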