Recent advances in movement neuroscience suggest that sensorimotor control can be considered as a continuous decision-making process in complex environments in which uncertainty and task variability play a key role. Leading theories of motor control assume that the motor system learns probabilistic models and that motor behavior can be explained as the optimization of payoff or cost criteria under the expectation of these models. Here we discuss how the motor system exploits task variability to build up efficient models through structural learning and compare human behavior to Bayes-optimal models. We discuss to what extent structural learning and abstraction can be considered a consequence of bounded rationality and how bounded-rational models may explain such meta-learning behavior.
TBA
Understanding animal decision-making has been a fundamental problem in neuroscience. Many studies analyze actions that represent decision-making in behavioral tasks in which rewards are artificially designed with specific objectives. However, it is impossible to extend such artificially designed experiments to a natural environment, because in a natural environment the rewards for freely behaving animals cannot be clearly defined. To address this, we must reverse the current paradigm so that rewards are identified from behavioral data. In this study, we developed a new reverse-engineering method (inverse reinforcement learning) that can estimate a reward-based representation of the behavioral strategy from time-series behavioral data. As a particular target, we focused on thermotactic behavior in C. elegans. Using this method, we successfully decoded the thermotactic strategy, which comprised a mixture of two strategies. In the first, the worms use both the absolute temperature and its temporal derivative to efficiently reach a specific temperature. In the second, the worms track along a constant temperature. We further applied our method to starved worms and found that they avoid the temperature associated with starvation by using only the absolute temperature and not its temporal derivative. In this way, our method is able to clarify how the worms process thermosensory state in their thermotactic strategy. Thus, this study presents and validates a novel approach that should propel the development of new, more effective experiments to identify behavioral strategies and decision-making in animals.
Reference: Yamaguchi, Honda* (2018) PLoS Comp Biol 14(5): e1006122.
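As a rough illustration of identifying reward-related quantities from behavior, the sketch below compares observed state transitions with a passive baseline. It is not the estimator used in the study. It assumes, purely for illustration, an entropy-regularized setting in which the log-ratio between the observed transition probability and the passive probability reflects how desirable the next state is, up to a constant per starting state. The temperature bins, the transition counts, and the name `log_policy_ratio` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def log_policy_ratio(transitions, passive):
    """Return log(pi_hat(s'|s) / p(s'|s)) for every observed (s, s') pair.

    transitions: list of (state, next_state) pairs taken from behavior.
    passive: dict mapping state -> {next_state: baseline probability}.
    Positive values mark transitions chosen more often than the baseline.
    """
    counts = Counter(transitions)
    totals = defaultdict(int)
    for (s, _), n in counts.items():
        totals[s] += n
    ratios = {}
    for (s, s_next), n in counts.items():
        pi_hat = n / totals[s]  # empirical transition probability
        ratios[(s, s_next)] = math.log(pi_hat / passive[s][s_next])
    return ratios


# Hypothetical data: temperature bins 0 (cold), 1 (middle), 2 (warm).
# From the middle bin the animal moves to the warm bin most of the time.
observed = [(1, 2)] * 6 + [(1, 1)] * 3 + [(1, 0)] * 1
# Assumed passive dynamics: an unbiased random walk among the three bins.
passive = {1: {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}}

ratios = log_policy_ratio(observed, passive)
for pair in sorted(ratios):
    print(pair, round(ratios[pair], 3))
```

In this toy data set only the move toward the warm bin has a positive log-ratio, which is the kind of signature an inverse method would interpret as a preference. A state that also included the temporal derivative of temperature could be handled the same way by using (temperature, derivative) tuples as states.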
Neurophysiological studies on early visual cortices revealed that an initial feedforward sweep of neural response depends on stimulus features, whereas perceptual effects such as awareness and attention are represented as modulation of the late component (e.g., ~100 ms after stimulus onset). The delayed modulation is presumably mediated by feedback connections from higher brain regions. Psychophysical experiments on humans using visual masking or transcranial magnetic stimulation showed that selective disruption of the late component abolishes conscious experience of the stimulus. Here I provide a unified computational and statistical view on the modulation of sensory representation by internal dynamics in the brain, which provides a way to quantify the perceptual capacity of neural dynamics.
A key computation is gain modulation, which represents the integration of multiple signals by nonlinear devices (neurons). Gain modulation is ubiquitously observed in nervous systems as a mechanism to adapt neurons’ nonlinear response functions to stimulus distributions. It will be shown that the Bayesian view of the brain provides a statistical paradigm for gain modulation. Moreover, the delayed gain modulation of the stimulus response via recurrent feedback connections is modeled as a dynamic process of Bayesian inference that combines the observation and a top-down prior with a time delay. Interestingly, it will be shown that this process becomes a mathematical analogue of a heat engine in thermodynamics [1]. This view allows us to quantify the amount of delayed gain modulation and its efficiency in terms of entropy changes of the neural activity. I will show how we can quantify the perceptual capacity from neural spiking data using the state-space Ising model of neural populations, which we have been developing over the past 10 years [2,3].
[1] Shimazaki (2015) Neurons as an Information-theoretic Engine. arXiv:1512.07855 (published as a book chapter)
[2] Shimazaki, Amari, Brown, Gruen (2012) PLoS Comp Biol 8(3): e1002385
[3] Donner, Obermeyer, Shimazaki (2017) PLoS Comp Biol 13(1): e1005309
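To make the idea of combining an observation with a top-down prior concrete, here is a textbook Gaussian sketch. It is not the state-space Ising model mentioned in the abstract. It assumes a scalar stimulus, a Gaussian prior, and Gaussian observation noise, so that the prior's variance sets the gain applied to the observation, and the change in entropy from prior to posterior can be computed in closed form. All function names and numbers are illustrative.

```python
import math


def gaussian_posterior(prior_mean, prior_var, obs, obs_var):
    """Combine a Gaussian prior with one Gaussian observation.

    Returns (posterior mean, posterior variance, gain), where the gain is
    the weight given to the observation relative to the prior mean.
    """
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * (obs - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var, gain


def gaussian_entropy(var):
    """Differential entropy (in nats) of a 1-D Gaussian with variance var."""
    return 0.5 * math.log(2.0 * math.pi * math.e * var)


# Same observation, two different top-down priors (hypothetical values).
broad = gaussian_posterior(prior_mean=0.0, prior_var=4.0, obs=1.0, obs_var=1.0)
sharp = gaussian_posterior(prior_mean=0.0, prior_var=0.25, obs=1.0, obs_var=1.0)

print("broad prior: gain", round(broad[2], 3), "posterior mean", round(broad[0], 3))
print("sharp prior: gain", round(sharp[2], 3), "posterior mean", round(sharp[0], 3))
print("entropy change (broad):",
      round(gaussian_entropy(broad[1]) - gaussian_entropy(4.0), 3))
```

With a broad prior the observation is weighted strongly, while a sharp prior suppresses it; in both cases the posterior has lower entropy than the prior. This is only meant to show how a prior can act as a gain on the stimulus response in the simplest possible setting.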
The hallmark symptom of obsessive-compulsive disorder (OCD) is a deficit in the flexible switching of behavioral planning. Persistent thoughts inducing pessimistic and repetitive decisions are often symptoms of OCD. We recently identified a causal source of the persistent states by microstimulating the striatum of macaque monkeys performing a task in which we could quantitatively estimate their subjective pessimistic states from their choices to accept or reject conflicting offers. We found that this microstimulation induced irrationally repetitive choices with negative evaluations (Amemori et al., Neuron, 2018). But how does striatal dysfunction induce aberrant repetition of negative thoughts? To understand the mechanism of OCD-like states, we introduced a computational model of the basal ganglia circuitry that includes striosome-matrix compartments and the direct and indirect pathways (Amemori et al., Frontiers in Human Neurosci. 2011). To achieve flexible switching and learning in changing environments, we adopted a modular reinforcement learning (RL) architecture that has been proposed previously (Wolpert and Kawato, 1998; Haruno et al., 2001; Doya et al., 2002). The modular RL can flexibly adapt to a changing environment by switching among previously acquired behavioral sets and can learn to add new behavioral elements when the sets do not fit. With simple assumptions, our model suggests that while the direct pathway may promote actions based on striatal action values, the indirect pathway may act as a gating network that facilitates or suppresses behavioral modules based on striatal responsibility signals. In the model, the striatal cholinergic system could represent the responsibility signal, and dysfunction in representing the responsibility signal produced repetitive symptoms, as observed in OCD. Our model thus leads to the hypothesis that an imbalance between striatal dopamine and acetylcholine could underlie such OCD-like symptoms.
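The gating role of a responsibility signal in modular RL can be sketched as follows. This is a minimal illustration, not the basal ganglia model of the talk. It assumes the commonly used form in which each module's responsibility is a normalized Gaussian function of its prediction error, and in which the responsibilities weight the modules' action values. The prediction errors, action values, `sigma`, and function names are hypothetical.

```python
import math


def responsibilities(prediction_errors, sigma=1.0):
    """Soft-max responsibility of each module from its prediction error.

    Modules that predict the environment well (small error) receive a
    responsibility close to 1; the values always sum to 1.
    """
    scores = [-(e * e) / (2.0 * sigma * sigma) for e in prediction_errors]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def gated_action_values(module_q_values, resp):
    """Responsibility-weighted sum of each module's action values."""
    n_actions = len(module_q_values[0])
    return [
        sum(r * q[a] for r, q in zip(resp, module_q_values))
        for a in range(n_actions)
    ]


# Two hypothetical modules: module 0 prefers action 0, module 1 prefers action 1.
q_values = [[1.0, 0.0], [0.0, 1.0]]

# Module 0 currently predicts the environment well, module 1 does not.
resp = responsibilities([0.1, 2.0])
mixed = gated_action_values(q_values, resp)
print("responsibilities:", [round(r, 3) for r in resp])
print("gated action values:", [round(v, 3) for v in mixed])
```

If the responsibilities were held fixed regardless of the prediction errors, the same module would keep driving behavior even after the environment changed. That is one simple way to picture how a disrupted responsibility signal could produce repetitive choices, although the actual model may implement this differently.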
Little is known about the difference in neuronal connectivity across brain regions that are processing different kinds of information. With the recent increase in parallel high channel count extracellular recordings, it might be possible to infer the inter-neuronal or synaptic connectivity. We applied a generalized linear model (GLM) to identify pairs of neurons with millisecond differences in spike timing to determine the pairs that were likely monosynaptically connected. Our method estimates connections between neurons in units of postsynaptic potentials and the amount of spike recordings needed to verify connections. We optimized the performance of inference by counting the estimation errors using synthetic data. Our estimation method is superior to other established methods in correctly estimating connectivity. By applying our method to rat hippocampal data, we confirmed that the types of estimated connections match the results inferred from other physiological cues. With our method, neuroscientists may obtain a connectivity diagram of neurons from a set of spike trains they recorded from animals.
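The millisecond differences in spike timing that such an analysis starts from can be summarized in a cross-correlogram. The sketch below only builds that histogram; it does not fit a GLM and is not the estimation method described in the abstract. The spike times, window, bin width, and the function name `cross_correlogram` are hypothetical.

```python
def cross_correlogram(pre_spikes, post_spikes, window_ms=5.0, bin_ms=1.0):
    """Histogram of spike-time lags (post minus pre) within +/- window_ms.

    Bin k covers lags in [-window_ms + k * bin_ms, -window_ms + (k + 1) * bin_ms).
    A sharp peak at a short positive lag is the kind of feature that
    suggests a monosynaptic connection from the first neuron to the second.
    """
    n_bins = int(round(2.0 * window_ms / bin_ms))
    counts = [0] * n_bins
    # Simple O(n * m) loop; adequate for a sketch, slow for long recordings.
    for t_pre in pre_spikes:
        for t_post in post_spikes:
            lag = t_post - t_pre
            if -window_ms <= lag < window_ms:
                counts[int((lag + window_ms) // bin_ms)] += 1
    return counts


# Hypothetical spike times in milliseconds: the second neuron tends to
# fire 2 ms after the first, plus one unrelated spike at 330 ms.
pre = [100.0, 250.0, 400.0]
post = [102.0, 252.0, 330.0, 402.0]

counts = cross_correlogram(pre, post)
print(counts)
```

Here all three coincident pairs fall in the bin covering lags from 2 ms to just under 3 ms. A statistical model is then needed to decide whether such a peak exceeds what slow co-modulation of firing rates would produce, which is the role the GLM plays in the approach described above.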