Bounded rationality, that is, decision-making with limited information-processing resources, is widely regarded as an important open problem in artificial intelligence, reinforcement learning, computational neuroscience and economics. Here we discuss a theory of bounded rationality based on information-theoretic principles inspired by statistical physics and thermodynamics. In particular, we discuss the free energy functional as an objective function for characterizing information processing and decision-making. This framework is very general, unifying a number of existing approaches, including decision-making with entropy constraints and Bayes-optimal decision-making, decision-making under model uncertainty and missing information, the coupling between action and perception, and the formation of decision hierarchies and abstractions. We discuss experimental tests of the theoretical predictions in human sensorimotor control.
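The trade-off behind the free energy functional can be sketched numerically. In this family of theories, a bounded-rational agent maximizes expected utility minus a KL-divergence cost from a prior policy, weighted by an inverse temperature that quantifies information-processing resources; the optimum is a softmax tilt of the prior. The following is a minimal sketch, not the authors' implementation; the function name, the utilities and the prior are illustrative choices:

```python
import numpy as np

def bounded_rational_policy(utilities, prior, beta):
    """Maximize E_p[U] - (1/beta) * KL(p || prior) over choice distributions p.
    The optimum is p(a) proportional to prior(a) * exp(beta * U(a)),
    where beta sets the information-processing resources."""
    logits = np.log(prior) + beta * np.asarray(utilities, dtype=float)
    logits -= logits.max()          # subtract max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

utilities = np.array([1.0, 2.0, 4.0])
prior = np.ones(3) / 3

# Resource-poor agent (small beta): policy stays close to the prior.
p_low = bounded_rational_policy(utilities, prior, beta=0.01)
# Resource-rich agent (large beta): policy concentrates on the best option.
p_high = bounded_rational_policy(utilities, prior, beta=100.0)
```

Varying beta interpolates continuously between the prior (zero resources) and perfectly rational utility maximization (unbounded resources).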
Continuation from the morning lecture
Reward and aversive learning are critical for animal survival. The nucleus accumbens (NAc), the ventral part of the striatum in the basal ganglia circuit, is an important neural substrate for reward and aversive learning. In the basal ganglia circuit, inputs to the striatum/NAc are transmitted through two parallel pathways, direct and indirect, and are controlled by the transmitter dopamine. However, the respective roles of these parallel pathways remain unknown. We have developed transgenic mouse models to dissect the neural network mechanisms of cognitive learning. Using reversible neurotransmission blocking (RNB), we developed D-RNB and I-RNB mouse models in which transmission-blocking tetanus toxin was specifically expressed in the direct striatonigral or the indirect striatopallidal pathway, blocking each pathway in a doxycycline-dependent manner. We have revealed distinct roles for the two NAc pathways: the direct pathway is critical for reward-based learning and the indirect pathway for aversive learning. We have also addressed the regulatory mechanisms of the two pathways, and our findings suggest that a dynamic shift of pathway-specific neural plasticity via dopamine receptors and subcellular signaling is essential for reward and aversive learning.
Our brain processes incoming sensory inputs incessantly, making predictions and generalizations about the environment to adapt our behavior. This process is often referred to as “predictive coding”. A variety of mathematical models of predictive coding in the brain have been proposed, and some of them have become the basis of deep learning models in machine learning. Deep learning is now considered one of the most powerful tools for the analysis of big data. However, even though we now have good mathematical models, much remains to be discovered about what is really happening in our brain. In particular, at the whole-brain level, we know almost nothing about the dynamical process of predictive coding.
To investigate cortex-wide information processing, we developed a whole-cortical ECoG array for the common marmoset, a small non-human primate. The array provides an opportunity to capture global cortical information processing at high resolution: sub-millisecond order in time and millimeter order in space. In this study, we investigated large-scale cortical information dynamics for incoming auditory stimuli. We found that the early positive components of auditory evoked potentials (AEPs) initially appeared in the primary auditory area, and the activity then moved to higher auditory areas. In parallel, the early negative components of the AEP propagated from frontal to parietal cortices. Furthermore, we quantified the prediction and prediction error of auditory stimuli using the Hierarchical Gaussian Filter (Stefanics et al., 2018), and found that ECoG signals in prefrontal areas were significantly correlated with the prediction, while those in auditory areas were correlated with the prediction error. These results suggest cortex-wide predictive coding of auditory information in the primate neocortex.
For efficient and correct information processing in the cerebral cortex, neural circuit dynamics must be spatially and temporally regulated with great precision. The medial prefrontal cortex (mPFC) of rodents has been shown to be important for various types of learning and memory, including fear memory, and has been related to various psychiatric diseases. However, it remains unclear how populations of neurons in this region enable information processing that depends on learning state; major obstacles are the complexity and heterogeneity of prefrontal networks. Especially little is known about the mechanisms underlying fear memory. Here we investigate this by chronic two-photon Ca2+ imaging of populations of neurons in mouse mPFC in vivo, which allows us to 1) record activity simultaneously from a large number of neurons at single-cell resolution with high temporal resolution, and 2) investigate changes in neuronal responses as a function of learning state. We focus on changes in the population responses of mPFC neurons during fear memory formation, using a new device we developed to perform and test Pavlovian fear conditioning under the microscope. Whereas many previous imaging studies of the mPFC relied on invasive methods, our system minimizes such damage by introducing a microprism-based observation method. We further demonstrate the population coding underlying the learning process and memory recall.
Knowing the normative purpose of adaptation or optimization of a neural network is essential for understanding the intelligence of biological organisms. We identified a class of biologically plausible cost functions for a canonical neural network, where the same cost function is minimized by both neural activity and plasticity. According to the complete class theorem, when the environment is given as a discrete state space, the cost function for neural networks can be cast as a variational bound on model evidence (or free energy) under an implicit generative model. This equivalence suggests that any neural network minimizing its cost function implicitly performs variational Bayesian inference, indicating that variational free energy minimization (a.k.a. the free-energy principle) is an apt principle for a canonical neural network. We showed that in vitro neural networks receiving input stimuli generated from hidden sources perform causal inference, or source separation, through activity-dependent synaptic plasticity, by minimizing variational free energy, as predicted by the theory. These results highlight that our approach enables us to characterize the aim and function of a given neuronal network in terms of a generative model and prior beliefs estimated from empirical data. This will be useful for formulating the neuronal mechanisms underlying inference and learning, leading to a deeper understanding of the intelligence of biological organisms.
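The bound invoked above can be illustrated with a toy discrete-state example. For a fixed observation s, the variational free energy is F = E_q[ln q(x) − ln p(x,s)]; it is minimized when q(x) equals the exact Bayesian posterior, at which point F equals the negative log evidence −ln p(s). The two-state generative model below is hypothetical and chosen only to show this bound, not taken from the study:

```python
import numpy as np

def variational_free_energy(q, p_joint_col):
    """F = E_q[ln q(x) - ln p(x, s)] for one fixed observation s.
    p_joint_col is the column p(x, s=observed) over hidden states x."""
    q = np.asarray(q, dtype=float)
    return float(np.sum(q * (np.log(q) - np.log(p_joint_col))))

# Hypothetical generative model with two hidden states.
prior = np.array([0.5, 0.5])              # p(x)
likelihood = np.array([0.9, 0.2])         # p(s = observed | x)
p_joint = prior * likelihood              # p(x, s)

posterior = p_joint / p_joint.sum()       # exact posterior p(x | s)
F_post = variational_free_energy(posterior, p_joint)          # = -ln p(s)
F_uniform = variational_free_energy(np.array([0.5, 0.5]), p_joint)
```

Any q other than the exact posterior yields a strictly larger F, which is why gradient descent on F by neural activity implements approximate Bayesian inference.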
Our retinas have high acuity only in the fovea, meaning that we have to construct our visual scene by constantly moving our eyes. Thus, vision is not the passive formation of a representation, but the active sampling of visual information through the agent’s actions. Sensorimotor enactivism (SME) extends this idea and proposes that seeing is an exploratory activity mediated by the agent’s mastery of sensorimotor contingencies (i.e., by a practical grasp of the way sensory stimulation varies as the perceiver moves). However, SME lacks formal, empirically testable theories or models. Here we argue that the free-energy principle (FEP) may serve as a computational model of SME. In the FEP, agents interacting with their environment through their sensors minimize the variational free energy (VFE), defined in terms of an approximate inference q(x) about the hidden states (x) of the environment and a generative model, the joint probability distribution p(x,s), that describes how the agent’s sensory inputs (s) interact with the environment (x). In this framework, representationalist ideas such as forming internal representations or unconscious inference can be interpreted as holding the approximate inference q(x). On the other hand, SME can be interpreted as holding the generative model p(x,s). Since minimization of the VFE requires both q(x) and p(x,s), adaptive behavior such as vision may require continuous matching between q(x) and p(x,s). Thus, the FEP may help to integrate representationalist and enactive views into a theory of vision.
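The claim that VFE minimization requires both q(x) and p(x,s) can be made explicit with the standard decomposition of the VFE (a textbook identity in the FEP literature, restated here for clarity):

```latex
F[q] \;=\; \mathbb{E}_{q(x)}\!\left[\ln q(x) - \ln p(x,s)\right]
      \;=\; \underbrace{D_{\mathrm{KL}}\!\left[\,q(x)\,\big\|\,p(x \mid s)\,\right]}_{\geq\, 0} \;-\; \ln p(s)
```

Since the KL term is non-negative, F is bounded below by −ln p(s), with equality exactly when the approximate inference q(x) matches the posterior implied by the generative model p(x,s). Improving q(x) tightens the bound (the representationalist reading), while action changes s to raise the evidence ln p(s) under p(x,s) (the enactive reading), so both ingredients are jointly required.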