"Reward Prediction Error Signals are Meta-Representational."
Although there has been considerable debate about the existence of meta-representational
capacities in non-human animals and their scope in humans, the well-confirmed temporal
difference reinforcement learning models of reward-guided decision-making have been
largely overlooked in that debate. This paper argues that the reward prediction error
signals postulated by temporal difference models, and discovered empirically through
single-unit recording and neuroimaging, do have meta-representational contents.
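
For readers unfamiliar with temporal difference models, the following is a minimal sketch of how a reward prediction error is usually computed in the textbook TD(0) setting. It is not taken from this paper: the function names, the state labels "cue" and "outcome", the learning rate `alpha`, and the discount factor `gamma` are all illustrative assumptions.

```python
# Illustrative TD(0) sketch; all names and parameter values are assumptions.

def td_error(reward, value_current, value_next, gamma=0.9):
    """Reward prediction error: received reward plus discounted predicted
    future value, minus the value that was predicted for the current state."""
    return reward + gamma * value_next - value_current


def td0_update(values, state, next_state, reward, alpha=0.1, gamma=0.9):
    """Move the value estimate for `state` a fraction `alpha` of the way
    toward the target, and return the prediction error that drove the update."""
    delta = td_error(reward, values[state], values[next_state], gamma)
    values[state] += alpha * delta
    return delta


# A cue is repeatedly followed by a reward of 1.0 and a terminal outcome state
# whose value stays at 0. The error shrinks as the prediction improves.
values = {"cue": 0.0, "outcome": 0.0}
deltas = [td0_update(values, "cue", "outcome", reward=1.0) for _ in range(50)]
```

In this toy run the error is large on the first trial and decays toward zero as the value assigned to the cue approaches the reward actually received. The error term is computed from the system's own prior value prediction, which is the feature the paper's thesis turns on.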