Reward Trends

During foraging, knowing when to leave the current environment (e.g. a food patch) and move on to another is essential. The key quantity for this decision is the reward rate (i.e. reward per unit of time). By tracking instantaneous reward rates, an agent can make basic foraging decisions. An agent that goes further and predicts future reward levels (reward trends) can adapt to the more complex environments in which humans thrive. Yet the neural and cognitive computations that estimate reward trends and use them to decide when to leave an environment are largely unknown.

We identified the dorsal anterior cingulate cortex (dACC) in particular as relevant for time-sensitive patch-leaving behaviour, as its activity ramps towards a leaving threshold during such choices.


References: Kolling, Behrens, Wittmann, Boorman, Mars, Rushworth (2016) Value, search, persistence and model updating in anterior cingulate cortex. Nature Neuroscience. doi: https://doi.org/10.1038/nn.4382.

Wittmann, Kolling, Akaishi, Chau, Brown, Nelissen, Rushworth (2016) Predictive decision making driven by multiple time-linked reward representations in the anterior cingulate cortex. Nature Communications. doi: https://doi.org/10.1038/ncomms12327.

At the same time, the way dACC represents reward predictions makes it possible to infer reward trends. Specifically, dACC represents reward expectations not as a single value but as a set of values, each learned with a different time constant or ‘learning rate’ (e.g. some cells track fast changes in reward, while others are more sensitive to slow fluctuations). Together, these estimates can serve as building blocks for reward trend estimation.
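To illustrate how value estimates learned at different rates could act as building blocks for a trend estimate, here is a minimal sketch. It is not the model from the cited studies. It assumes a simple delta-rule update for each estimate and reads out the trend as the fast estimate minus the slow one. The function names, the two learning rates and the example reward sequences are all hypothetical choices made for illustration.

```python
def update_estimates(estimates, reward, learning_rates):
    """Apply one delta-rule update to each estimate, using its own learning rate."""
    return [v + a * (reward - v) for v, a in zip(estimates, learning_rates)]


def track_rewards(rewards, learning_rates, initial=0.0):
    """Run a reward sequence through a bank of estimates and return their final values."""
    estimates = [initial] * len(learning_rates)
    for reward in rewards:
        estimates = update_estimates(estimates, reward, learning_rates)
    return estimates


def trend_signal(estimates):
    """Fast minus slow estimate (assumes learning rates are ordered slowest first).

    Positive when recent rewards exceed the longer-run average, negative when
    they fall below it. This readout is an illustrative assumption.
    """
    return estimates[-1] - estimates[0]


# Hypothetical learning rates: one slow and one fast estimate.
learning_rates = [0.05, 0.5]

rising = [float(t) for t in range(1, 21)]   # rewards increase over time
falling = rising[::-1]                      # rewards decrease over time

rising_trend = trend_signal(track_rewards(rising, learning_rates))
falling_trend = trend_signal(track_rewards(falling, learning_rates))

print("rising patch trend is positive:", rising_trend > 0)
print("falling patch trend is negative:", falling_trend < 0)
```

In this sketch the fast estimate follows recent rewards closely while the slow estimate lags behind, so their difference carries the sign of the recent change in reward. A larger bank of learning rates would extend the same idea to a spectrum of timescales.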

However, we lack a cognitive, computational or neural model that explains the flexible use of different reward trends or the mechanism by which they are learned.

Reference: Meder, Kolling, Verhagen, Wittmann, Madsen, Hulme, Behrens, Rushworth (2017) Simultaneous representation of a spectrum of dynamically changing value estimates during decision making. Nature Communications. doi: https://doi.org/10.1038/s41467-017-02169-w.