This piece, by Armin Bazarjani, was published on 11/19/24. The original text, by Piray and Daw, was published in Nature Communications on 11/15/21.
This work by Piray and Daw is intimately related to the successor representation (SR) model of a cognitive map. It builds on Todorov's work from 2006, which constructed a class of Markov decision processes (MDPs) that reduce the reinforcement learning problem to a linearly solvable one. Todorov introduced a “default policy” and an associated control cost term for deviating from it. It turns out this formulation can also give us some interesting insights into brain function.
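To make that concrete, here is a simplified sketch of the math as I read it from Todorov (2006) and Piray & Daw (2021); the notation is my paraphrase (λ scales the control cost, π_d is the default policy, and the agent effectively chooses next-state distributions directly):

```latex
% Bellman equation with a KL control cost for deviating from the default policy:
\[
  v(s) \;=\; \max_{\pi}\Big[\, r(s)
    \;-\; \lambda\,\mathrm{KL}\!\big(\pi(\cdot \mid s)\,\|\,\pi_d(\cdot \mid s)\big)
    \;+\; \mathbb{E}_{s' \sim \pi(\cdot \mid s)}\big[v(s')\big] \Big]
\]
% Substituting z(s) = exp(v(s)/lambda) eliminates the max and leaves a linear
% equation in z, which is what makes the problem linearly solvable:
\[
  z(s) \;=\; e^{\,r(s)/\lambda} \sum_{s'} \pi_d(s' \mid s)\, z(s'),
  \qquad
  \pi^*(s' \mid s) \;\propto\; \pi_d(s' \mid s)\, z(s').
\]
```

Solving for z is then just a linear system, so planning reduces to linear algebra rather than iterative dynamic programming.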
We won’t go into the details here, but because the SR is constructed with respect to the agent’s policy (i.e., which actions it tends to take), it is biased towards that policy. The work by Piray & Daw introduces a “default representation” (DR), which is instead constructed with respect to an unbiased default policy that assigns equal probability to every available action in every state (see the sketch below).
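Here is a minimal numpy sketch of the contrast (not the authors' code; the corridor environment, rewards, and probabilities are made up for illustration):

```python
import numpy as np

# Five states in a corridor; state 4 is terminal.
n = 5

# SR side: transitions under a specific, biased policy (mostly moves right).
P_pi = np.zeros((n, n))
for s in range(n - 1):
    P_pi[s, s + 1] = 0.9            # preferred action
    P_pi[s, max(s - 1, 0)] += 0.1   # occasional step back
P_pi[n - 1, n - 1] = 1.0            # absorbing terminal state

# DR side: the uniform default policy (left/right with equal probability).
P_d = np.zeros((n, n))
for s in range(n - 1):
    P_d[s, max(s - 1, 0)] += 0.5
    P_d[s, min(s + 1, n - 1)] += 0.5
P_d[n - 1, n - 1] = 1.0

# Successor representation: discounted expected future occupancy under pi.
gamma = 0.95
SR = np.linalg.inv(np.eye(n) - gamma * P_pi)

# DR over the nonterminal states, following the form in Piray & Daw (2021)
# as I read it (lambda = 1, small per-step cost r_N on nonterminal states):
r_N = -0.1 * np.ones(n - 1)
P_NN = P_d[:-1, :-1]   # default transitions among nonterminal states
DR = np.linalg.inv(np.diag(np.exp(-r_N)) - P_NN)

print("SR row for state 0:", SR[0].round(2))
print("DR row for state 0:", DR[0].round(2))
```

The key point: the SR depends on whatever policy the agent happens to follow, while the DR is pinned to the unbiased default, which is what lets it be reused across tasks.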
The DR makes for a compelling model of a cognitive map. Not only does it recover the optimal policy, or set of actions, to solve a task, it can also replan far more efficiently than comparable models when the environment changes (e.g., when a barrier is added), as the sketch below illustrates.
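The efficiency comes from linear algebra. The DR is a matrix inverse, so a local change such as a new barrier perturbs the inverted matrix by a low-rank term, and the Woodbury (Sherman-Morrison) identity updates the inverse cheaply instead of recomputing it from scratch. A hedged sketch of that idea (my reading of the trick, not the paper's implementation; for simplicity the default policy is not renormalized after the barrier):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Random default-policy transitions among nonterminal states; rows sum to 0.9,
# with the remaining mass implicitly leaking to terminal states.
P_NN = rng.random((n, n))
P_NN = 0.9 * P_NN / P_NN.sum(axis=1, keepdims=True)

r_N = -0.1 * np.ones(n)                 # small per-step cost
A = np.diag(np.exp(-r_N)) - P_NN        # DR = inv(A), as in the sketch above
DR = np.linalg.inv(A)

# A "barrier": the transition from state 7 to state 8 becomes impossible.
# Since A = diag(exp(-r_N)) - P_NN, zeroing P_NN[7, 8] adds +P_NN[7, 8]
# at entry (7, 8) of A, i.e., a rank-1 change A_new = A + u @ v.
u = np.zeros((n, 1)); u[7] = 1.0
v = np.zeros((1, n)); v[0, 8] = P_NN[7, 8]
A_new = A + u @ v

# Sherman-Morrison: update the inverse in O(n^2) instead of O(n^3).
denom = 1.0 + v @ DR @ u
DR_new = DR - (DR @ u) @ (v @ DR) / denom

print("Woodbury update matches full re-inversion:",
      np.allclose(DR_new, np.linalg.inv(A_new)))
```

Because only this low-rank correction is needed, replanning after a local change is dramatically cheaper than rebuilding the whole map, and it is this kind of correction term that the paper links to border cells (more on that below).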
The DR can also explain various behavioral biases and inflexibilities, because its construction is anchored to the default policy. In the domain of cognitive control, for example, the model naturally explains why control-demanding actions (those that deviate from the default) feel effortful. For behavioral biases such as habits, the model suggests they may arise from an “overtrained” default policy, and it can account for some disorders as the product of maladaptive or biased default policies in certain individuals.
With respect to neuroscience, the DR makes the same predictions as the SR: namely, it provides a mechanistic explanation for place cells and grid cells. But the paper also offers an interesting interpretation of border cells as encoding the barrier update term described above, and the follow-up work (Piray & Daw, 2024) provides a compelling narrative for how object-vector cells and grid cells can come together to compositionally construct cognitive maps.
References
Piray, P., & Daw, N. D. (2021). Linear reinforcement learning in planning, grid fields, and cognitive control. Nature Communications, 12(1), 4942.
Piray, P., & Daw, N. D. (2024). Reconciling flexibility and efficiency: Medial entorhinal cortex represents a compositional cognitive map. bioRxiv, 2024-05.
Want to submit a piece? Or trying to write a piece and struggling? Check out the guides here!
Thank you for reading. Reminder: Byte Sized is open to everyone! Feel free to submit your piece. Please read the guides first though.
Please send all submissions to berkan@usc.edu as a Word doc, with the subject line “Byte Sized Submission”. Thank you!