Workshop on Biological and Artificial Reinforcement Learning


NeurIPS - 12 December 2020


Reinforcement learning (RL) algorithms learn through rewards and a process of trial-and-error. This approach is strongly inspired by the study of animal behaviour and has led to outstanding achievements.
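As a toy illustration of this trial-and-error idea (not part of the workshop materials; the chain environment and all parameters below are assumptions for the sketch), a minimal tabular Q-learning agent learns purely from rewards:

```python
import random

# Toy chain environment: states 0..4, actions 0 = left, 1 = right;
# reaching state 4 ends the episode with reward 1, all other steps give 0.
N_STATES, GOAL = 5, 4

def step(state, action):
    next_state = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # tabular action values
alpha, gamma, epsilon = 0.5, 0.9, 0.2       # learning rate, discount, exploration

for episode in range(200):
    s = 0
    for _ in range(100):                    # safety cap on episode length
        # epsilon-greedy: explore occasionally (and whenever values are tied)
        if random.random() < epsilon or Q[s][0] == Q[s][1]:
            a = random.randrange(2)
        else:
            a = 0 if Q[s][0] > Q[s][1] else 1
        s2, r, done = step(s, a)
        # temporal-difference update: nudge Q(s, a) toward the rewarded target
        target = r if done else r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])
        s = s2
        if done:
            break

# The learned greedy policy moves right, toward the rewarding state.
policy = [0 if Q[s][0] > Q[s][1] else 1 for s in range(N_STATES)]
```

The agent is never told the task; the rightward policy emerges solely from sampled rewards, which is the sense in which RL mirrors animal trial-and-error learning.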

However, artificial agents still struggle with a number of difficulties, such as learning in changing environments and over long timescales, abstracting states and actions, and generalizing and transferring knowledge. Biological agents, on the other hand, excel at these tasks.

The first edition of our workshop last year brought together leading and emerging researchers from neuroscience, psychology and machine learning to share how neural and cognitive mechanisms can provide insights for RL research, and how machine learning advances can further our understanding of brain and behaviour. This year, we want to build on the success of our previous workshop by expanding on the challenges that emerged and extending to novel perspectives. The problem of state and action representation and abstraction emerged strongly last year, so this year's program adds new perspectives such as hierarchical reinforcement learning, structure learning and their biological underpinnings. Additionally, we will address learning over long timescales, such as lifelong or continual learning, by including views from synaptic plasticity and developmental neuroscience.

We hope to inspire and further develop connections between biological and artificial reinforcement learning by bringing together experts from both fields, and to encourage discussions that could help foster novel solutions for both communities. In particular, we will focus on the following questions:


1. Inductive biases: Are there key priors/biases (e.g. the hierarchical structure of behaviour) grounded in experimental findings from human/animal studies that could inform the design of artificial agents? Or should we explicitly avoid doing so and instead build in as few priors as possible, in the hope of arriving at more flexible and perhaps better solutions?

2. Sample-efficient learning: What can we learn from human/animal learning to arrive at more sample-efficient agents? Are there built-in inductive biases (e.g. knowledge of the 3D world, objects, physics) akin to the core knowledge systems identified by Spelke and Kinzler (2007) that could guide agents to require fewer interactions with the environment? What about model-based learning, or the spectrum from model-based to model-free learning observed in the neuroscience literature?

3. Representations: What kinds of representations would facilitate RL in both animals/humans and artificial agents? Could we identify these in agents first, such that we could later probe for their signatures in the brain? Alternatively, which evidence from animal studies (e.g. the existence of place cells) could inform and constrain the kinds of representations suitable for generalization and lifelong learning in agents?

4. Intrinsic reward signals: How do motivation and boredom contribute to learning in biological agents? Can we draw inspiration from what makes humans/animals explore and interact with their environment (e.g. throughout development) to design novel intrinsic, task-agnostic reward signals that could facilitate learning in sparse- or no-reward settings and in lifelong learning?

5. Hierarchical RL: What is the role of temporally extended behaviour in learning? Can evidence from human learning (e.g. motor synergies) inspire skill learning and hierarchical RL in artificial agents?

6. Adaptive behaviour: What are the biological, cognitive and computational mechanisms that allow the brain to learn in changing environments? Could they improve RL algorithms in the context of continual learning, lifelong learning or meta-learning? How do they vary throughout development or across individuals?

Call for papers

We invite you to submit papers (up to 6 pages, excluding references and appendix) in the NeurIPS 2020 format. The focus of the work should relate to biological and/or artificial reinforcement learning. The review process will be double-blind, and accepted submissions will be presented as virtual talks or posters. There will be no proceedings for this workshop; however, authors can opt to have their abstracts/papers posted on the workshop website.

In line with the guidelines defined by the NeurIPS organising committee, we can only accept work that is not published at the main NeurIPS conference. We also welcome work published at other, non-machine-learning venues, particularly work that has previously appeared at neuroscience or cognitive science venues such as Cosyne, RLDM, CogSci and CCN.

Please submit your papers via the following link:

For any enquiries please reach out to us at

Important Dates

Wednesday, October 7th: Paper submission deadline

Wednesday, November 4th: Paper acceptance notification

December 12th: Workshop at NeurIPS

Speakers & Panelists

We are hosting a panel discussion with our speakers, moderated by our panelist Grace Lindsay. We are accepting questions from the community.

Schedule (time in GMT+0)

1230 - 1245 Opening Remarks

1245 - 1330 Invited Talk #1 Shakir Mohamed - Pain and Machine Learning [Video]

1335 - 1415 Invited Talk #2 Claudia Clopath - Continual learning with different timescales (This talk will be live and will not be recorded.)

1415 - 1430 Contributed Talk #1 - Learning multi-dimensional rules with probabilistic feedback via value-based serial hypothesis testing [Video]

1430 - 1445 Contributed Talk #2 - Evaluating Agents Without Rewards [Video]

1445 - 1500 Coffee Break

1500 - 1545 Invited Talk #3 Kim Stachenfeld - Structure Learning and the Hippocampal-Entorhinal Circuit [Video, Slides]

1545 - 1630 Invited Talk #4 George Konidaris - Signal to Symbol (via Skills) [Video]

1640 - 1800 Panel Discussion

1800 - 2000 Break & Poster Session @ Gather.Town (Main)

2000 - 2045 Invited Talk #5 Ishita Dasgupta - Embedding structure in data: Progress and challenges for the meta-learning approach [Video, Slides]

2045 - 2130 Invited Talk #6 Catherine Hartley - Developmental tuning of action selection [Video, Slides]

2130 - 2145 Coffee Break

2145 - 2200 Contributed Talk #3 - Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning [Video]

2200 - 2245 Invited Talk #7 Yael Niv - Latent causes, prediction errors and the organization of memory [Video]

2245 - 2300 Closing remarks

2300 - 2345 Social & Poster Session @ Gather.Town


Program Committee

  • Julie Lee

  • Angela Radulescu

  • Maria Eckstein

  • Ankur Handa

  • Weinan Sun

  • Loic Matthey

  • Christos Kaplanis

  • Annik Yalnizyan-Carson

  • Lotte Weerts

  • Olga Lositsky

  • Matthew Schlegel

  • Kai Arulkumaran

  • Joshua Achiam

  • Daniel McNamee

  • Emma Roscow

  • Emma Krause

  • Minryung Song

  • Baihan Lin

Accepted papers

(Poster ID, Title, Authors)
  1. Learning multi-dimensional rules with probabilistic feedback via value-based serial hypothesis testing. Mingyu Song, Yael Niv, Mingbo Cai

  2. Language Inference for Reward Learning. Xiang Fu, Tao Chen, Pulkit Agrawal, Tommi Jaakkola

  3. Action and Perception as Divergence Minimization. Danijar Hafner, Pedro Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess

  4. Human versus Machine Attention in Deep Reinforcement Learning Tasks. Ruohan Zhang, Bo Liu, Yifeng Zhu, Sihang Guo, Mary Hayhoe, Dana Ballard, Peter Stone

  5. Randomized Value Functions via Posterior State-Abstraction Sampling. Dilip Arumugam, Benjamin Van Roy

  6. Energy-Based Models for Continual Learning. Shuang Li, Yilun Du, Gido M van de Ven, Antonio Torralba, Igor Mordatch

  7. Pain and Machine Learning. Shakir Mohamed, Daniel Ott

  8. Rules warp decision-making. Becket Ebitz, Jiaxin Tu, Benjamin Hayden

  9. Self-Activating Neural Ensembles for Continual Reinforcement Learning. Samantha N Powers, Abhinav Gupta

  10. A Biologically-Inspired Dual Stream World Model. Arthur W Juliani, Margaret E Sereno

  11. Asymmetric and adaptive reward coding arises from normalized reinforcement learning. Kenway Louie

  12. Prediction problems inspired by animal learning. Banafsheh Rafiee, Sina Ghiassian, Richard S Sutton, Elliot A Ludvig, Adam White

  13. Self-Attention as a Generalization of Attention-Weighted Reinforcement Learning. Aurelio Cortese, Lennart Bramlage

  14. Evaluating Agents Without Rewards. Brendon Matusch, Jimmy Ba, Danijar Hafner

  15. Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning. Rishabh Agarwal, Marlos C. Machado, Pablo Samuel Castro, Marc G. Bellemare

  16. A localist network explaining motivated and goal-directed planning: GOLSA. Justin M Fine, Joshua Brown

  17. Escaping Stochastic Traps with Aleatoric Mapping Agents. Augustine Mavor-Parker, Kimberly Young, Caswell Barry, Lewis Griffin

  18. Diversity of discounting horizons explains ramping diversity in dopaminergic neurons. Paul Masset, HyungGoo Kim, Athar Malik, Pol Bech, Naoshige Uchida

  19. Adversarial Intrinsic Motivation for Faster Goal-Conditioned Reinforcement Learning. Ishan P Durugkar, Peter Stone, Scott Niekum

  20. An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning. Dilip Arumugam, Peter Henderson, Pierre-Luc Bacon

  21. Urgency as the opportunity cost of commitment in biological and machine decision-making. Maximilian Puelma Touzel, Paul Cisek, Guillaume Lajoie

  22. Mimicking mammalian navigation in Watermaze using brain-inspired representations. Mandana Samiei, Arna Ghosh, Blake Richards

  23. Lambda Successor Return Error. Anthony GX-Chen, Veronica Chelu, Blake Richards, Joelle Pineau