Workshop on biological AND artificial reinforcement learning
Click here to access the virtual workshop
NeurIPS - 12 December 2020
Reinforcement learning (RL) algorithms learn through rewards and a process of trial-and-error. This approach is strongly inspired by the study of animal behaviour and has led to outstanding achievements.
However, artificial agents still struggle with a number of difficulties, such as learning in changing environments and over longer timescales, state abstraction, generalization and knowledge transfer. Biological agents, on the other hand, excel at these tasks.
The first edition of our workshop last year brought together leading and emerging researchers from Neuroscience, Psychology and Machine Learning to share how neural and cognitive mechanisms can provide insights for RL research and how machine learning advances can further our understanding of brain and behaviour. This year, we want to build on the success of our previous workshop by expanding on the challenges that emerged and extending to novel perspectives. The problem of state and action representation and abstraction emerged quite strongly last year, so this year's program aims to add new perspectives such as hierarchical reinforcement learning, structure learning and their biological underpinnings. Additionally, we will address learning over long timescales, such as lifelong learning or continual learning, by including views from synaptic plasticity and developmental neuroscience.
We hope to inspire and further develop connections between biological and artificial reinforcement learning by bringing together experts from all sides and encouraging discussions that could help foster novel solutions for both communities.
1. Inductive biases: Are there any key priors/biases (e.g. hierarchical structure of behaviour) grounded in experimental findings of human/animal studies that could potentially inform the design of artificial agents? Or should we explicitly avoid doing so and instead build in as few priors as possible, hoping to arrive at more flexible and perhaps better solutions?
2. Sample efficient learning: What can we learn from human/animal learning to arrive at more sample efficient agents? Are there built-in inductive biases (e.g. knowledge of 3D world, objects, physics) akin to the core knowledge system identified by Spelke et al. (2007) that could guide agents to require fewer interactions with the environment? What about model-based learning, or the spectrum from model-based to model-free that is observed in the neuroscience literature?
3. Representations: What kind of representations would facilitate RL in both animals/humans and agents? Could we identify these in agents first, such that we could probe for their signatures in the brain later? Alternatively, which evidence from animal studies (e.g. existence of place cells) could inform and constrain the kind of representations suitable for generalization and lifelong learning in agents?
4. Intrinsic reward signals: How do motivation and boredom contribute to learning in biological agents? Can we draw inspiration from what makes humans/animals explore and interact with their environment (e.g. throughout development) to come up with novel intrinsic, task-agnostic reward signals that could facilitate learning in sparse or no-reward and lifelong learning settings?
5. Hierarchical RL: What is the role of temporally extended behaviour in learning? Can evidence from human learning (e.g. motor synergies) be inspirational for skill learning and hierarchical RL in artificial agents?
6. Adaptive behaviour: What are the biological, cognitive and computational mechanisms in the brain that allow learning in changing environments? Could they improve RL algorithms in the context of continual learning, lifelong learning or meta-learning? How do they vary throughout development or across individuals?
We invite you to submit papers (up to 6 pages, excluding references and appendix) in the NeurIPS 2020 format. The focus of the work should relate to biological and/or artificial reinforcement learning. The review process will be double-blind and accepted submissions will be presented as virtual talks or posters. There will be no proceedings for this workshop; however, authors can opt to have their abstracts/papers posted on the workshop website.
In line with the guidelines defined by the NeurIPS organising committee, we can only accept work that is not published in the main NeurIPS conference. We also welcome published work from venues that are not focused on machine learning, particularly work that has previously appeared in Neuroscience or Cognitive Science venues such as Cosyne, RLDM, CogSci and CCN.
Please submit your papers via the following link: https://cmt3.research.microsoft.com/BARL2020/
For any enquiries please reach out to us at BiologicalArtificialRL@gmail.com
Wednesday October 7th - Paper Submission Deadline
Wednesday November 4th - Paper Acceptance Notification
December 12th - Workshop at NeurIPS
1230 - 1245 Opening Remarks
1245 - 1330 Invited Talk #1 Shakir Mohamed - Pain and Machine Learning [Video]
1335 - 1415 Invited Talk #2 Claudia Clopath - Continual learning with different timescales (This talk will be live and will not be recorded.)
1415 - 1430 Contributed Talk #1 - Learning multi-dimensional rules with probabilistic feedback via value-based serial hypothesis testing [Video]
1430 - 1445 Contributed Talk #2 - Evaluating Agents Without Rewards [Video]
1445 - 1500 Coffee Break
1500 - 1545 Invited Talk #3 Kim Stachenfeld - Structure Learning and the Hippocampal-Entorhinal Circuit [Video, Slides]
1545 - 1630 Invited Talk #4 George Konidaris - Signal to Symbol (via Skills) [Video]
1640 - 1800 Panel Discussions
1800 - 2000 Break & Posters Session @ Gather.Town (Main)
2000 - 2045 Invited Talk #5 Ishita Dasgupta - Embedding structure in data: Progress and challenges for the meta-learning approach [Video, Slides]
2045 - 2130 Invited Talk #6 Catherine Hartley - Developmental tuning of action selection [Video, Slides]
2130 - 2145 Coffee Break
2145 - 2200 Contributed Talk #3 - Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning [Video]
2200 - 2245 Invited Talk #7 Yael Niv - Latent causes, prediction errors and the organization of memory [Video]
2245 - 2300 Closing remarks
2300 - 2345 Social & Posters Session @ Gather.Town
Julie Lee
Angela Radulescu
Maria Eckstein
Ankur Handa
Weinan Sun
Loic Matthey
Christos Kaplanis
Annik Yalnizyan-Carson
Lotte Weerts
Olga Lositsky
Matthew Schlegel
Kai Arulkumaran
Joshua Achiam
Daniel McNamee
Emma Roscow
Emma Krause
Minryung Song
Baihan Lin
Learning multi-dimensional rules with probabilistic feedback via value-based serial hypothesis testing. Mingyu Song, Yael Niv, Mingbo Cai
Language Inference for Reward Learning. Xiang Fu, Tao Chen, Pulkit Agrawal, Tommi Jaakkola
Action and Perception as Divergence Minimization. Danijar Hafner, Pedro Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess
Human versus Machine Attention in Deep Reinforcement Learning Tasks. Ruohan Zhang, Bo Liu, Yifeng Zhu, Sihang Guo, Mary Hayhoe, Dana Ballard, Peter Stone
Randomized Value Functions via Posterior State-Abstraction Sampling. Dilip Arumugam, Benjamin Van Roy
Energy-Based Models for Continual Learning. Shuang Li, Yilun Du, Gido M van de Ven, Antonio Torralba, Igor Mordatch
Pain and Machine Learning. Shakir Mohamed, Daniel Ott
Rules warp decision-making. Becket Ebitz, Jiaxin Tu, Benjamin Hayden
Self-Activating Neural Ensembles for Continual Reinforcement Learning. Samantha N Powers, Abhinav Gupta
A Biologically-Inspired Dual Stream World Model. Arthur W Juliani, Margaret E Sereno
Asymmetric and adaptive reward coding arises from normalized reinforcement learning. Kenway Louie
Prediction problems inspired by animal learning. Banafsheh Rafiee, Sina Ghiassian, Richard S Sutton, Elliot A Ludvig, Adam White
Self-Attention as a Generalization of Attention-Weighted Reinforcement Learning. Aurelio Cortese, Lennart Bramlage
Evaluating Agents Without Rewards. Brendon Matusch, Jimmy Ba, Danijar Hafner
Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning. Rishabh Agarwal, Marlos C. Machado, Pablo Samuel Castro, Marc G. Bellemare
A localist network explaining motivated and goal-directed planning: GOLSA. Justin M Fine, Joshua Brown
Escaping Stochastic Traps with Aleatoric Mapping Agents. Augustine Mavor-Parker, Kimberly Young, Caswell Barry, Lewis Griffin
Diversity of discounting horizons explains ramping diversity in dopaminergic neurons. Paul Masset, HyungGoo Kim, Athar Malik, Pol Bech, Naoshige Uchida
Adversarial Intrinsic Motivation for Faster Goal-Conditioned Reinforcement Learning. Ishan P Durugkar, Peter Stone, Scott Niekum
An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning. Dilip Arumugam, Peter Henderson, Pierre-Luc Bacon
Urgency as the opportunity cost of commitment in biological and machine decision-making. Maximilian Puelma Touzel, Paul Cisek, Guillaume Lajoie
Mimicking mammalian navigation in Watermaze using brain-inspired representations. Mandana Samiei, Arna Ghosh, Blake Richards
Lambda Successor Return Error. Anthony GX-Chen, Veronica Chelu, Blake Richards, Joelle Pineau