Modeling Inductive Biases in Reinforcement Learning

Workshop at RLDM, July 10, 2019, Montreal

Panel Questions

We invite you to submit questions to our panel here: https://docs.google.com/forms/d/1fZw29-ty5_1cY0cSjuS8Zhdj_CWPBFzBrNh_P1nmE9k

Call for Abstracts

We invite abstract submissions for a 10-20 minute oral presentation at the workshop. Abstracts should be 1-4 pages long, in the RLDM format, and not anonymized. They can be emailed to rldm.mibrl@gmail.com.

Submission Deadline: May 29th, 23:59 AoE

Notifications: June 12th

Summary

A remarkable aspect of human learning is that we are able to apply knowledge from the past to help us solve new problems. This application of past experience is often described as using “inductive priors” [10]. For example, we excel at zero-shot or few-shot learning of new tasks by leveraging related learned skills and observations acquired in similar contexts. We even possess the ability to detect when those priors are inaccurate and when to adapt or ignore them. This generalization from past experience to new situations allows us to select one generalization over another, beyond strict consistency with known observations [5].

Examples of injecting such biases into machine learning algorithms include relational networks [1] for object-oriented interactions and convolutional filters for pixel observations. There has also been preliminary recent work on composing simple policies [2] and on the inductive biases induced by unsupervised meta-learning [7]. However, current reinforcement learning (RL) methods are still not able to transfer and adapt information as efficiently as humans.

We conjecture that structures and mechanisms for enabling inductive biases in RL can be inspired by advances in cognitive science [8, 9]. This workshop will take advantage of the wide span of communities attending RLDM by bringing together RL researchers and cognitive scientists to share challenges, models, and ideas toward the development of algorithms with improved knowledge transfer capabilities.

One of the questions we aim to discuss is how to incorporate inductive biases that encode our knowledge of the environment into our models without introducing harmful biases that prevent learning optimal policies. For example, while inductive biases can often speed up learning, they have also been shown to produce suboptimal outcomes in humans by encouraging premature termination of exploration [6]. Thus, determining when and how strongly to apply an inductive prior remains an important and unsolved problem.

Additional topics include how to implement efficient transfer, prevent catastrophic forgetting, learn reusable representations (e.g., hierarchies or options), and leverage imitation and observational information.

Since proper knowledge transfer may improve sample efficiency [3, 4], we would also like to discuss what these inductive biases and methods would look like for specific applications where gathering data is expensive, including but not limited to natural language processing, healthcare, and robotics.


References

  1. Battaglia et al. Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261, 2018.
  2. Lee and Sun et al. Composing Complex Skills by Learning Transition Policies with Proximity Reward Induction. ICLR 2019.
  3. Yu, Y. Towards Sample Efficient Reinforcement Learning. IJCAI 2018.
  4. Nachum, O., Gu, S., Lee, H., & Levine, S. Data-Efficient Hierarchical Reinforcement Learning. NeurIPS 2018.
  5. Mitchell, T. M. The need for biases in learning generalizations. Department of Computer Science, Laboratory for Computer Science Research, Rutgers University, New Jersey, pp. 184-191, 1980.
  6. Rich, A. S., & Gureckis, T. M. The limits of learning: Exploration, generalization, and the development of learning traps. Journal of Experimental Psychology: General, 147(11), pp. 1553-1570, 2018.
  7. Gupta, A., Eysenbach, B., Finn, C., & Levine, S. Unsupervised Meta-Learning for Reinforcement Learning. arXiv preprint arXiv:1806.04640, 2018.
  8. Lake, B. M., Lawrence, N. D., & Tenenbaum, J. B. The emergence of organizing structure in conceptual representation. Cognitive Science, 2018.
  9. Gershman, S. J. The successor representation: its computational logic and neural substrates. Journal of Neuroscience, 38(33), pp. 7193-7200, 2018.
  10. Dubey, R., Agrawal, P., Pathak, D., Griffiths, T., & Efros, A. Investigating Human Priors for Playing Video Games. ICML 2018.

Speakers

Schedule

1:00pm - 1:05pm Introduction

1:05pm - 1:45pm Invited talk: Todd Gureckis

1:45pm - 2:00pm Contributed talk: Lucas Lehnert (Model-based Knowledge Representations)

2:00pm - 2:15pm Contributed talk: Julien Roy (Promoting Coordination through Policy Regularization in Multi-Agent Reinforcement Learning)

2:15pm - 2:30pm Contributed talk: Veronica Chelu (Option Discovery by Aiming to Predict)

2:30pm - 3:10pm Invited talk: Anne Collins

3:10pm - 3:25pm Coffee break

3:25pm - 4:05pm Invited talk: Anna Harutyunyan

4:05pm - 5:00pm Panel discussion: Todd Gureckis, Anna Harutyunyan, Marlos Machado, Anne Collins (Moderator: Doina Precup)

Submit questions for the panel here: https://docs.google.com/forms/d/1fZw29-ty5_1cY0cSjuS8Zhdj_CWPBFzBrNh_P1nmE9k