Modeling Inductive Biases in Reinforcement Learning

Workshop at RLDM, July 10, 2019, Montreal

Call for Abstracts

We invite abstract submissions for a 10-20 minute oral presentation at the workshop. Abstracts should be 1-4 pages long in the RLDM format and not anonymous. They can be emailed to rldm.mibrl@gmail.com.

Submission Deadline: May 29th, 23:59 AoE

Notifications: June 12th

Summary

A remarkable aspect of human learning is that we are able to apply knowledge from the past to help us solve new problems. This application of past experience is often described as using “inductive priors” [10]. For example, we excel at zero-shot and few-shot learning of new tasks by leveraging learned skills and observations acquired in similar contexts. We can even detect when those priors are inaccurate and decide when to adapt or ignore them. This reliance on past experience in new situations allows us to select one generalization over another on grounds other than strict consistency with known observations [5].

Examples of injecting such biases into machine learning algorithms include relational networks [1] for object-oriented interactions and convolutional filters for pixel observations. There has also been preliminary recent work on composing simple policies [2] and on investigating the inductive biases induced by unsupervised meta-learning [7]. However, current methods in reinforcement learning (RL) are still not able to transfer and adapt information as efficiently as humans.

We conjecture that structures and mechanisms for enabling inductive biases in RL can be inspired by advances in cognitive science [8,9]. This workshop will take advantage of the wide span of communities attending RLDM by bringing together RL researchers and cognitive scientists to share challenges, models, and ideas toward the development of algorithms with improved knowledge transfer capabilities.

One of the questions to discuss is how to incorporate inductive biases into our models so that they encode our knowledge of the environment without introducing harmful biases that prevent learning optimal policies. For example, while inductive biases can often speed up learning, they have also been shown to lead to suboptimal outcomes in humans by encouraging premature termination of exploration [6]. Thus, determining when and how strongly to apply an inductive prior remains an important and unsolved problem.

Additional topics include how to implement efficient transfer, prevent catastrophic forgetting, learn reusable representations (e.g. hierarchies, options), and leverage imitation and observational information.

As proper knowledge transfer may improve sample efficiency [3], we’d also like to discuss what these inductive biases and methods would look like for specific applications where gathering data is expensive. This includes but is not limited to: natural language processing, healthcare, and robotics.


References:

  1. Battaglia et al. Relational inductive biases, deep learning, and graph networks. arXiv preprint 1806.01261, 2018.
  2. Lee and Sun et al. Composing Complex Skills by Learning Transition Policies with Proximity Reward Induction. ICLR 2019.
  3. Yu, Y. Towards Sample Efficient Reinforcement Learning. IJCAI 2018.
  4. Nachum, O., Gu, S., Lee, H., & Levine, S. Data-Efficient Hierarchical Reinforcement Learning. NeurIPS 2018.
  5. Mitchell, T. M. The need for biases in learning generalizations. New Jersey: Department of Computer Science, Laboratory for Computer Science Research, Rutgers Univ (pp. 184-191), 1980.
  6. Rich, A. S., & Gureckis, T. M. The limits of learning: Exploration, generalization, and the development of learning traps. Journal of Experimental Psychology: General, 147(11), pp. 1553-1570, 2018.
  7. Gupta, A., Eysenbach, B., Finn, C., & Levine, S. Unsupervised Meta-Learning for Reinforcement Learning. arXiv preprint 1806.04640, 2018.
  8. Lake, B. M., Lawrence, N. D., & Tenenbaum, J. B. The emergence of organizing structure in conceptual representation. Cognitive Science, 2018.
  9. Gershman, S. J. The successor representation: its computational logic and neural substrates. Journal of Neuroscience, 38(33), pp. 7193-7200, 2018.
  10. Dubey, R., Agrawal, P., Pathak, D., Griffiths, T., & Efros, A. Investigating Human Priors for Playing Video Games. ICML 2018.

Speakers

Schedule

TBA