Continual Reinforcement Learning

WiML ICML Unworkshop - July 13th 2020

6:35 PM – 7:35 PM GMT

Abstract

Humans have a remarkable ability to continually learn and adapt to new scenarios over the course of their lifetime (Smith & Gasser, 2005). This ability is referred to as continual learning. Continual learning (CL) is the constant development of increasingly complex behaviors: the process of building complicated skills on top of those already developed (Ring, 1997), while reapplying, adapting, and generalizing existing abilities to new situations. A continual learner can be seen as an autonomous agent with no final task, and it has the following desiderata in the context of learning: i) it can learn context-dependent tasks; ii) it learns behaviors and skills while solving its tasks; iii) it learns incrementally, with no fixed training set; and iv) it learns hierarchically, i.e., skills learned now can be built upon later, as described by Ring (1997).

Findings in the cognitive science literature have often served as a stepping stone toward human-like continual learners. We'd like to discuss how to bridge the gap between continual learning for artificially intelligent (AI) agents and hypotheses about human continual learning. This includes (1) how to generate suitable goals, tasks, and rewards to incentivize adaptiveness (Florensa et al., 2018): where do tasks or rewards come from? For example, while hand-coded reward functions can yield proficient policies for AI agents, they limit the agent to a single task, and their reward scaling can have a large effect on the agent's performance (Henderson et al., 2018). It also includes (2) how to make long-term decisions while factoring in delayed or sparse rewards, or the lack thereof (Samuelson, 1937; Hung et al., 2019), including how memory should prioritize recent or critical experiences and assign credit; and (3) how to learn and leverage reusable representations, such as state abstractions, options, or skills, over the course of a lifetime (Kirkpatrick et al., 2019).

SCHEDULE

Breakout Session 4, 6:35 PM – 7:35 PM GMT

  • Introduction
    • Goals
    • Participation Guidelines
    • Continual Reinforcement Learning: A brief introduction
  • Structured Discussion
    • What are goals/tasks/rewards and where do they come from?
    • What role does attention or memory play in long-term decision making?
    • How to learn and leverage reusable representations over the course of an agent’s lifetime?
  • Breakout Rooms - Group discussions
  • Closing Remarks

ORGANISERS

Khimya Khetarpal

McGill University, Mila Montreal

Rose Wang

Stanford University, Google Brain

Thanks to our facilitator

Arundhati Banerjee

Carnegie Mellon University