Workshop on Multi-Task and Lifelong Reinforcement Learning
Workshop at ICML, 15 June 2019, Long Beach
Significant progress has been made in reinforcement learning, enabling agents to accomplish complex tasks such as Atari games, robotic manipulation, simulated locomotion, and Go. These successes have stemmed from the core reinforcement learning formulation of learning a single policy or value function from scratch. However, reinforcement learning has proven challenging to scale to many practical real-world problems due to challenges in learning efficiency and objective specification, among others. Recently, there has been growing interest and research in leveraging structure and information across multiple reinforcement learning tasks to learn complex behaviors more efficiently and effectively. This includes:
- curriculum and lifelong learning, where the problem requires learning a sequence of tasks, leveraging their shared structure to enable knowledge transfer
- goal-conditioned reinforcement learning techniques that leverage the structure of the provided goal space to learn many tasks significantly faster
- meta-learning methods that aim to learn efficient learning algorithms that can learn new tasks quickly
- hierarchical reinforcement learning, where the reinforcement learning problem might entail a composition of subgoals or subtasks with shared structure
Multi-task and lifelong reinforcement learning has the potential to alter the traditional reinforcement learning paradigm by providing more practical and diverse sources of supervision, while helping overcome many challenges associated with reinforcement learning, such as exploration, sample efficiency, and credit assignment. However, the field of multi-task and lifelong reinforcement learning is still young, with many developments still needed in problem formulation, algorithmic and theoretical advances, and better benchmarking and evaluation.
The focus of this workshop will be on both the algorithmic and theoretical foundations of multi-task and lifelong reinforcement learning and the practical challenges of building multi-tasking agents and lifelong learning benchmarks. Our goal is to bring together researchers who study different problem domains (such as games, robotics, and language), different optimization approaches (deep learning, evolutionary algorithms, model-based control, etc.), and different formalisms (as mentioned above) to discuss the frontiers, open problems, and meaningful next steps in multi-task and lifelong reinforcement learning.
Confirmed Speakers
Dates
Submission deadline: May 3, 2019 (AOE) [extended from April 28, 2019]
Notifications: May 20, 2019 (AOE) [extended from May 10, 2019]
**Late-breaking submission deadline**: Friday, May 31, 2019 (AOE)
Camera Ready: June 10, 2019 (AOE)
Workshop: June 15, 2019
Call for Papers
UPDATE: We welcome late-breaking submissions following the formatting guidelines below. Accepted late-breaking works will be presented as posters. These submissions should be made by emailing the submission PDF to mtlrl@googlegroups.com by Friday, May 31, 2019 (AOE).
The submitted work should be an extended abstract of 4-8 pages (including references), in PDF format, following the style guidelines for ICML 2019 (found here). The review process is double-blind, and the work should be submitted by the deadline above (May 3, 2019, Anywhere on Earth, extended from April 28). Submissions should *not* have been previously published nor have appeared in the ICML main conference; work currently under submission to another conference is welcome. There will be no formal publication of workshop proceedings. However, accepted papers will be made available online on the workshop website.
Full (non-late-breaking) submissions must be made through EasyChair: https://easychair.org/conferences/?conf=mtlrl2019
We welcome submissions on topics including:
- curriculum and lifelong learning, where the problem requires learning a sequence of tasks, leveraging their shared structure to enable knowledge transfer
- goal-conditioned reinforcement learning techniques that leverage the structure of the provided goal space to learn many tasks significantly faster
- meta-learning methods that aim to learn efficient learning algorithms that can learn new tasks quickly
- hierarchical reinforcement learning, where the reinforcement learning problem might entail a composition of subgoals or subtasks with shared structure
Schedule
Subject to change
08:45 – 09:00 Opening remarks
09:00 – 09:25 Invited talk #1: Sergey Levine - Unsupervised Reinforcement Learning and Meta-Learning
09:25 – 09:50 Spotlights (all ~25 papers that don’t have a contributed talk)
09:50 - 10:15 Invited talk #2: Peter Stone - Learning Curricula for Transfer Learning in RL
10:15 – 10:30 Contributed Talks (7 min each + 1 min for questions & transition)
- 10:15 Meta-Learning via Learned Loss Yevgen Chebotar*, Artem Molchanov*, Sarah Bechtle*, Ludovic Righetti, Franziska Meier, Gaurav Sukhatme
- 10:22 MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies Xue Bin Peng, Michael Chang, Grace Zhang, Pieter Abbeel, Sergey Levine
10:30 – 11:00 Poster session and coffee break
-----------------
11:00 – 11:25 Invited talk #3: Jacob Andreas - Linguistic scaffolds for policy learning
11:25 – 11:50 Invited talk #4: Karol Hausman - Skill Representation and Supervision in Multi-Task Reinforcement Learning
11:50 – 12:20 Contributed talks (7 min each + 1 min for questions & transition)
- 11:50 Which Tasks Should Be Learned Together in Multi-task Learning? Trevor Standley, Amir R. Zamir, Dawn Chen, Leonidas Guibas, Jitendra Malik, Silvio Savarese
- 11:58 Sub-policy Adaptation for Hierarchical Reinforcement Learning Carlos Florensa*, Alexander Li* and Pieter Abbeel
- 12:06 Online Learning for Auxiliary Task Weighting for Reinforcement Learning Xingyu Lin*, Harjatin Singh Baweja*, George Kantor, David Held
- 12:14 Guided Meta-Policy Search Russell Mendonca, Abhishek Gupta, Rosen Kralev, Pieter Abbeel, Sergey Levine, Chelsea Finn
12:20 – 02:00 Poster session and lunch break
-----------------
02:00 – 02:25 Invited talk #5: Martha White - Learning Representations for Continual Learning
02:25 – 02:50 Invited talk #6: Natalia Diaz-Rodriguez - Continual Learning and Robotics: an overview
02:50 – 03:30 Afternoon coffee break and poster session
-----------------
03:30 – 03:55 Invited talk #7: Jeff Clune - Towards Solving Catastrophic Forgetting with Neuromodulation & Learning Curricula by Generating Environments
03:55 – 04:15 Contributed talks (7 min each + 1 min for questions & transition)
- 03:55 Online Continual Learning with Maximally Interfered Retrieval Rahaf Aljundi*, Lucas Caccia*, Eugene Belilovsky*, Massimo Caccia*, Min Lin, Laurent Charlin, Tinne Tuytelaars
- 04:05 Skew-Fit: State-Covering Self-Supervised Reinforcement Learning Vitchyr H. Pong*, Murtaza Dalal*, Steven Lin, Ashvin Nair, Shikhar Bahl, Sergey Levine
04:15 – 04:40 Invited talk #8: Nicolas Heess - Talk TBA
04:40 – 05:05 Invited talk #9: Benjamin Rosman - Exploiting Structure For Accelerating Reinforcement Learning
05:05 – 06:00 Panel Discussion
Accepted Papers
- MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies Xue Bin Peng, Michael Chang, Grace Zhang, Pieter Abbeel, Sergey Levine
- Guided Meta-Policy Search Russell Mendonca, Abhishek Gupta, Rosen Kralev, Pieter Abbeel, Sergey Levine, Chelsea Finn
- Skew-Fit: State-Covering Self-Supervised Reinforcement Learning Vitchyr H. Pong*, Murtaza Dalal*, Steven Lin*, Ashvin Nair, Shikhar Bahl, Sergey Levine
- Online Learning for Auxiliary Task Weighting for Reinforcement Learning Xingyu Lin*, Harjatin Singh Baweja*, George Kantor, David Held
- Sub-policy Adaptation for Hierarchical Reinforcement Learning Carlos Florensa, Alexander Li and Pieter Abbeel
- Successor Options: An Option Discovery Framework for Reinforcement Learning Rahul Ramesh*, Manan Tomar*, Balaraman Ravindran
- Write, Execute, Assess: Program Synthesis with a REPL Kevin Ellis*, Maxwell Nye*, Yewen Pu*, Felix Sosa*, Joshua B. Tenenbaum, Armando Solar-Lezama
- Continual Learning with Tiny Episodic Memories Arslan Chaudhry, Marcus Rohrbach, Mohamed Elhoseiny, Thalaiyasingam Ajanthan, Puneet Dokania, Philip Torr and Marc'Aurelio Ranzato
- Reinforcement Learning without Ground Truth State Xingyu Lin, Harjatin Baweja and David Held.
- Learning Exploration Policies for Model-Agnostic Meta-Reinforcement Learning Swaminathan Gurumurthy, Sumit Kumar and Katia Sycara
- Language as an Abstraction for Hierarchical Deep Reinforcement Learning Yiding Jiang, Shixiang Gu, Kevin Murphy and Chelsea Finn.
- CloGAN - Closed Loop GAN to Counteract Forgetting Amanda Rios and Laurent Itti.
- Reward-guided Curriculum for Learning Robust Action Policies Siddharth Mysore, Robert Platt and Kate Saenko.
- Improved Transfer Learning in Street Navigation Using Cross-Modal Policy Learning Ang Li*, Huiyi Hu*, Piotr Mirowski, Mehrdad Farajtabar
- Evaluating Influence Functions for Memory Replay in Continual Learning Sanket Vaibhav Mehta*, Bhargavi Paranjape* and Sumeet Singh*
- Challenge Learning: what doesn't kill you makes you stronger Jason Ma.
- Transfer Learning by Modelling Distribution over Policies Disha Shrivastava, Eeshan Gunesh Dhekane and Riashat Islam.
- Semi-Supervised Few-Shot Learning with Local and Global Consistency Ahmed Ayyad, Nassir Navab, Mohamed Elhoseiny*, Shadi Albarqouni*
- Continual Reinforcement Learning deployed in Real-life using Policy Distillation and Sim2Real Transfer Kalifou René Traoré, Hugo Caselles-Dupré, Timothée Lesort, Te Sun, Natalia Diaz-Rodriguez and David Filliat.
- Neural networks with motivation Sergey Shuvaev, Ngoc Tran, Marcus Stephenson-Jones, Bo Li and Alexei Koulakov.
- COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven Exploration Nicholas Watters*, Loic Matthey*, Matko Bošnjak, Christopher P. Burgess, Alexander Lerchner
- Planning to Explore Unknown Environments from Pixels Danijar Hafner, Jimmy Ba, Timothy Lillicrap and Mohammad Norouzi.
- Option Discovery with Prediction Network Ensembles from Demonstrations in Unconstrained State Spaces Everett Fall, Michiaki Tatsubori, Don Joven Agravante, Masataro Asai, Shu Morikuni, Daiki Kimura, Subhajit Chaudhury, Asim Munawar and Liang-Gee Chen.
- Performance Evaluation of Opponent Modeling with Meta-Learning for Competitive Environments Mujtaba Hasan and Ankur Narang.
- Meta-Learning via Learned Loss Yevgen Chebotar*, Artem Molchanov*, Sarah Bechtle*, Ludovic Righetti, Franziska Meier, Gaurav Sukhatme
- Sub-Goal Trees – a Framework for Goal-Directed Trajectory Prediction and Optimization Tom Jurgenson, Edward Groshev, Aviv Tamar
- Online Continual Learning with Maximally Interfered Retrieval Rahaf Aljundi*, Lucas Caccia*, Eugene Belilovsky*, Massimo Caccia*, Min Lin, Laurent Charlin, Tinne Tuytelaars
- Continual Learning of Generative Models with Maximum Entropy Generative Replay Cem Sübakan*, Massimo Caccia*, Timothée Lesort, Laurent Charlin
- Compositional Plan Vectors Coline Devin, Daniel Geng, Pieter Abbeel, Trevor Darrell, Sergey Levine
- Iterative Model-Based Reinforcement Learning Using Simulations in the Differentiable Neural Computer Adeel Mufti, Svetlin Penkov, Subramanian Ramamoorthy
- Option Discovery by Aiming to Predict Veronica Chelu, Doina Precup
- Learning Domain Randomization Distributions for Transfer of Locomotion Policies Melissa Mozifian*, Juan Camilo Gamboa Higuera*, David Meger, Gregory Dudek
- Which Tasks Should Be Learned Together in Multi-task Learning? Trevor Standley, Amir R. Zamir, Dawn Chen, Leonidas Guibas, Jitendra Malik, Silvio Savarese
Program Committee
- Dave Abel (Brown University)
- Anurag Ajay (MIT)
- Rahaf Aljundi (MILA/KU Leuven)
- Kumar Krishna Agrawal (Brain)
- Arslan Chaudhry (Oxford)
- Veronica Chelu (McGill/MILA)
- Ignasi Clavera (Berkeley)
- Coline Devin (Berkeley)
- Ishan Durugkar (UT Austin)
- Ashley Edwards (Georgia Tech)
- Ben Eysenbach (CMU/Brain)
- Sebastian Flennerhag (Alan Turing Institute)
- Jakob Foerster (Oxford)
- Vincent Francois-Lavet (McGill/MILA)
- Alexandre Galashov (DeepMind)
- Walter Goodwin (Oxford)
- Anirudh Goyal (MILA)
- Shixiang (Shane) Gu (Google Brain)
- Danijar Hafner (Toronto)
- Josiah Hanna (UT Austin)
- Tuomas Haarnoja (Berkeley)
- Jean Harb (McGill)
- Xu (Owen) He (Jacobs University)
- David Held (CMU)
- Chia-Man Hung (Oxford)
- Riashat Islam (McGill/Microsoft)
- Siddhant Jayakumar (DeepMind)
- Dinesh Jayaraman (Berkeley)
- Ryan Julian (USC)
- Aviral Kumar (Berkeley)
- Alex Lee (Berkeley)
- Marlos Machado (University of Alberta)
- Igor Mordatch (OpenAI)
- Ofir Nachum (Google)
- Anusha Nagabandi (Berkeley)
- Suraj Nair (Stanford)
- Sanmit Narvekar (UT Austin)
- Junhyuk Oh (DeepMind)
- Matthias Plappert (OpenAI)
- Vitchyr Pong (Berkeley)
- Guillaume Rabusseau (UdeM/MILA)
- Janarthanan Rajendran (University of Michigan)
- Kate Rakelly (Berkeley)
- Dushyant Rao (DeepMind)
- Martin Riedmiller (DeepMind)
- Matthew Riemer (IBM)
- Amanda Rios (University of Southern California)
- Andrei A. Rusu (DeepMind/UCL)
- Himanshu Sahni (Georgia Tech)
- Sasha Salter (Oxford)
- Devin Schwab (CMU)
- Jonathan Schwarz (DeepMind)
- Felipe Leno da Silva (University of São Paulo)
- Jakub Sygnowski (DeepMind)
- Chen Tessler (Technion)
- George Tucker (Brain)
- Adam White (University of Alberta)
- Kelvin Xu (Berkeley)
- Tianhe (Kevin) Yu (Stanford)
- Yuke Zhu (Stanford)
- Luisa M Zintgraf (Oxford)
Organizers
Advisors
Sponsors
We thank our sponsors for making this workshop possible.