Week 8: Multi-Task and Meta RL
Single-Task and Transfer
Eric Tzeng*, Coline Devin*, Judy Hoffman, Chelsea Finn, Pieter Abbeel, Sergey Levine, Kate Saenko, Trevor Darrell, Adapting Deep Visuomotor Representations with Weak Pairwise Constraints, WAFR 2016
Benjamin Eysenbach, Swapnil Asawa, Shreyas Chaudhari, Sergey Levine, Ruslan Salakhutdinov, Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers, ICLR 2021
Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine, Reinforcement Learning with Deep Energy-Based Policies, ICML 2017
Saurabh Kumar, Aviral Kumar, Sergey Levine, Chelsea Finn, One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL, NeurIPS 2020
Domain Randomization
Aravind Rajeswaran, Sarvjeet Ghotra, Balaraman Ravindran, Sergey Levine, EPOpt: Learning Robust Neural Network Policies Using Model Ensembles, ICLR 2017
Wenhao Yu, Jie Tan, C. Karen Liu, Greg Turk, Preparing for the Unknown: Learning a Universal Policy with Online System Identification, RSS 2017
Fereshteh Sadeghi, Sergey Levine, CAD2RL: Real Single-Image Flight without a Single Real Image, RSS 2017
Xue Bin (Jason) Peng, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel, Sim-to-Real Transfer of Robotic Control with Dynamics Randomization, ICRA 2018
Stephen James, Andrew J. Davison, Edward Johns, Transferring End-to-End Visuomotor Control from Simulation to Real World for a Multi-Stage Task, CoRL 2017
Josh Tobin, Rachel Fong, Alex Ray, Jonas Schneider, Wojciech Zaremba, Pieter Abbeel, Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World, IROS 2017
Multi-Task
Andrei A. Rusu, Sergio Gomez Colmenarejo, Caglar Gulcehre, Guillaume Desjardins, James Kirkpatrick, Razvan Pascanu, Volodymyr Mnih, Koray Kavukcuoglu, Raia Hadsell, Policy Distillation, ICLR 2016
Emilio Parisotto, Jimmy Lei Ba, Ruslan Salakhutdinov, Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, ICLR 2016
Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, Sergey Levine, Divide-and-Conquer Reinforcement Learning, ICLR 2018
Peter Dayan, Improving Generalisation for Temporal Difference Learning: The Successor Representation, 1993
André Barreto, Will Dabney, Rémi Munos, Jonathan J. Hunt, Tom Schaul, Hado van Hasselt, David Silver, Successor Features for Transfer in Reinforcement Learning, NeurIPS 2017
Yee Whye Teh, Victor Bapst, Wojciech Marian Czarnecki, John Quan, James Kirkpatrick, Raia Hadsell, Nicolas Heess, Razvan Pascanu, Distral: Robust Multitask Reinforcement Learning, NeurIPS 2017
Dmitry Kalashnikov, Jacob Varley, Yevgen Chebotar, Benjamin Swanson, Rico Jonschkowski, Chelsea Finn, Sergey Levine, Karol Hausman, MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale, 2021
Matteo Hessel, Hubert Soyer, Lasse Espeholt, Wojciech Czarnecki, Simon Schmitt, Hado van Hasselt, Multi-task Deep Reinforcement Learning with PopArt, AAAI 2019
Karl Cobbe, Christopher Hesse, Jacob Hilton, John Schulman, ProcGen: Leveraging Procedural Generation to Benchmark Reinforcement Learning, 2019
Ruihan Yang, Huazhe Xu, Yi Wu, Xiaolong Wang, Multi-Task Reinforcement Learning with Soft Modularization, NeurIPS 2020
Jiachen Li, Quan Vuong, Shuang Liu, Minghua Liu, Kamil Ciosek, Keith Ross, Henrik Iskov Christensen, Hao Su, Multi-task Batch Reinforcement Learning with Metric Learning, NeurIPS 2020
Shagun Sodhani, Amy Zhang, Joelle Pineau, Multi-Task Reinforcement Learning with Context-based Representations, ICML 2021
Abhishek Gupta, Justin Yu, Tony Z. Zhao, Vikash Kumar, Aaron Rovinsky, Kelvin Xu, Thomas Devlin, Sergey Levine, Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention, 2021
Meta Reinforcement Learning
Yan Duan, John Schulman, Xi Chen, Peter L. Bartlett, Ilya Sutskever, Pieter Abbeel, RL²: Fast Reinforcement Learning via Slow Reinforcement Learning, 2016
Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick, Learning to reinforcement learn, 2016
Nikhil Mishra, Mostafa Rohaninejad, Xi Chen, Pieter Abbeel, A Simple Neural Attentive Meta-Learner, ICLR 2018
Abhishek Gupta, Russell Mendonca, YuXuan Liu, Pieter Abbeel, Sergey Levine, Meta-Reinforcement Learning of Structured Exploration Strategies, NeurIPS 2018
Kate Rakelly, Aurick Zhou, Deirdre Quillen, Chelsea Finn, Sergey Levine, Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, ICML 2019
Chelsea Finn, Pieter Abbeel, Sergey Levine, Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, ICML 2017
Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing, Shimon Whiteson, DiCE: The Infinitely Differentiable Monte-Carlo Estimator, ICML 2018
Jonas Rothfuss, Dennis Lee, Ignasi Clavera, Tamim Asfour, Pieter Abbeel, ProMP: Proximal Meta-Policy Search, ICLR 2019
Bradly C. Stadie, Ge Yang, Rein Houthooft, Xi Chen, Yan Duan, Yuhuai Wu, Pieter Abbeel, Ilya Sutskever, Some Considerations on Learning to Explore via Meta-Reinforcement Learning, NeurIPS 2018
Rein Houthooft, Richard Y. Chen, Phillip Isola, Bradly C. Stadie, Filip Wolski, Jonathan Ho, Pieter Abbeel, Evolved Policy Gradients, NeurIPS 2018
Chrisantha Thomas Fernando, Jakub Sygnowski, Simon Osindero, Jane Wang, Tom Schaul, Denis Teplyashin, Pablo Sprechmann, Alexander Pritzel, Andrei A. Rusu, Meta-Learning by the Baldwin Effect, GECCO 2018
Luisa Zintgraf, Maximilian Igl, Kyriacos Shiarlis, Anuj Mahajan, Katja Hofmann, Shimon Whiteson, Variational Task Embeddings for Fast Adaptation in Deep Reinforcement Learning, ICLR 2019 Workshops
Anusha Nagabandi*, Ignasi Clavera*, Simin Liu, Ronald S. Fearing, Pieter Abbeel, Sergey Levine, Chelsea Finn, Learning to Adapt in Dynamic, Real-World Environments Through Meta-Reinforcement Learning, ICLR 2019
Charles Packer, Pieter Abbeel, Joseph Gonzalez, Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL, NeurIPS 2021 [pdf forthcoming]