Coline Devin*, Abhishek Gupta*, Trevor Darrell, Pieter Abbeel, Sergey Levine
* indicates that the authors contributed equally
Abstract: Reinforcement learning (RL) can automate a wide variety of robotic skills, but learning each new skill requires considerable real-world data collection and manual representation engineering to design policy classes or features. Using deep reinforcement learning to train general-purpose neural network policies alleviates some of the burden of manual representation engineering by using expressive policy classes, but exacerbates the challenge of data collection, since such methods tend to be less efficient than RL with low-dimensional, hand-designed representations. Transfer learning can mitigate this problem by enabling us to transfer information from one skill to another and even from one robot to another. We show that neural network policies can be decomposed into “task-specific” and “robot-specific” modules, where the task-specific modules are shared across robots, and the robot-specific modules are shared across all tasks on that robot. This allows for sharing task information, such as perception, between robots and sharing robot information, such as dynamics and kinematics, between tasks. We exploit this decomposition to train mix-and-match modules that can solve new robot-task combinations that were not seen during training. Using a novel neural network architecture, we demonstrate the effectiveness of our transfer method for enabling zero-shot generalization with a variety of robots and tasks in simulation for both visual and non-visual tasks.
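To make the decomposition concrete, the following is a minimal sketch of how task-specific and robot-specific modules might be composed into a policy for a robot-task pair. It is an illustration under stated assumptions, not the architecture used in the paper: the task and robot names, the dimensions, the helper `make_linear`, and the choice to feed the task module's output into the robot module alongside the robot's own observations are all hypothetical. The weights are random and untrained, so only the wiring is meaningful.

```python
import math
import random

TASK_OBS_DIM = 6   # assumed size of the task observation (e.g. object and target positions)
LATENT_DIM = 4     # assumed size of the interface between task and robot modules


def make_linear(n_in, n_out, rng):
    """Hypothetical stand-in for a small network: one random linear layer with tanh."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

    def forward(x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

    return forward


rng = random.Random(0)

# One module per task, shared by every robot that performs that task.
task_modules = {
    "reach": make_linear(TASK_OBS_DIM, LATENT_DIM, rng),
    "push": make_linear(TASK_OBS_DIM, LATENT_DIM, rng),
}

# One module per robot, shared by every task performed on that robot.
# Assumption: each robot observes a joint angle and a joint velocity per joint.
robot_joints = {"3link": 3, "4link": 4}
robot_modules = {
    name: make_linear(LATENT_DIM + 2 * n, n, rng)
    for name, n in robot_joints.items()
}


def compose_policy(task, robot):
    """Build a policy for any robot-task pair from the shared module pools."""
    task_net = task_modules[task]
    robot_net = robot_modules[robot]

    def policy(task_obs, robot_obs):
        latent = task_net(task_obs)            # task information, robot-agnostic
        return robot_net(latent + robot_obs)   # robot module outputs joint commands

    return policy


# If training only ever used (reach, 3link), (reach, 4link), and (push, 3link),
# the pair (push, 4link) can still be assembled from modules trained elsewhere.
policy = compose_policy("push", "4link")
action = policy([0.0] * TASK_OBS_DIM, [0.0] * 8)
print(len(action))  # one command per joint of the 4-link arm
```

In this sketch, zero-shot transfer amounts to a dictionary lookup: a new combination reuses a task module and a robot module that were each trained as part of other combinations. Whether the composed policy actually performs well depends on the training procedure, which the sketch leaves out.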