LEAGUE: Guided Skill Learning and Abstraction for Long-Horizon Manipulation

Shuo Cheng, Danfei Xu

Georgia Tech Robot Learning and Reasoning Lab

IEEE Robotics and Automation Letters (RA-L), 2023 (Best Paper Honorable Mention, 1 of 5 among 1200+ accepted papers)

Long Horizon Planning Workshop (CoRL), 2022 (Best Paper Finalist)

[paper][code (coming)]

Abstract

To assist with everyday human activities, robots must solve complex long-horizon tasks and generalize to new settings. Recent deep reinforcement learning (RL) methods show promise in fully autonomous learning, but they struggle to reach long-term goals in large environments. On the other hand, Task and Motion Planning (TAMP) approaches excel at solving and generalizing across long-horizon tasks, thanks to their powerful state and action abstractions. But they assume predefined skill sets, which limits their real-world applications. In this work, we combine the benefits of these two paradigms and propose an integrated task planning and skill learning framework named LEAGUE (Learning and Abstraction with Guidance). LEAGUE leverages the symbolic interface of a task planner to guide RL-based skill learning and creates an abstract state space to enable skill reuse. More importantly, LEAGUE learns manipulation skills in situ within the task planning system, continuously growing its capability and the set of tasks that it can solve. We evaluate LEAGUE on four challenging simulated task domains and show that LEAGUE outperforms baselines by large margins. We also show that the learned skills can be reused to accelerate learning in new task domains and transfer to a physical robot platform.
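The integrated planning-and-learning loop described above can be sketched in a few lines of toy code. This is a minimal illustration under loose assumptions, not the paper's implementation: `Skill`, `practice`, and `league_training_loop` are hypothetical names, the "proficiency" counter stands in for actual RL training guided by an operator's symbolic preconditions and effects, and the task planner is abstracted as an ordered list of operator names.

```python
from dataclasses import dataclass

@dataclass
class Skill:
    """A symbolic operator paired with a learned (here: stubbed) policy."""
    name: str
    proficiency: float = 0.0  # fraction of rollouts reaching the subgoal

    def practice(self) -> None:
        # Stand-in for an RL update; LEAGUE instead trains a policy with
        # RL, using the operator's symbolic effects as the reward signal.
        self.proficiency = min(1.0, self.proficiency + 0.5)

def league_training_loop(plan, library, threshold=1.0):
    """Practice each skill in plan order until proficient (toy version).

    `plan` is an ordered list of operator names from the task planner.
    Skills already in `library` (e.g. reused from another domain) that
    are proficient are skipped, which is how reuse accelerates learning.
    """
    for op in plan:
        skill = library.setdefault(op, Skill(op))
        while skill.proficiency < threshold:
            skill.practice()
    return library

# Reuse a trained Pick skill; only the new Place skill needs practice.
library = {"Pick": Skill("Pick", proficiency=1.0)}
library = league_training_loop(["Pick", "Place"], library)
```

The key design point the sketch tries to convey is that skills are indexed by their symbolic operators, so a planner can sequence them to form long-horizon plans and a previously trained skill plugs into a new domain without retraining.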

Method Overview

RAL_League.mp4

To better understand the progressive skill learning strategy, we plot the proficiency level of each skill throughout the process of learning the task StowHammer.

Quantitative Results

We visualize the key stages of the three evaluation tasks (top) and compare relevant methods on the three task domains (bottom). The plot shows the corresponding task completion progress (0 for the initial state, 1 for task completion) throughout training. Results are reported over four random seeds, with the standard deviation shown as the shaded area.

Simulation Demos

peg_train.mp4

Video demo of the trained task goal in the PegInHole domain.

hammer_train.mp4

Video demo of the trained task goal in the StowHammer domain.

Generalization to New Tasks

peg_new_goal.mp4

Video demo of novel task goals in the PegInHole domain; the learned skills are reused without any updates.

hammer_new_goal.mp4

Video demo of novel task goals in the StowHammer domain; the learned skills are reused without any updates.

Reusing Skills in a New Domain

We transfer learned skills (Pick(?object), Push(?cabinet), Pull(?cabinet)) from the StowHammer domain to the MakeCoffee domain. The results show that the learned skills also transfer across domains, accelerating the learning of novel skills.

coffee_train.mp4

Video demo of the trained task goal in the MakeCoffee domain.