Haewon Jung*, Donguk Lee*, Haecheol Park, JunHyeop Kim, Beomjoon Kim
Kim Jaechul Graduate School of AI at Korea Advanced Institute of Science and Technology (KAIST)
* Equal Contribution
Abstract
Current robots struggle with manipulation tasks that require a long sequence of prehensile and nonprehensile skills. Such tasks involve handling contact-rich interactions and chaining multiple skills while accounting for their long-term consequences. This paper presents a framework that leverages imitation learning to distill a planning algorithm, capable of solving long-horizon problems but requiring extensive computation time, into a policy for efficient action inference. We introduce Skill-RRT, an extension of the rapidly-exploring random tree (RRT) that incorporates skill applicability checks and intermediate object pose sampling for efficient long-horizon planning. To enable skill chaining, we propose connectors, goal-conditioned policies that transition between skills while minimizing object disturbance. Using lazy planning, connectors are trained only on relevant transitions, reducing training cost. High-quality demonstrations are generated with Skill-RRT and refined by a noise-based replay mechanism to ensure robust policy performance. The distilled policy, trained entirely in simulation, transfers zero-shot to the real world and achieves over 80% success rates across three challenging manipulation tasks. In simulation, our approach outperforms both Skill-RRT and MAPLE, a state-of-the-art skill-based reinforcement learning method.
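To make the planning loop concrete, below is a minimal, hypothetical Python sketch of the Skill-RRT extension step described above: sample an intermediate object pose, find the nearest tree node, run a skill applicability check, and extend the tree by simulating an applicable skill. The 2D poses, the two-skill library, and the random applicability test are illustrative stand-ins under simplifying assumptions, not the paper's implementation.

```python
import random
import math
from dataclasses import dataclass

# Hypothetical planar object pose for illustration; the paper plans over
# full object/robot states in a physics simulator.
@dataclass
class Pose:
    x: float
    y: float
    theta: float

@dataclass
class Node:
    pose: Pose
    parent: "Node | None" = None
    skill: str | None = None  # skill whose simulation produced this node

def dist(a: Pose, b: Pose) -> float:
    # Simple weighted position/orientation distance for nearest-neighbor search.
    return math.hypot(a.x - b.x, a.y - b.y) + 0.1 * abs(a.theta - b.theta)

# Stand-ins for a skill library: each skill maps (current pose, target pose)
# to the object pose it would leave behind if simulated.
SKILLS = {
    "pick_place": lambda p, q: Pose(q.x, q.y, p.theta),  # prehensile
    "push":       lambda p, q: Pose(q.x, q.y, q.theta),  # nonprehensile
}

def applicable(skill: str, pose: Pose) -> bool:
    # Placeholder applicability check (e.g., reachability, grasp existence).
    return random.random() < 0.8

def skill_rrt(start: Pose, goal: Pose, max_iters: int = 2000, tol: float = 0.05):
    tree = [Node(start)]
    for _ in range(max_iters):
        # Sample an intermediate object pose (goal-biased, as in standard RRT).
        target = goal if random.random() < 0.1 else Pose(
            random.uniform(0, 1), random.uniform(0, 1),
            random.uniform(-math.pi, math.pi))
        near = min(tree, key=lambda n: dist(n.pose, target))
        # Extend with the first skill whose applicability check passes.
        for name, simulate in SKILLS.items():
            if applicable(name, near.pose):
                new = Node(simulate(near.pose, target), parent=near, skill=name)
                tree.append(new)
                if dist(new.pose, goal) < tol:
                    return backtrack(new)
                break
    return None  # no skill sequence found within the iteration budget

def backtrack(node: Node) -> list[str]:
    # Recover the ordered skill sequence from goal node back to the root.
    plan = []
    while node.parent is not None:
        plan.append(node.skill)
        node = node.parent
    return plan[::-1]

if __name__ == "__main__":
    plan = skill_rrt(Pose(0.1, 0.1, 0.0), Pose(0.9, 0.9, 0.0))
    print("skill sequence:", plan)
```

In the full method, each tree edge would additionally require a connector policy to move the robot between consecutive skills; under lazy planning, those connectors are only trained for transitions that actually appear in found plans.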
Overview
Real-World Experiments