Pre and Post-Contact Policy Decomposition for Non-Prehensile Manipulation with Zero-Shot Sim-To-Real Transfer

Minchan Kim, Junhyek Han, Jaehyung Kim, and Beomjoon Kim

Intelligent Mobile Manipulation Lab

KAIST

Abstract

We present a system for non-prehensile manipulation problems that require a significant number of contact mode transitions and the use of environmental contacts to successfully manipulate an object to a target location. Our method is based on deep reinforcement learning, which, unlike state-of-the-art planning algorithms, does not require a priori knowledge of the physical parameters of the object or environment, such as friction coefficients or centers of mass. Planning time is reduced to the feed-forward prediction time of a neural network. We propose a computational structure, action space design, and curriculum learning scheme that facilitate efficient exploration and sim-to-real transfer. In challenging real-world non-prehensile manipulation tasks, we show that our method generalizes across different objects and succeeds even on novel objects not seen during training.

Overview

We train our key point detector, pre-contact policy, and post-contact policy entirely in simulation, and transfer them to the real world without additional training.
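To make the three-part decomposition concrete, the following is a minimal sketch of how the components could be chained at run time. It is an illustration only, not the implementation used in this work: the 1-D `ToyWorld`, the function names, the hand-written rules standing in for the learned networks, and all numeric constants are hypothetical. It assumes the pre-contact policy picks where to make initial contact and the post-contact policy then issues actions in closed loop.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyWorld:
    """Hypothetical 1-D stand-in for the simulator or the real robot."""
    object_x: float
    hand_x: float = 0.0

    def observe(self) -> float:
        # Placeholder for a camera observation.
        return self.object_x

    def move_hand(self, target_x: float) -> None:
        self.hand_x = target_x

    def push(self, delta: float) -> None:
        # Toy assumption: hand and object move together while in contact.
        self.hand_x += delta
        self.object_x += delta


def detect_keypoints(observation: float) -> List[float]:
    # Stand-in for the learned key point detector.
    return [observation]


def pre_contact_policy(keypoints: List[float], goal: float) -> float:
    # Stand-in for the learned pre-contact policy: choose a contact
    # location on the side of the object opposite the goal.
    x = keypoints[0]
    return x - 0.05 if goal >= x else x + 0.05


def post_contact_policy(keypoints: List[float], goal: float) -> float:
    # Stand-in for the learned post-contact policy: a bounded push
    # toward the goal, recomputed from fresh key points at every step.
    error = goal - keypoints[0]
    return max(-0.1, min(0.1, error))


def run_episode(world: ToyWorld, goal: float,
                max_steps: int = 100, tol: float = 1e-3) -> Optional[int]:
    """Run one episode; return the step at which the goal was reached, or None."""
    keypoints = detect_keypoints(world.observe())
    world.move_hand(pre_contact_policy(keypoints, goal))  # pre-contact phase
    for step in range(max_steps):                          # post-contact phase
        keypoints = detect_keypoints(world.observe())
        if abs(goal - keypoints[0]) < tol:
            return step
        world.push(post_contact_policy(keypoints, goal))
    return None


if __name__ == "__main__":
    world = ToyWorld(object_x=0.0)
    print(run_episode(world, goal=0.35), world.object_x)
```

Because each policy call is a single function evaluation, the per-step cost in this sketch is fixed, which mirrors the point in the abstract that planning reduces to feed-forward prediction.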

Experiments

Bump domain (light sponge, 2x speed)

Bump domain (using hand, 1x speed)

Bump domain (disturbed by human, 1x speed)

Wall domain (painted wood, 1x speed)

Wall domain (emergent behavior, 1x speed)

Wall domain (painted wood, 1x speed)

Card domain (wood, 2x speed)

Card domain (disturbed by human, 1x speed)