RLAfford: End-to-End Affordance Learning for Robotic Manipulation
Yiran Geng∗, Boshi An∗, Haoran Geng, Yuanpei Chen, Yaodong Yang†, Hao Dong†
* indicates equal contribution; † indicates corresponding authors
Video
Introduction Video
Experiment Results
Abstract
Learning to manipulate 3D objects in an interactive environment has been a challenging problem in Reinforcement Learning (RL). In particular, it is hard to train a policy that can generalize over objects with different semantic categories, diverse shape geometry, and versatile functionality. Recently, the technique of visual affordance has shown great promise in providing object-centric information priors with effective actionable semantics. For example, an effective policy can be trained to open a door by knowing how to exert force on the handle. However, learning the affordance often requires human-defined action primitives, which limits the range of applicable tasks. In this study, we take advantage of visual affordance by using the contact information generated during the RL training process to predict contact maps of interest. This contact prediction process leads to an end-to-end affordance learning framework that can generalize over different types of manipulation tasks. Surprisingly, the effectiveness of this framework holds even in multi-stage and multi-agent scenarios. We tested our method on eight types of manipulation tasks. Results show that our method outperforms baseline algorithms, including visual-affordance methods and RL methods, by a large margin in success rate.
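The core idea in the abstract — turning the contacts that actually occur during RL rollouts into a soft, per-point affordance (contact) map on the object point cloud — can be sketched as below. This is an illustrative simplification with assumed names and hyperparameters (`update_contact_map`, the Gaussian width `sigma`, the EMA factor `momentum`), not the paper's exact implementation:

```python
import numpy as np

def update_contact_map(points, contact_points, prev_map, sigma=0.05, momentum=0.9):
    """Aggregate contacts observed in RL rollouts into a running affordance map.

    points:         (N, 3) object point cloud
    contact_points: (M, 3) gripper-object contact positions from recent rollouts
    prev_map:       (N,) running soft contact map, values in [0, 1]
    """
    if len(contact_points) == 0:
        return prev_map
    # Distance from every cloud point to its nearest observed contact
    d = np.linalg.norm(points[:, None, :] - contact_points[None, :, :], axis=-1)
    nearest = d.min(axis=1)
    # Soft label: points near a contact receive a high affordance score
    target = np.exp(-0.5 * (nearest / sigma) ** 2)
    # Exponential moving average across training iterations
    return momentum * prev_map + (1.0 - momentum) * target
```

In the full framework, a point-cloud network would be trained to regress such maps and its predictions fed back to the policy as an observation, closing the end-to-end loop.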
Pipeline
Complete Quantitative Results
We list the quantitative results of our method along with the baselines and ablations. The number and subscript denote the average value and variance over 8 seeds. The variance of the average success rate (ASR) is computed over the ASRs of different objects; the variance of the Master Percentage (MP) is the variance averaged over different seeds.
Single-stage Tasks
Multi-stage Task
(Pick-and-Place)
Multi-agent Task
(Dual-Arm-Push)
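One plausible reading of the aggregation described above can be written out explicitly. The numbers below are synthetic placeholders, and the paper's exact per-seed bookkeeping may differ:

```python
import numpy as np

# Synthetic success rates: 8 seeds (rows) x 4 objects (columns).
success = np.array([
    [0.90, 0.75, 0.60, 0.80],
    [0.85, 0.70, 0.65, 0.78],
    [0.88, 0.72, 0.58, 0.82],
    [0.91, 0.74, 0.62, 0.79],
    [0.87, 0.71, 0.61, 0.81],
    [0.89, 0.73, 0.59, 0.80],
    [0.86, 0.76, 0.63, 0.77],
    [0.90, 0.72, 0.60, 0.83],
])

# ASR: mean over seeds per object, with variance taken across objects.
asr_per_object = success.mean(axis=0)              # shape (4,)
asr, asr_var = asr_per_object.mean(), asr_per_object.var()

# MP-style variance: per-object variance over seeds, then averaged.
mp_var = success.var(axis=0).mean()
```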
More Affordance Map Results
Affordance Map of Close Door
Affordance Map of Open Door
Affordance Map of Push Drawer
Affordance Map of Pull Drawer
Affordance Map of Dual Arm Push
Affordance Map of Open Pot Lid
Affordance Map of Push Stapler
Affordance Map of Pick and Place
Reward Design
To train effective policies in the simulator, we designed a reward function for each task. Experiments show that good reward design significantly improves the success rate. Below are detailed descriptions of the reward functions.
Close Door
Close Drawer
Open Door
Open Drawer
Push Stapler
Open Pot
Push Chair
Pick and Place
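As an illustration of the kind of shaped reward these tasks use, a Close-Door reward might combine a term for approaching the handle with a term for reducing the door's opening angle. The function below is a hypothetical sketch; the names and coefficients are assumptions, not the actual weights used in training:

```python
import numpy as np

def close_door_reward(ee_pos, handle_pos, door_angle, task_weight=5.0):
    """Illustrative shaped reward for a Close-Door task (assumed form).

    ee_pos:     (3,) end-effector position
    handle_pos: (3,) door-handle position
    door_angle: opening angle in radians (0 means fully closed)
    """
    approach = -np.linalg.norm(ee_pos - handle_pos)  # pull gripper toward handle
    task = -abs(door_angle)                          # drive the door toward closed
    return approach + task_weight * task
```

Analogous combinations of approach and task-progress terms would cover the other tasks, with the task term swapped for drawer displacement, lid height, and so on.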