GAMMA: Generalizable Articulation Modeling and Manipulation for Articulated Objects

Abstract

Articulated objects like cabinets and doors are widespread in daily life. However, directly manipulating 3D articulated objects is challenging because they have diverse geometrical shapes, semantic categories, and kinematic constraints. Prior works mostly focused on recognizing and manipulating articulated objects with specific joint types: they either estimate the joint parameters or identify suitable grasp poses to facilitate trajectory planning. Although these approaches have succeeded on certain types of articulated objects, they lack generalizability to unseen objects, which significantly impedes their application in broader scenarios. In this paper, we propose a novel framework for Generalizable Articulation Modeling and Manipulation for Articulated Objects (GAMMA), which learns both articulation modeling and grasp pose affordance from diverse articulated objects across different categories. In addition, GAMMA adopts adaptive manipulation to iteratively reduce modeling errors and enhance manipulation performance. We train GAMMA on the PartNet-Mobility dataset and evaluate it with comprehensive experiments in the SAPIEN simulator and on a real-world Franka robot. Results show that GAMMA significantly outperforms SOTA articulation modeling and manipulation algorithms on unseen and cross-category articulated objects.

Video

Framework

We collect RGB-D images of articulated objects, such as a cabinet, to generate point clouds. The articulation modeling block segments the articulated parts and estimates the joint parameters. The grasp pose affordance block estimates the actionability of each grasp pose and chooses the ideal ones. In adaptive manipulation, the articulation model provides open-loop trajectory planning, and we iteratively update the joint parameters with the actually executed trajectory to improve modeling accuracy and grasping success rate.
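To make the adaptive update concrete, below is a minimal numpy sketch of how joint parameters could be re-fitted from the gripper's actually executed waypoints. The function names and the least-squares circle fit are our own illustrative assumptions, not GAMMA's exact update rule.

```python
import numpy as np

def fit_circle_center(points, normal):
    """Least-squares (Kasa) circle center of points projected onto the
    plane orthogonal to `normal`. Illustrative helper, not from the paper."""
    # Build an orthonormal basis (u, v) of the plane.
    u = np.cross(normal, [1.0, 0.0, 0.0])
    if np.linalg.norm(u) < 1e-6:
        u = np.cross(normal, [0.0, 1.0, 0.0])
    u /= np.linalg.norm(u)
    v = np.cross(normal, u)
    mean = points.mean(axis=0)
    p2d = (points - mean) @ np.stack([u, v], axis=1)  # (N, 2) plane coords
    # Solve |p|^2 = 2a*x + 2b*y + c for the center (a, b).
    A = np.c_[2.0 * p2d, np.ones(len(p2d))]
    b = (p2d ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mean + sol[0] * u + sol[1] * v

def refit_joint_from_trajectory(waypoints, joint_type):
    """Re-estimate joint parameters from executed end-effector waypoints.

    `waypoints` is an (N, 3) array of gripper positions recorded while
    pulling or pushing the part. Returns (axis direction, axis origin).
    """
    centered = waypoints - waypoints.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)  # PCA of motion
    if joint_type == "prismatic":
        # A drawer moves along a line: the dominant direction is the axis.
        axis, origin = vt[0], waypoints.mean(axis=0)
    else:  # revolute
        # A door handle traces a planar arc: the plane normal is the hinge
        # direction, and the arc's center lies on the hinge.
        axis = vt[2]
        origin = fit_circle_center(waypoints, axis)
    return axis / np.linalg.norm(axis), origin
```

In practice one would blend the re-fitted parameters with the current estimate rather than replacing them outright, so that a short or noisy trajectory cannot derail the model.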

Qualitative Results 

Simulation Articulated Object Modeling

We implement ANCSH and GAMMA to model articulated objects from 7 categories. The first row shows images from the simulation environment. The second and third rows show the segmentation results of articulated parts, marked as blue, green, and dark green points; each color represents a separately modeled articulated part. The red arrow and dot denote the estimated joint axis direction and origin position.
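A joint estimate here consists of an axis direction and an origin point on the axis. The small numpy sketch below shows one common way to score such an estimate against ground truth, via the angular error between axis directions and the point-to-line distance of the origin; these conventions are our assumptions rather than the paper's exact evaluation protocol.

```python
import numpy as np

def joint_axis_errors(axis_pred, origin_pred, axis_gt, origin_gt):
    """Compare an estimated joint axis against ground truth.

    Returns the angular error between the axis directions (degrees) and
    the distance from the predicted origin to the ground-truth axis line.
    """
    a_p = axis_pred / np.linalg.norm(axis_pred)
    a_g = axis_gt / np.linalg.norm(axis_gt)
    # Axes are undirected, so take the smaller of the two possible angles.
    cos = np.clip(abs(a_p @ a_g), -1.0, 1.0)
    angle_deg = np.degrees(np.arccos(cos))
    # Point-to-line distance: remove the component along the GT axis.
    diff = origin_pred - origin_gt
    dist = np.linalg.norm(diff - (diff @ a_g) * a_g)
    return angle_deg, dist
```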

Simulation Part-aware Grasp Pose Affordance

We present qualitative results of part-aware grasp pose affordance in the simulation environment. The first row displays images from the simulation environment, while the second row depicts part-aware grasp affordance results for the articulated parts, represented by a gradient from red to green: the closer to green, the better the grasp pose affordance.
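As an illustration of this visualization, the hedged snippet below maps per-grasp affordance scores to a red-to-green gradient and selects the highest-scoring grasps. The linear colormap and top-k selection are our own assumptions, not necessarily the paper's exact rendering or selection rule.

```python
import numpy as np

def affordance_to_rgb(scores):
    """Map affordance scores in [0, 1] to a red-to-green gradient
    (simple linear blend; red = low, green = high affordance)."""
    s = np.clip(np.asarray(scores, dtype=float), 0.0, 1.0)
    return np.stack([1.0 - s, s, np.zeros_like(s)], axis=-1)

def select_best_grasps(grasp_poses, scores, k=5):
    """Pick the k grasp poses with the highest predicted actionability."""
    idx = np.argsort(scores)[::-1][:k]
    return [grasp_poses[i] for i in idx], idx
```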

Simulation Manipulation Experiments

Pushing door

Pushing drawer

Pulling door

Pulling drawer

We present the qualitative manipulation performance of GAMMA on four tasks: pushing doors, pushing drawers, pulling doors, and pulling drawers. In all of these tasks, GAMMA demonstrates strong generalizability and achieves a high success rate.
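To show how the estimated joint parameters translate into these motions, here is a minimal numpy sketch of open-loop waypoint generation for a revolute (door) or prismatic (drawer) joint. All names and the fixed step count are illustrative assumptions; as described in the framework above, GAMMA additionally refines the joint estimate during execution rather than following these waypoints blindly.

```python
import numpy as np

def plan_open_loop_waypoints(grasp_pos, axis, origin, joint_type,
                             target_motion, n_steps=20):
    """Generate gripper waypoints for opening an articulated part.

    - prismatic (drawer): translate the grasp point along the joint axis
      by `target_motion` meters.
    - revolute (door): rotate the grasp point about the hinge axis by
      `target_motion` radians (Rodrigues' rotation formula).
    """
    axis = axis / np.linalg.norm(axis)
    steps = np.linspace(0.0, target_motion, n_steps + 1)[1:]
    if joint_type == "prismatic":
        return [grasp_pos + s * axis for s in steps]
    # Revolute: rotate the grasp point about the hinge line.
    waypoints = []
    r = grasp_pos - origin
    for theta in steps:
        c, s = np.cos(theta), np.sin(theta)
        rotated = (r * c + np.cross(axis, r) * s
                   + axis * (axis @ r) * (1.0 - c))
        waypoints.append(origin + rotated)
    return waypoints
```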

Real-world Articulated Object Modeling

We present qualitative modeling results on unseen real-world articulated objects. The results demonstrate that the model trained on simulated data is capable of modeling articulated objects from real-world point clouds.

Real-world Part-aware Grasp Pose Affordance

We present qualitative part-aware grasp affordance results on unseen real-world articulated objects. The results demonstrate that the model trained on simulated data is capable of predicting affordance on real-world point clouds.

Real-world Manipulation Experiments

Open door on microwave

Open drawer on cabinet

Open door on cabinet

We present the qualitative manipulation performance on unseen real-world articulated objects. The videos are accelerated 6 times; they show that GAMMA can consistently and smoothly perform tasks such as pulling doors and drawers.