GAMMA: Generalizable Articulation Modeling and Manipulation for Articulated Objects
Abstract
Articulated objects such as cabinets and doors are widespread in daily life. However, directly manipulating 3D articulated objects is challenging because they have diverse geometric shapes, semantic categories, and kinematic constraints. Prior work has mostly focused on recognizing and manipulating articulated objects with specific joint types, either estimating the joint parameters or identifying suitable grasp poses to facilitate trajectory planning. Although these approaches succeed on certain types of articulated objects, they lack generalizability to unseen objects, which significantly impedes their application in broader scenarios. In this paper, we propose Generalizable Articulation Modeling and Manipulation for Articulated Objects (GAMMA), a novel framework that learns both articulation modeling and grasp pose affordance from diverse articulated objects across different categories. In addition, GAMMA adopts adaptive manipulation to iteratively reduce modeling errors and enhance manipulation performance. We train GAMMA on the PartNet-Mobility dataset and evaluate it with comprehensive experiments in the SAPIEN simulator and on a real-world Franka robot. Results show that GAMMA significantly outperforms SOTA articulation modeling and manipulation algorithms on unseen and cross-category articulated objects.
Video
Framework
We collect RGB-D images of articulated objects, such as a cabinet, to generate point clouds. The articulation modeling block segments the articulated parts and estimates the joint parameters. The grasp pose affordance block estimates the actionability of each candidate grasp pose and selects the best ones. During adaptive manipulation, the articulation model provides an open-loop trajectory plan, and we iteratively update the joint parameters with the actually executed trajectory to improve modeling accuracy and the grasping success rate.
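The adaptive manipulation loop above can be sketched as follows. This is a minimal illustration of the idea rather than the paper's implementation: the prismatic-drawer setting, the toy `execute` environment (which stands in for a real robot whose grasped part is constrained to the true joint axis), and all function names are our own assumptions.

```python
import numpy as np

def plan_waypoints(start, axis, step, n):
    """Open-loop plan: move the gripper along the estimated joint axis."""
    return [start + axis * step * (i + 1) for i in range(n)]

def execute(start, waypoint, true_axis):
    """Toy environment: the grasped part is constrained to the true axis,
    so the gripper ends up at the commanded motion projected onto it."""
    d = waypoint - start
    return start + true_axis * float(d @ true_axis)

def refine_axis(traj):
    """Re-estimate the prismatic axis from the executed trajectory
    (dominant direction of the observed displacements)."""
    disp = np.diff(np.asarray(traj), axis=0)
    v = disp.sum(axis=0)
    return v / np.linalg.norm(v)

true_axis = np.array([1.0, 0.0, 0.0])          # unknown to the planner
est_axis = np.array([0.8, 0.6, 0.0])           # noisy initial estimate
est_axis /= np.linalg.norm(est_axis)
pos = np.zeros(3)
traj = [pos]

for _ in range(3):  # adaptive manipulation iterations
    for wp in plan_waypoints(pos, est_axis, step=0.02, n=5):
        pos = execute(pos, wp, true_axis)
        traj.append(pos)
    est_axis = refine_axis(traj)  # update joint parameters from actual motion

print(np.round(est_axis, 3))  # converges to the true axis [1. 0. 0.]
```

Because every executed step is constrained by the true joint, the observed trajectory carries more reliable kinematic information than the initial perception estimate, which is why iterating plan-execute-refine shrinks the modeling error.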
Qualitative Results
Simulation Articulated Objects Modeling
We implement ANCSH and GAMMA to model articulated objects across 7 categories. The first row shows images from the simulation environment. The second and third rows show segmentation results of the articulated parts, marked as blue, green, and dark green points; each color represents a separately modeled articulated part. The red arrow and dot denote the estimated joint axis direction and origin position.
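The joint parameters visualized above (an axis direction and an origin on the axis) fully determine a revolute part's motion. As an illustrative sketch of our own (not code from the paper), Rodrigues' rotation formula maps an estimated axis, origin, and joint angle to the transformed part points:

```python
import numpy as np

def rotate_about_joint(points, axis, origin, angle):
    """Rotate part points about a revolute joint given by a unit axis
    direction and a point on the axis, via Rodrigues' formula."""
    k = axis / np.linalg.norm(axis)
    p = points - origin
    cos, sin = np.cos(angle), np.sin(angle)
    rotated = (p * cos
               + np.cross(k, p) * sin
               + k * (p @ k)[:, None] * (1 - cos))
    return rotated + origin

# A point on a door panel 0.5 m from a vertical hinge at the world origin.
pts = np.array([[0.5, 0.0, 0.0]])
out = rotate_about_joint(pts, axis=np.array([0.0, 0.0, 1.0]),
                         origin=np.zeros(3), angle=np.pi / 2)
print(np.round(out, 3))  # the point swings to [0, 0.5, 0]
```

The same parameterization with a translation instead of a rotation covers prismatic joints such as drawers, which is why axis direction and origin are the quantities worth estimating.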
Simulation Part-aware Grasp Pose Affordance
We present qualitative results of part-aware grasp pose affordance in the simulation environment. The first row displays images from the simulation environment, while the second row depicts part-aware grasp affordance results for the articulated parts, represented by a gradient from red to green; poses closer to green have higher affordance.
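Choosing grasp poses from such an affordance map amounts to ranking candidates by their predicted score and keeping the best ones. A minimal sketch, with hypothetical pose labels and scores of our own:

```python
import numpy as np

def select_grasps(poses, scores, k=2):
    """Rank candidate grasp poses by predicted affordance score
    (higher is better, i.e. 'closer to green') and keep the top k."""
    order = np.argsort(scores)[::-1]
    return [poses[i] for i in order[:k]]

poses = ["handle-top", "handle-center", "panel-edge", "panel-face"]
scores = np.array([0.71, 0.93, 0.42, 0.10])
print(select_grasps(poses, scores))  # ['handle-center', 'handle-top']
```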
Simulation Manipulation Experiments
Pushing door
Pushing drawer
Pulling door
Pulling drawer
We present qualitative manipulation results of GAMMA on four tasks: pushing doors, pushing drawers, pulling doors, and pulling drawers. Across all four tasks, GAMMA demonstrates strong generalizability and achieves a high success rate.
Real-world Articulated Objects Modeling
We present qualitative modeling results on unseen real-world articulated objects. The results demonstrate that the model trained on simulated data can model articulated objects from real-world point clouds.
Real-world Part-aware Grasp Pose Affordance
We present qualitative part-aware grasp affordance results on unseen real-world articulated objects. The results demonstrate that the model trained on simulated data can predict affordance on real-world point clouds.
Real-world Manipulation Experiments
Open door on microwave
Open drawer on cabinet
Open door on cabinet
We present qualitative manipulation results on unseen real-world articulated objects. The video is played at 6x speed and shows that GAMMA can consistently and smoothly perform tasks such as pulling doors and drawers.