Grasping is among the most fundamental and long-standing problems in robotics research. This paper studies the problem of 6-DoF (degree-of-freedom) grasping by a parallel gripper in a cluttered scene captured using a commodity depth sensor from a single viewpoint. We address the problem in a learning-based framework. At a high level, we rely on a single-shot grasp proposal network, trained with synthetic data and tested in real-world scenarios. Our single-shot neural network architecture can predict amodal grasp proposals efficiently and effectively. Our training data synthesis pipeline can generate scenes with complex object configurations and leverages an innovative gripper contact model to create dense and high-quality grasp annotations. Experiments in synthetic and real environments demonstrate that the proposed approach can outperform state-of-the-art methods by a large margin.
6-DOF grasps are essential for dexterous object manipulation tasks but are difficult to generate from incomplete shape observations.
Our method is noise-resistant and amodal, i.e., it can make an educated guess of viable grasps from only a partial point cloud captured by a commercial depth sensor.
Existing geometry-based 6-DOF grasping approaches rely on sampling-based methods, which produce sparse grasp candidates and are computationally expensive.
Instead, we propose to directly regress 6-DOF grasps from the entire scene point cloud in one pass.
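To make the terminology concrete, the following is a minimal illustrative sketch, not the network described in this paper. It shows one common way to represent a 6-DoF grasp (three translational plus three rotational degrees of freedom) as a 4x4 rigid transform, and how dense per-point proposals produced in a single pass could be ranked by score. The axis-angle parametrization and the function names `rotvec_to_matrix`, `grasp_to_matrix`, and `select_top_grasps` are assumptions made for illustration only.

```python
import numpy as np


def rotvec_to_matrix(rotvec):
    """Rodrigues' formula: axis-angle vector (3,) -> rotation matrix (3, 3)."""
    rotvec = np.asarray(rotvec, dtype=float)
    angle = np.linalg.norm(rotvec)
    if angle < 1e-12:
        # Zero rotation: return the identity.
        return np.eye(3)
    x, y, z = rotvec / angle
    # Skew-symmetric cross-product matrix of the unit rotation axis.
    K = np.array([[0.0, -z, y],
                  [z, 0.0, -x],
                  [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def grasp_to_matrix(position, rotvec):
    """Pack a 6-DoF grasp (3 translation + 3 rotation) into a 4x4 gripper pose."""
    T = np.eye(4)
    T[:3, :3] = rotvec_to_matrix(rotvec)
    T[:3, 3] = np.asarray(position, dtype=float)
    return T


def select_top_grasps(poses, scores, k):
    """Keep the k highest-scoring of N proposals.

    poses:  (N, 4, 4) array of gripper poses (hypothetical network output).
    scores: (N,) array of grasp-quality scores (hypothetical network output).
    """
    order = np.argsort(scores)[::-1][:k]
    return poses[order], scores[order]
```

In a one-pass setting, every point of the scene cloud would contribute one such pose and score, so ranking replaces the repeated sample-and-evaluate loop of sampling-based pipelines.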