Learning Any-View 6DoF Robotic Grasping in Cluttered Scenes via Neural Surface Rendering

TL;DR: A re-interpretation of robotic grasping as neural surface rendering for learning global and local representations that enable effective any-view grasping.

Our method, NeuGraspNet, takes just a single random-view depth input, encodes the scene in an implicit feature volume, and uses multi-level rendering to select relevant features and predict grasping functions. NeuGraspNet generalizes to random-view mobile manipulation grasping scenarios.

Summary: 

We introduce a novel, fully implicit 6DoF grasp detection method, NeuGraspNet, that re-interprets robotic grasping as surface rendering and predicts high-fidelity grasps from a single, random viewpoint of a scene. Our method exploits a learned implicit geometric scene representation to perform global and local surface rendering. This enables effective grasp candidate generation (using global features) and grasp quality prediction (using local features from a shared feature space). Our local neural surface rendering allows the model to encode the interaction between the robot's end-effector and the objects' surface geometry. NeuGraspNet outperforms existing implicit and semi-implicit baselines in the literature. We demonstrate its real-world applicability with a mobile manipulator robot grasping in open, cluttered spaces: the robot renders the scene, reasons about graspable areas of different objects, and selects grasps likely to succeed without colliding with the environment.
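To make the local rendering idea concrete, below is a minimal PyTorch sketch of per-grasp surface rendering from a learned occupancy field: rays defined in the gripper's frame are marched through the scene, the surface is taken as the first point where occupancy crosses 0.5, and features for the hit points are queried from the shared feature volume. All names here (`render_local_surface`, `occ_fn`, `feat_fn`) and the fixed-step ray marching are illustrative assumptions, not NeuGraspNet's exact implementation.

```python
import torch

def render_local_surface(occ_fn, feat_fn, grasp_R, grasp_t,
                         ray_origins, ray_dirs, n_steps=32, max_depth=0.08):
    """Hypothetical per-grasp surface rendering via occupancy queries.

    occ_fn:  callable, world points (N, 3) -> occupancy probabilities (N,)
    feat_fn: callable, world points (N, 3) -> features (N, C) from the shared volume
    grasp_R: (3, 3) grasp rotation; grasp_t: (3,) grasp translation
    ray_origins, ray_dirs: (R, 3) rays defined in the gripper's local frame
    """
    # Express the gripper-frame rays in world coordinates: p_w = R @ p_l + t.
    origins_w = ray_origins @ grasp_R.T + grasp_t
    dirs_w = ray_dirs @ grasp_R.T

    # Uniformly sample depths along each ray and query occupancy.
    depths = torch.linspace(0.0, max_depth, n_steps)                          # (S,)
    pts = origins_w[:, None, :] + depths[None, :, None] * dirs_w[:, None, :]  # (R, S, 3)
    occ = occ_fn(pts.reshape(-1, 3)).reshape(pts.shape[:2])                   # (R, S)

    # The surface is the first sample where occupancy crosses 0.5 on each ray.
    crossed = occ > 0.5
    hit = crossed.any(dim=1)              # rays that actually hit a surface
    first = crossed.float().argmax(dim=1) # index of the first crossing per ray
    surf_w = pts[torch.arange(pts.shape[0]), first][hit]                      # (H, 3)

    # Query features for the rendered points from the shared volume, and map
    # the points back into the gripper frame for the quality network.
    feats = feat_fn(surf_w)                                                   # (H, C)
    surf_local = (surf_w - grasp_t) @ grasp_R
    return surf_local, feats
```

Expressing the rendered points in the gripper frame is what lets the quality network reason about the gripper-surface interaction directly, independent of where the grasp sits in the scene.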


NeuGraspNet: A single-view 3D Truncated Signed Distance Field (TSDF) grid is processed by a convolutional occupancy network to reconstruct the scene. The occupancy network performs global, scene-level rendering, and the rendered scene is used to generate grasp candidates in SE(3). We re-interpret grasping as rendering of local surface points and query their features from the shared 3D feature volume. The local points, their features, and the 6DoF grasp pose are passed to a grasping PointNet to predict per-grasp quality. NeuGraspNet effectively learns the interaction between the objects' geometry and the gripper to detect high-fidelity grasps.
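The caption above lists the pipeline's stages in order; the sketch below shows one plausible way to wire them together in PyTorch. Every component name and signature is a hypothetical placeholder for the corresponding stage described above, not NeuGraspNet's released code.

```python
import torch
import torch.nn as nn

class NeuGraspPipelineSketch(nn.Module):
    """Illustrative wiring of the described stages (all submodules are assumed)."""

    def __init__(self, encoder, scene_renderer, candidate_sampler,
                 local_renderer, quality_head):
        super().__init__()
        self.encoder = encoder                      # TSDF grid -> shared 3D feature volume
        self.scene_renderer = scene_renderer        # global, scene-level surface rendering
        self.candidate_sampler = candidate_sampler  # rendered surface -> SE(3) candidates
        self.local_renderer = local_renderer        # per-grasp surface points + features
        self.quality_head = quality_head            # PointNet-style per-grasp quality

    def forward(self, tsdf):
        # Encode the single-view TSDF into an implicit feature volume.
        feats = self.encoder(tsdf)

        # Global rendering reconstructs the scene surface, on which
        # 6DoF grasp candidates are sampled.
        scene_points = self.scene_renderer(feats)
        grasp_poses = self.candidate_sampler(scene_points)   # (G, 4, 4) poses in SE(3)

        # Local rendering per candidate: surface points the gripper would
        # interact with, with features queried from the same shared volume.
        scores = []
        for pose in grasp_poses:
            pts, pt_feats = self.local_renderer(feats, pose)
            scores.append(self.quality_head(pts, pt_feats, pose))
        return grasp_poses, torch.stack(scores)
```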

Video demonstration: