6D Pose Estimation Network for Grasping Performance Comparison
Key skills:
ICP / Pointnet++ / Isaac Gym / 6D pose estimation / Pytorch
Motivation
Our lab was working on a follow-up to a state-of-the-art learning-based bin-picking model [1] and looking for ways to improve it. Since the model is trained mainly on analytically generated grasp data, we wanted to compare grasp quality among the model's output, the analytic output, and real-world execution. For this comparison, the robot must track the grasp point precisely, which requires accurate 6D object pose estimation.
Method
We first implement the Pointnet++ architecture [2] to produce a rough initial estimate of the object's 6D pose.
Next, we refine this estimate with ICP (Iterative Closest Point) registration between the point cloud from the depth camera and a point cloud of the object known in advance.
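The ICP refinement step can be sketched as follows. This is a minimal illustrative implementation in NumPy (not the code used in the project), assuming small point clouds so brute-force nearest-neighbour search is acceptable; in practice a library such as Open3D would be used.

```python
import numpy as np

def best_fit_transform(src, dst):
    """Kabsch: least-squares rigid transform mapping src onto dst (N x 3 each)."""
    c_src, c_dst = src.mean(0), dst.mean(0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # fix reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp(source, target, init_R=np.eye(3), init_t=np.zeros(3), iters=50, tol=1e-8):
    """Refine an initial pose estimate by alternating closest-point
    matching with a Kabsch update until the mean error stops improving."""
    R, t = init_R.copy(), init_t.copy()
    prev_err = np.inf
    for _ in range(iters):
        moved = source @ R.T + t
        # brute-force nearest neighbours (fine for small clouds)
        d = np.linalg.norm(moved[:, None, :] - target[None, :, :], axis=-1)
        idx = d.argmin(1)
        err = d[np.arange(len(moved)), idx].mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
        R_step, t_step = best_fit_transform(moved, target[idx])
        R, t = R_step @ R, R_step @ t + t_step  # compose the incremental update
    return R, t, err
```

This is why the coarse Pointnet++ estimate matters: ICP only converges to the correct pose when the initialization is already close.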
When training the model, we use two losses: a rotation loss (cosine-similarity loss on quaternions) and a position loss (MSE loss).
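The two loss terms amount to the following. This sketch is written in NumPy for brevity (the PyTorch version is the same math on tensors); the function names and the equal weighting are illustrative, not the project's exact code.

```python
import numpy as np

def rotation_loss(q_pred, q_gt):
    """Cosine-similarity loss on unit quaternions (B x 4).
    Uses |dot| so that q and -q, which encode the same rotation, give zero loss."""
    q_pred = q_pred / np.linalg.norm(q_pred, axis=-1, keepdims=True)
    q_gt = q_gt / np.linalg.norm(q_gt, axis=-1, keepdims=True)
    dot = np.abs(np.sum(q_pred * q_gt, axis=-1))
    return np.mean(1.0 - dot)

def position_loss(t_pred, t_gt):
    """Plain MSE on the predicted translation (B x 3)."""
    return np.mean((t_pred - t_gt) ** 2)

def pose_loss(q_pred, t_pred, q_gt, t_gt, w_rot=1.0, w_pos=1.0):
    """Weighted sum of the two terms; weights here are placeholders."""
    return w_rot * rotation_loss(q_pred, q_gt) + w_pos * position_loss(t_pred, t_gt)
```

Taking the absolute value of the dot product handles the quaternion double cover, so the network is not penalized for predicting the sign-flipped quaternion of the same rotation.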
You can download the code through this link.
Data Preparation
We used simulation-only training data, generated with the NVIDIA Isaac Gym simulator.
In the simulator, we place objects in random stable poses and render them from random camera poses, so the trained model estimates object pose robustly regardless of the camera viewpoint.
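The random camera placement can be sketched as sampling a position on a hemispherical shell above the scene and building a look-at rotation toward the bin center. This is only the pose-sampling math, not Isaac Gym API code; the function name and the sampling ranges are illustrative.

```python
import numpy as np

def sample_camera_pose(rng, r_min=0.5, r_max=1.0, target=np.zeros(3)):
    """Sample a camera position on a random hemisphere shell above the target
    and build a look-at rotation (OpenGL convention: camera looks along -z)."""
    r = rng.uniform(r_min, r_max)
    theta = rng.uniform(0.0, 2.0 * np.pi)                     # azimuth
    phi = rng.uniform(np.deg2rad(15), np.deg2rad(75))         # polar angle, kept
    eye = target + r * np.array([np.sin(phi) * np.cos(theta), # off the vertical
                                 np.sin(phi) * np.sin(theta), # to avoid a
                                 np.cos(phi)])                # degenerate look-at
    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, np.array([0.0, 0.0, 1.0]))
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)
    R = np.stack([right, up, -forward], axis=1)  # camera axes as columns
    return R, eye
```

Each synthetic scene then renders a depth image from such a pose, paired with the known object pose as the ground-truth label.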
You can download the code through this link.
Result
The model estimates the pose of a given object to within 1 mm mean point-to-point distance from the ground-truth pose.
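The point-to-point error above can be computed ADD-style: transform the object model points by the predicted pose and the ground-truth pose and average the distances between corresponding points. A minimal sketch (the function name `add_error` is ours):

```python
import numpy as np

def add_error(points, R_pred, t_pred, R_gt, t_gt):
    """Mean point-to-point distance between the model points (N x 3)
    transformed by the predicted pose and by the ground-truth pose."""
    pred = points @ R_pred.T + t_pred
    gt = points @ R_gt.T + t_gt
    return np.linalg.norm(pred - gt, axis=1).mean()
```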
References
[1] Mahler, J., Matl, M., Satish, V., Danielczuk, M., DeRose, B., McKinley, S., & Goldberg, K. (2019). Learning ambidextrous robot grasping policies. Science Robotics, 4(26), eaau4984.
[2] Qi, C. R., Yi, L., Su, H., & Guibas, L. J. (2017). Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Advances in neural information processing systems, 30.