FlowBot3D: Learning 3D Articulation Flow to Manipulate Articulated Objects
Ben Eisner*, Harry Zhang*, David Held
In Proceedings, Robotics: Science and Systems (RSS) 2022
Abstract
We explore a novel method to perceive and manipulate 3D articulated objects that generalizes to enable a robot to articulate unseen classes of objects. We propose a vision-based system that learns to predict the potential motions of the parts of a variety of articulated objects, and uses these predictions to guide downstream motion planning to articulate the objects. To predict the object motions, we train a network to output a dense vector field representing the point-wise motion direction of the points in the point cloud under articulation. The system then deploys an analytical motion planning policy based on this vector field to achieve a grasp that yields maximum articulation. We train the vision system entirely in simulation, and we demonstrate the capability of our system to generalize to unseen object instances and novel categories by testing our network in both simulation and the real world, deployed on a Sawyer robot without retraining.
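As one concrete illustration of the dense vector field mentioned above, ground-truth flow labels for a revolute joint can be generated in simulation by rotating the moving part's points by a small angle about the joint axis and taking the per-point displacement as the flow. The sketch below is a minimal, illustrative version of that idea (it is not the authors' code; all names are placeholders):

```python
import numpy as np

def revolute_flow(points, axis_origin, axis_dir, delta_theta=0.01):
    """Per-point 3D articulation flow for points (N, 3) on a part that
    rotates about a revolute joint.

    axis_origin: a point on the joint axis, shape (3,)
    axis_dir: direction of the joint axis, shape (3,)
    delta_theta: small articulation step in radians (illustrative value)
    """
    k = axis_dir / np.linalg.norm(axis_dir)     # unit joint axis
    r = points - axis_origin                    # vectors from axis origin to points
    # Rodrigues' rotation formula, applied to every point at once.
    rotated = (r * np.cos(delta_theta)
               + np.cross(k, r) * np.sin(delta_theta)
               + np.outer(r @ k, k) * (1.0 - np.cos(delta_theta)))
    # Flow = displaced position minus current position.
    return (axis_origin + rotated) - points
```

A prismatic joint is even simpler: every point on the moving part would receive the same flow vector, proportional to the joint's translation axis.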
A collage of FlowBot3D in action (videos shown at 5x speed)
RSS Presentation Video
System Pipeline
FlowBot3D System Overview. Our deployed system has two phases: the Grasp-Selection phase and the Articulation-Execution phase. The dark red dots represent the predicted location of each point, and the light red lines represent the flow vectors connecting the current time step's points to the predicted points (the flow vectors are downsampled for visual clarity). In the Grasp-Selection phase, the agent observes the environment as point cloud data, which is post-processed and fed into ArtFlowNet, which predicts per-point 3D flow vectors. The system then selects the point with the maximum flow vector magnitude and uses motion planning to make suction contact with that point. In the Articulation-Execution phase, after making suction contact with the chosen argmax point, the system iteratively observes the point cloud and predicts the 3D flow vectors; at each step, the motion planning module guides the robot to follow the direction of the maximum observable flow vector, repeatedly articulating the object of interest.
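To make the control flow concrete, here is a minimal sketch of the two-phase policy described above. The environment interface (`observe_point_cloud`, `move_to`, `suction_on`, `step_towards`) and the `artflownet` callable are hypothetical placeholders standing in for the perception, prediction, and motion planning components, not the actual system's API:

```python
import numpy as np

def flowbot3d_policy(env, artflownet, n_steps=50, step_size=0.01):
    # --- Grasp-Selection phase ---
    points = env.observe_point_cloud()           # (N, 3) observed point cloud
    flow = artflownet(points)                    # (N, 3) predicted flow vectors
    grasp_idx = np.argmax(np.linalg.norm(flow, axis=1))
    env.move_to(points[grasp_idx])               # motion-plan to the argmax point
    env.suction_on()                             # make suction contact

    # --- Articulation-Execution phase ---
    for _ in range(n_steps):
        points = env.observe_point_cloud()       # re-observe after each step
        flow = artflownet(points)                # re-predict the flow field
        magnitudes = np.linalg.norm(flow, axis=1)
        best = np.argmax(magnitudes)
        if magnitudes[best] < 1e-6:              # no predicted motion remains
            break
        direction = flow[best] / magnitudes[best]
        env.step_towards(direction * step_size)  # follow the max-flow direction
```

The key design choice this reflects is that the policy is closed-loop: the flow field is re-predicted after every small gripper motion, so the commanded direction tracks the part's changing kinematics as it opens.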