FlowBot3D: Learning 3D Articulation Flow to Manipulate Articulated Objects

Ben Eisner*, Harry Zhang*, David Held

In Proceedings, Robotics: Science and Systems (RSS) 2022

Paper Link

Code (GitHub)



We explore a novel method to perceive and manipulate 3D articulated objects that generalizes to enable a robot to articulate unseen classes of objects. We propose a vision-based system that learns to predict the potential motions of the parts of a variety of articulated objects, guiding the system's downstream motion planning to articulate those objects. To predict the object motions, we train a network to output a dense vector field representing the point-wise motion direction of the points in the point cloud under articulation. The system then deploys an analytical motion-planning policy based on this vector field to achieve a grasp that yields maximum articulation. We train the vision system entirely in simulation. We then demonstrate the system's ability to generalize to unseen object instances and novel categories by testing our network in both simulation and the real world, deployed on a Sawyer robot without retraining.
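As a concrete illustration of the dense vector field described above, the sketch below shows one common way to construct ground-truth per-point flow for a revolute joint: rotate the points of the moving part by a small angle about the articulation axis and take the per-point displacement. The function name, signature, and the small-angle value are assumptions for illustration, not the paper's actual data-generation code.

```python
import numpy as np

def ground_truth_flow(points, axis_origin, axis_dir, part_mask, dtheta=0.01):
    """Illustrative ground-truth 3D articulation flow for a revolute joint.

    Flow = displacement of the moving part's points under a small rotation
    `dtheta` about the joint axis; static points get zero flow.
    (Hypothetical helper -- a sketch, not the paper's implementation.)
    """
    axis_dir = axis_dir / np.linalg.norm(axis_dir)
    rel = points - axis_origin
    c, s = np.cos(dtheta), np.sin(dtheta)
    # Rodrigues' rotation formula, vectorized over all points.
    dot = rel @ axis_dir
    rotated = rel * c + np.cross(axis_dir, rel) * s + np.outer(dot, axis_dir) * (1 - c)
    flow = np.zeros_like(points)
    flow[part_mask] = (rotated + axis_origin - points)[part_mask]
    return flow
```

For a point at (1, 0, 0) rotating about the z-axis through the origin, the resulting flow points approximately along +y, tangent to the part's circular motion, which is exactly the motion direction the network is trained to predict.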


A collage of FlowBot3D in action (videos shown at 5x speed)

RSS Presentation Video

System Pipeline

FlowBot3D System Overview. Our system in deployment has two phases: the Grasp-Selection phase and the Articulation-Execution phase. The dark red dots represent the predicted location of each point, and the light red lines represent the flow vectors connecting the current time step's points to the predicted points; the flow vectors are downsampled for visual clarity. In the Grasp-Selection phase, the agent observes the environment as point cloud data, which is post-processed and fed into ArtFlowNet to predict per-point 3D flow vectors. The system then chooses the point with the maximum flow magnitude and uses motion planning to make suction contact with that point. In the Articulation-Execution phase, after making suction contact with the chosen argmax point, the system iteratively observes the point cloud and predicts the 3D flow vectors; in this phase, the motion-planning module guides the robot to repeatedly follow the direction of the maximum observable flow vector and articulate the object of interest.
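The two phases above can be sketched as two small functions: grasp selection picks the argmax-magnitude flow point, and each execution step converts the current maximum flow vector into a small end-effector displacement. The function names, the `step_size` parameter, and the return conventions are illustrative assumptions, not the released FlowBot3D code.

```python
import numpy as np

def select_grasp_point(flow):
    """Grasp-Selection phase (sketch): return the index of the point
    whose predicted flow vector has the largest magnitude."""
    mags = np.linalg.norm(flow, axis=1)
    return int(np.argmax(mags))

def articulation_step(flow, step_size=0.01):
    """Articulation-Execution phase (sketch): one iteration.

    Follow the unit direction of the maximum observable flow vector,
    scaled to a small end-effector displacement (hypothetical step size).
    """
    mags = np.linalg.norm(flow, axis=1)
    i = int(np.argmax(mags))
    return step_size * flow[i] / mags[i]

# Toy example: the second point has the largest predicted flow,
# so it is grasped, and the step moves along its unit direction.
flow = np.array([[0.0, 0.0, 0.1],
                 [0.0, 1.0, 0.0]])
grasp_idx = select_grasp_point(flow)
delta = articulation_step(flow, step_size=0.01)
```

In deployment this step function would sit inside a closed loop: observe a fresh point cloud, re-run the flow network, and re-compute the step, so the commanded direction tracks the part as it articulates.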

Interactive 3D Articulation Flow (3D AF) Prediction Visualization Tool

Three-dimensional visualization of the 3D AF prediction on the point cloud data of a complete door-opening rollout. Each red vector represents the predicted motion direction and location of the corresponding blue point. Click the square to pause, then drag to view the vectors from different angles. To resume, click the "play" button.