Planning with Spatial-Temporal Abstraction from Point Clouds for Deformable Object Manipulation

Xingyu Lin*, Carl Qi*, Yunchu Zhang, Zhiao Huang, Katerina Fragkiadaki, Yunzhu Li, Chuang Gan, David Held

(*equal contribution)

In Conference on Robot Learning (CoRL) 2022

Abstract

Effective planning of long-horizon deformable object manipulation requires suitable abstractions at both the spatial and temporal levels. Previous methods typically either focus on short-horizon tasks or make strong assumptions that full-state information is available, which prevents their use on deformable objects. In this paper, we propose PlAnning with Spatial-Temporal Abstraction (PASTA), which incorporates both spatial abstraction (reasoning about objects and their relations to each other) and temporal abstraction (reasoning over skills instead of low-level actions). Our framework maps high-dimensional 3D observations such as point clouds into a set of latent vectors and plans over skill sequences on top of the latent set representation. We show that our method can effectively perform challenging sequential deformable object manipulation tasks in the real world, which require combining multiple tool-use skills such as cutting with a knife, pushing with a pusher, and spreading dough with a roller.

CoRL 2022 Presentation

Below is our 1-minute presentation video for CoRL 2022. You can find the full version of the slides here.

332-Qi-PlanningWith SpatialTemporal AbstractionFromPointCloudsForDeformableObjectManipulation.mp4

Supplementary Video

In the video below, we demonstrate our method solving three long-horizon dough manipulation tasks in the real world, using kinetic sand as a replacement for real dough for easier experiments. We also show performance on real dough towards the end of the video. For each task, we show the top-down view on the top left, overlaid with the point cloud subgoal generated by our method. On the bottom left, we show the given target, also from a top-down view. On the top right, we show a side view of our robot switching tools between skills.

Method Overview

Learn skill abstraction from demonstration

Our method starts with a set of demonstration trajectories for each skill, generated in a differentiable simulator, with each trajectory using one tool. We then sample point clouds (pc) from the demonstration trajectories to train our set skill abstraction modules.

Planning and execution

For planning, we first map the point clouds of the observation and the target into latent set representations. We then plan in the latent space to find the skill indexes and the sub-goals. Finally, we map the sub-goals back to point clouds for the policy to execute.

The illustrated task here is to cut a piece of dough into two with a cutter, transport the pieces to the spreading area on the left (with a high-friction surface) using a pusher, and then flatten both pieces with a roller. This is a novel task not seen during training, because only the individual skills are provided during training.
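The planning step described above (searching over skill indexes and latent sub-goals) can be illustrated with a small toy sketch. This is not the PASTA implementation: the latent state here is a flat tuple of floats rather than a set of latent vectors, `toy_feasibility` is a hypothetical hand-written stand-in for a learned feasibility predictor, random sampling of sub-goals replaces whatever optimizer the actual planner uses, and pinning the final sub-goal to the target is an assumption made to keep the example short.

```python
import itertools
import math
import random


def toy_feasibility(skill, z, z_next):
    """Hypothetical stand-in for a learned feasibility predictor.

    In this toy, skill k may change coordinate k by up to 1.0 for free;
    any other change is penalized. Returns a score in (0, 1].
    """
    score = 1.0
    for i, (a, b) in enumerate(zip(z, z_next)):
        delta = abs(b - a)
        penalty = max(0.0, delta - 1.0) if i == skill else delta
        score *= math.exp(-5.0 * penalty)
    return score


def plan(z_obs, z_target, n_skills, horizon, n_samples, feasibility, rng):
    """Search over skill sequences and latent sub-goals.

    Enumerates every skill sequence of length `horizon`, samples random
    intermediate sub-goals, and keeps the plan whose per-step feasibility
    scores have the highest product. The last sub-goal is the target.
    """
    # Sample sub-goals inside the bounding box spanned by start and target.
    lows = [min(a, b) for a, b in zip(z_obs, z_target)]
    highs = [max(a, b) for a, b in zip(z_obs, z_target)]
    best = (-1.0, None, None)
    for skills in itertools.product(range(n_skills), repeat=horizon):
        for _ in range(n_samples):
            subgoals = [
                tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                for _ in range(horizon - 1)
            ]
            subgoals.append(tuple(z_target))
            score, z = 1.0, tuple(z_obs)
            for skill, z_next in zip(skills, subgoals):
                score *= feasibility(skill, z, z_next)
                z = z_next
            if score > best[0]:
                best = (score, skills, subgoals)
    return best


if __name__ == "__main__":
    score, skills, subgoals = plan(
        (0.0, 0.0), (1.0, 1.0), n_skills=2, horizon=2, n_samples=500,
        feasibility=toy_feasibility, rng=random.Random(0),
    )
    print("skill sequence:", skills)
    print("latent sub-goals:", subgoals)
    print("plan score:", round(score, 3))
```

In this toy problem, reaching the target requires changing both coordinates, so a good plan has to use both skills once each, with an intermediate sub-goal in which only one coordinate has moved. In the pipeline described above, each selected sub-goal would then be mapped back to a point cloud and handed to the policy of the corresponding skill.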

Real World Rollouts

Each of the links below leads to a sub-page where we show the videos of all the real-world experiments we have done. These cover three tasks and three methods (our method, our Flat 3D ablation, and a human baseline), each with 5 trials. The quantitative performance of the different methods is shown in the table below.

Click to see more videos: 

Failure Cases

Additionally, we provide failure cases of PASTA execution on real dough. Note that in these failure cases, our planner can still generate successful plans despite the shape difference between the real dough and the dough in simulation. However, because its material properties change quickly, the real dough has dynamics that differ substantially from simulation and is especially hard to manipulate consistently. As a result, the policy sometimes fails to carry out the plan. We provide example failure cases in the subpage linked below.

Click to see videos: [Failure cases on real dough]

Simulation Rollouts

We provide rollouts of PASTA and Flat 3D for the tasks in simulation. Additionally, we overlay the normalized performance (the evaluation metric used in our experiments) on each trajectory to indicate the progress of execution.

Click to see more videos: [Simulation rollouts]

Appendix

CoRL_2022_PASTA_Appendix.pdf