Learning Dense Visual Correspondences in Simulation to Smooth and Fold Real Fabrics

Aditya Ganapathi, Priya Sundaresan, Brijen Thananjeyan, Ashwin Balakrishna,
Daniel Seita, Jennifer Grannen, Minho Hwang, Ryan Hoque,
Joseph E. Gonzalez, Nawid Jamali, Katsu Yamane, Soshi Iba, Ken Goldberg


Robotic fabric manipulation is challenging due to the infinite-dimensional configuration space, self-occlusion, and complex dynamics of fabrics. There has been significant prior work on learning policies for specific deformable manipulation tasks, but comparatively less focus on algorithms that can efficiently learn many different tasks. In this paper, we learn visual correspondences for deformable fabrics across different configurations in simulation and show that this representation can be used to design policies for a variety of tasks. Given a single demonstration of a new task from an initial fabric configuration, the learned correspondences can be used to compute geometrically equivalent actions in a new fabric configuration. This makes it possible to robustly imitate a broad set of multi-step fabric smoothing and folding tasks on multiple physical robotic systems. The resulting policies achieve an 80.3% average task success rate across 10 fabric manipulation tasks on two different robotic systems, the da Vinci surgical robot and the ABB YuMi. Results also suggest robustness to fabrics of various colors, sizes, and shapes.

Video Submission


Descriptor Learning

We first learn fabric descriptors by leveraging point-pair correspondences between fabric in different configurations, building on prior work on dense object descriptors for deformable manipulation, to learn a descriptor space that is invariant to fabric configuration.
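The core lookup enabled by dense descriptors is nearest-neighbor matching: a pixel in one image is matched to the pixel in another image whose descriptor is closest in the learned embedding space. Below is a minimal sketch of that lookup, assuming the descriptor network's outputs are available as NumPy arrays of shape (H, W, C); the function name and array layout are illustrative, not the paper's implementation.

```python
import numpy as np

def best_match(descriptor_map_a, descriptor_map_b, pixel_a):
    """Find the pixel in image B whose descriptor is nearest (L2 distance)
    to the descriptor at `pixel_a` in image A.

    descriptor_map_a, descriptor_map_b: (H, W, C) descriptor images.
    pixel_a: (u, v) pixel coordinates in image A.
    Returns the matched (u, v) in image B and the descriptor distance.
    """
    u, v = pixel_a
    d_a = descriptor_map_a[v, u]                  # (C,) query descriptor
    diff = descriptor_map_b - d_a                 # broadcast over (H, W, C)
    dists = np.linalg.norm(diff, axis=-1)         # (H, W) distance map
    v_b, u_b = np.unravel_index(np.argmin(dists), dists.shape)
    return (u_b, v_b), dists[v_b, u_b]
```

Because the descriptor space is trained to be invariant to fabric configuration, the same physical point on the cloth maps to nearby descriptors even when the fabric is crumpled or folded differently in the two images.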


Example of the learned descriptors across two images of a pink t-shirt. The predicted correspondences are shown on the right.


Example of the learned descriptors across an image of simulated cloth (left) and real cloth (center). The predicted correspondences are shown on the right.


Example of the learned descriptors across two real images of the cloth. The predicted correspondences are shown on the right.


Simulated Fabric Manipulation

We first roll out policies in a Blender simulation environment on square-cloth and T-shirt folding tasks. We find that the descriptors accurately localize correspondences across fabrics of different colors and configurations and can imitate folding sequences in novel fabric configurations from a single provided demonstration.
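Imitating a demonstration in a new configuration reduces to mapping each demonstrated pick and place pixel through the descriptor correspondence. The sketch below illustrates this transfer under the assumption that actions are pixel-space pick/place pairs and descriptor maps are (H, W, C) NumPy arrays; `transfer_action` is a hypothetical name, not the paper's API.

```python
import numpy as np

def transfer_action(demo_desc, new_desc, demo_pick, demo_place):
    """Map a demonstrated pick/place action into a new fabric configuration
    by nearest-neighbor lookup in descriptor space.

    demo_desc, new_desc: (H, W, C) descriptor maps of the demonstration
    image and the new observation.
    demo_pick, demo_place: (u, v) action pixels from the demonstration.
    Returns the geometrically equivalent (pick, place) pixels.
    """
    def lookup(pixel):
        u, v = pixel
        d = demo_desc[v, u]                       # descriptor at the demo pixel
        dists = np.linalg.norm(new_desc - d, axis=-1)
        v_b, u_b = np.unravel_index(np.argmin(dists), dists.shape)
        return (u_b, v_b)

    return lookup(demo_pick), lookup(demo_place)
```

Applied step by step to a multi-step folding demonstration, this yields a full action sequence on the novel configuration without any task-specific retraining.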

Physical Fabric Manipulation

We find that the learned policies transfer effectively to two different physical robotic systems, an ABB YuMi and a da Vinci Research Kit (dVRK), and can successfully perform fabric smoothing and folding tasks in novel configurations on both robots given a single demonstration of each task.