Abstract
State estimation is one of the greatest challenges in cloth manipulation due to cloth's high dimensionality and severe self-occlusion. Prior works identify the full state of crumpled cloth by training a mesh reconstruction model in simulation. However, such models suffer from a sim-to-real gap caused by differences between cloth simulation and the real world. In this work, we propose a self-supervised method to finetune a mesh reconstruction model in the real world. Since the full mesh of crumpled cloth is difficult to obtain in the real world, we design a dedicated data collection scheme and an action-conditioned, model-based cloth tracking method to generate pseudo-labels for self-supervised learning. By finetuning the pretrained mesh reconstruction model on this pseudo-labeled dataset, we show that the quality of the reconstructed mesh improves without requiring human annotations.
Method
Our self-supervised mesh reconstruction pipeline consists of three stages:
Collect a dataset in the real world. The dataset consists of sequences of observations, each starting from a flattened cloth. A human collector uses tweezers to create crumpled configurations through random pick-and-place actions. All actions are recorded.
Generate pseudo-labels with an action-conditioned tracker. We first use a pretrained mesh reconstruction model to estimate the initial flattened mesh. We then track the cloth's motion across frames to obtain the full mesh in crumpled configurations.
Finetune the pretrained mesh reconstruction model on the pseudo-labeled dataset.
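The pseudo-label generation in stage 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: `reconstruct_mesh` and `track_mesh` are hypothetical stand-ins for the pretrained reconstruction model and the action-conditioned tracker, and the rigid-translation tracking rule is an assumption made for the sketch.

```python
import numpy as np

def reconstruct_mesh(point_cloud):
    # Stand-in for the pretrained mesh reconstruction model applied to
    # the flattened cloth; here it just treats observed points as vertices.
    return point_cloud.copy()

def track_mesh(mesh, action):
    # Stand-in for action-conditioned tracking: move vertices near the
    # pick point by the pick-and-place displacement (a crude assumption;
    # the real tracker would use a cloth model and the observed frames).
    pick, place = action
    moved = mesh.copy()
    near = np.linalg.norm(mesh - pick, axis=1) < 0.1
    moved[near] += place - pick
    return moved

def generate_pseudo_labels(initial_cloud, actions):
    # Stage 2: reconstruct the easy (flattened) initial mesh once, then
    # propagate it through each recorded action to label crumpled frames.
    mesh = reconstruct_mesh(initial_cloud)
    labels = [mesh]
    for action in actions:
        mesh = track_mesh(mesh, action)
        labels.append(mesh)
    return labels

rng = np.random.default_rng(0)
cloud = rng.uniform(-0.5, 0.5, size=(100, 3))          # observed points
actions = [(np.zeros(3), np.array([0.2, 0.0, 0.0]))]   # one pick-and-place
labels = generate_pseudo_labels(cloud, actions)
```

Each entry of `labels` pairs a (possibly crumpled) observation with a full mesh estimate, which is exactly the supervision signal used to finetune the reconstruction model in stage 3.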
Visualization of Pseudo Mesh
(Interactive gallery: three trajectories, each showing the pseudo mesh after actions 1, 2, and 3.)