Learning Closed-loop Dough Manipulation Using a Differentiable Reset Module

Carl Qi, Xingyu Lin, David Held

IEEE Robotics and Automation Letters (RA-L), with presentation at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022

[Paper (arXiv)]   [Presentation]   [Code]

Abstract

Deformable object manipulation has many applications such as cooking and laundry folding in our daily lives. Manipulating elastoplastic objects such as dough is particularly challenging because dough lacks a compact state representation and requires contact-rich interactions. We consider the task of flattening a piece of dough into a specific shape from RGB-D images. While the task is seemingly intuitive for humans, there exist local optima for common approaches such as naive trajectory optimization. We propose a novel trajectory optimizer that optimizes through a differentiable "reset" module, transforming a single-stage, fixed-initialization trajectory into a multistage, multi-initialization trajectory where all stages are optimized jointly. We then train a closed-loop policy on the demonstrations generated by our trajectory optimizer. Our policy receives partial point clouds as input, allowing ease of transfer from simulation to the real world. We show that our policy can perform real-world dough manipulation, flattening a ball of dough into a target shape.

Presentation

IROS22_2957.mp4

System overview

(a) Our trajectory optimizer implements a differentiable module that resets the tool to avoid local optima and allows gradient information to propagate through the entire trajectory. (b) We perform imitation learning on the demonstration data generated by the trajectory optimizer. Our policy takes segmented point cloud observations as input. (c) We use the learned policy to control a Sawyer robot to perform closed-loop rolling in the real world.
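To make the idea in (a) concrete, here is a minimal sketch of jointly optimizing a multistage trajectory in which each stage begins with a reset. Everything in it is a hypothetical toy, not the simulator or optimizer used in the paper: the dough is reduced to a 1-D center and half-width, each stage has a reset position and a roll displacement, the contact model is an invented Gaussian falloff, and gradients come from a small hand-written forward-mode autodiff class (`Dual`) standing in for a differentiable simulator.

```python
import math


class Dual:
    """Forward-mode dual number: a value and its derivative w.r.t. one seeded parameter."""

    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.val * other.val, self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def dual_exp(x):
    e = math.exp(x.val)
    return Dual(e, e * x.dot)


def rollout(theta, center0=0.0, half_width0=1.0):
    """Toy multistage rollout; theta = [reset_1, roll_1, reset_2, roll_2, ...] as Duals."""
    center, half_width = Dual.lift(center0), Dual.lift(half_width0)
    for k in range(0, len(theta), 2):
        reset_pos, roll = theta[k], theta[k + 1]
        # Hypothetical contact model: rolling is effective only if the tool
        # is reset close to the current dough center.
        gap = reset_pos - center
        contact = dual_exp(-1.0 * gap * gap)
        # Rolling widens the dough and drags its center (invented dynamics).
        half_width = half_width + 0.5 * contact * roll * roll
        center = center + 0.3 * contact * roll
    return center, half_width


def loss_and_grad(params, target_center, target_half_width):
    """Squared error to the target shape and its gradient w.r.t. all stage parameters."""
    grad = []
    for i in range(len(params)):
        # Seed parameter i with derivative 1 and differentiate the whole rollout,
        # including the later stages that depend on earlier resets.
        theta = [Dual(p, 1.0 if j == i else 0.0) for j, p in enumerate(params)]
        center, half_width = rollout(theta)
        err_c, err_w = center - target_center, half_width - target_half_width
        loss = err_c * err_c + err_w * err_w
        grad.append(loss.dot)
    return loss.val, grad


def optimize(params, target_center=0.0, target_half_width=2.0, lr=0.05, steps=300):
    """Plain gradient descent over reset positions and roll actions of all stages at once."""
    params = list(params)
    history = []
    for _ in range(steps):
        loss, grad = loss_and_grad(params, target_center, target_half_width)
        history.append(loss)
        params = [p - lr * g for p, g in zip(params, grad)]
    final_loss, _ = loss_and_grad(params, target_center, target_half_width)
    history.append(final_loss)
    return params, history


# Two stages: [reset_1, roll_1, reset_2, roll_2]
params, history = optimize([0.2, 0.5, -0.2, -0.5])
```

Because the second stage's contact term depends on the dough center left behind by the first stage, the gradient of the final loss reaches the first stage's reset position and action as well, which is the property the toy is meant to illustrate. `history` holds the loss before each update plus the final loss, so comparing `history[0]` with `history[-1]` shows how far the joint optimization moved.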


Dough rolling with a Sawyer arm

Real-world setup

Quantitative performances

Distance to target: 0-3 cm

Final views and performances

Executions

SAC

Open-loop

Heuristic

Diff-Reset-BC (ours)

output.mp4

Human

output2.mp4

Distance to target: 3-6 cm

Final views and performances

Executions

SAC

Open-loop

Heuristic

Diff-Reset-BC (ours)

output.mp4

Human

output2.mp4

Distance to target: 6-9 cm

Final views and performances

Executions

SAC

Open-loop

Heuristic

Diff-Reset-BC (ours)

output.mp4

Human

output2.mp4

Dough size: small (240 g)

Final views and performances

Executions

SAC

Open-loop

Heuristic

Diff-Reset-BC (ours)

output.mp4

Human

output2.mp4

Dough size: medium (360 g)

Final views and performances

Executions

SAC

Open-loop

Heuristic

Diff-Reset-BC (ours)

output.mp4

Human

output2.mp4

Dough size: large (480 g)

Final views and performances

Executions

SAC

Open-loop

Heuristic

Diff-Reset-BC (ours)

output.mp4

Human

output2.mp4

Dough rolling in simulation

There are 10 configurations above, separated by gray boundaries. Each configuration shows the execution of the BC policy (left) and the expert demonstration from Diff-Reset (right). The number in the top-left corner is the Earth Mover's Distance between the current point cloud and the target point cloud of the dough. The target dough shape is overlaid on the policy's execution.
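For readers unfamiliar with the metric, the following is a minimal sketch of an exact Earth Mover's Distance between two point clouds. It assumes both clouds have the same number of points with uniform weights, in which case the EMD reduces to the mean distance under a minimum-cost one-to-one matching. The function name `emd_equal_size` and the use of the Hungarian algorithm are illustrative choices; this page does not say how the reported numbers were computed, and practical pipelines often use approximate solvers instead.

```python
import math


def emd_equal_size(cloud_a, cloud_b):
    """Exact EMD between two equal-size, uniformly weighted point clouds.

    Solves the minimum-cost perfect matching with the O(n^3) Hungarian
    algorithm and returns the mean matched Euclidean distance.
    """
    n = len(cloud_a)
    if n == 0 or n != len(cloud_b):
        raise ValueError("clouds must be non-empty and of equal size")

    cost = [[math.dist(p, q) for q in cloud_b] for p in cloud_a]

    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials
    v = [0.0] * (n + 1)      # column potentials
    match = [0] * (n + 1)    # match[j] = row (1-based) assigned to column j
    way = [0] * (n + 1)      # back-pointers along the augmenting path

    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        # Flip the assignments along the augmenting path.
        while j0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1

    total = sum(cost[match[j] - 1][j - 1] for j in range(1, n + 1))
    return total / n


# Usage: two tiny 3-D clouds (hypothetical coordinates, in arbitrary units).
current = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
target = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
distance = emd_equal_size(current, target)
```

A lower value means the current dough shape is closer to the target. For clouds of different sizes or with non-uniform weights, the matching formulation above no longer applies and a general optimal-transport solver would be needed.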

BC Policy + Diff-Reset (ours)

SAC

CEM

Sep-Reset

Learn-Reset

No-Reset

2-Reset

3-Reset

Appendix

dough_manipulation_appendix.pdf

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant Nos. IIS-2046491 and IIS-1849154, and by LG Electronics. We thank Zhiao Huang for helping us with the simulator, and Tim Angert and Sarthak Shetty for helping us with the real-world experiments.

BibTeX