Learning Closed-loop Dough Manipulation Using a Differentiable Reset Module
Carl Qi, Xingyu Lin, David Held
Robotics and Automation Letters (RA-L) with presentation at the International Conference on Intelligent Robots and Systems (IROS), 2022
[Paper (arXiv)] [Presentation] [Code]
Abstract
Deformable object manipulation has many applications in our daily lives, such as cooking and laundry folding. Manipulating elastoplastic objects such as dough is particularly challenging because dough lacks a compact state representation and requires contact-rich interactions. We consider the task of flattening a piece of dough into a specific shape from RGB-D images. While the task seems intuitive to humans, common approaches such as naive trajectory optimization suffer from local optima. We propose a novel trajectory optimizer that optimizes through a differentiable "reset" module, transforming a single-stage, fixed-initialization trajectory into a multi-stage, multi-initialization trajectory in which all stages are optimized jointly. We then train a closed-loop policy on the demonstrations generated by our trajectory optimizer. Our policy takes partial point clouds as input, which eases transfer from simulation to the real world. We show that our policy can perform real-world dough manipulation, flattening a ball of dough into a target shape.
Presentation
System overview
(a) Our trajectory optimizer implements a differentiable module that resets the tool to avoid local optima while allowing gradient information to propagate through the entire trajectory. (b) We perform imitation learning on the demonstration data generated by the trajectory optimizer; our policy takes segmented point cloud observations as input. (c) We use the learned policy to control a Sawyer robot to perform closed-loop rolling in the real world.
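The joint optimization in (a) can be illustrated with a heavily simplified, self-contained sketch. This is not the paper's simulator or parameterization: a scalar "dough state" evolves under toy additive dynamics, each stage begins from its own reset pose, and the reset poses are optimized jointly with the per-step actions by gradient descent through the entire multi-stage trajectory (the gradients here are analytic because the toy loss is quadratic).

```python
def rollout(resets, actions, s0, target):
    """Toy additive 'dough' dynamics: every roll step moves the scalar
    state by (stage reset pose + per-step action). Returns the final
    state and the squared distance to the target shape."""
    s = s0
    for r, acts in zip(resets, actions):
        for a in acts:
            s += r + a
    return s, (s - target) ** 2

def joint_grad_step(resets, actions, s0, target, lr=0.01):
    """One gradient step on ALL stages jointly. With additive dynamics
    and a quadratic loss, d(loss)/d(action) = 2*(s - target) for every
    action; each reset pose gets that gradient scaled by the number of
    steps in its stage, since it influences every step."""
    s, loss = rollout(resets, actions, s0, target)
    g = 2.0 * (s - target)
    new_resets = [r - lr * g * len(acts) for r, acts in zip(resets, actions)]
    new_actions = [[a - lr * g for a in acts] for acts in actions]
    return new_resets, new_actions, loss

# two stages, three roll steps each, all parameters initialized to zero
resets = [0.0, 0.0]
actions = [[0.0] * 3, [0.0] * 3]
loss = None
for _ in range(200):
    resets, actions, loss = joint_grad_step(resets, actions, s0=0.0, target=1.0)
```

Because both the resets and the actions receive gradient, the optimizer can redistribute work across stages instead of being stuck with whichever initialization the first stage happened to get, which is the intuition behind optimizing through the reset module.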
Dough rolling with a Sawyer arm
Real world setup
Quantitative performance
Distance to target: 0-3cm
Final views and performances
Executions
SAC
Open-loop
Heuristic
Diff-Reset-BC (ours)
Human
Distance to target: 3-6cm
Final views and performances
Executions
SAC
Open-loop
Heuristic
Diff-Reset-BC (ours)
Human
Distance to target: 6-9cm
Final views and performances
Executions
SAC
Open-loop
Heuristic
Diff-Reset-BC (ours)
Human
Dough size: small (240 g)
Final views and performances
Executions
SAC
Open-loop
Heuristic
Diff-Reset-BC (ours)
Human
Dough size: medium (360 g)
Final views and performances
Executions
SAC
Open-loop
Heuristic
Diff-Reset-BC (ours)
Human
Dough size: large (480 g)
Final views and performances
Executions
SAC
Open-loop
Heuristic
Diff-Reset-BC (ours)
Human
Dough rolling in simulation
There are 10 configurations above, separated by gray boundaries. Each configuration shows the execution of the BC policy (left) and the expert demonstration from Diff-Reset (right). The number in the top-left corner indicates the Earth Mover's Distance between the current point cloud of the dough and the target point cloud. The target dough shape is overlaid on the policy's execution.
BC Policy + Diff-Reset (ours)
SAC
CEM
Sep-Reset
Learn-Reset
No-Reset
2-Reset
3-Reset
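The Earth Mover's Distance shown in the videos above can be illustrated with a toy sketch (this is not the solver used in the paper): for two equal-size, uniformly weighted point sets, EMD reduces to a minimum-cost one-to-one matching, brute-forced here over permutations for tiny clouds.

```python
import math
from itertools import permutations

def emd(points_a, points_b):
    """Earth Mover's Distance between two equal-size, uniformly weighted
    2-D point sets: the minimum average pairwise distance over all
    one-to-one matchings. Brute force is exponential, so this is only
    viable for a handful of points; practical implementations use
    optimal-transport or assignment solvers instead."""
    assert len(points_a) == len(points_b)
    best = float("inf")
    for perm in permutations(points_b):
        cost = sum(math.dist(a, b) for a, b in zip(points_a, perm))
        best = min(best, cost / len(points_a))
    return best

# a unit square of points vs. the same square shifted right by 0.5:
# the optimal matching moves every point by 0.5, so EMD = 0.5
square = [(0, 0), (0, 1), (1, 0), (1, 1)]
shifted = [(x + 0.5, y) for x, y in square]
print(round(emd(square, shifted), 3))  # → 0.5
```

A distance of zero means the dough's point cloud exactly covers the target shape, which is why lower numbers in the videos indicate better flattening.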
Appendix
Acknowledgments
This material is based upon work supported by the National Science Foundation under Grant Nos. IIS-2046491 and IIS-1849154, and by LG Electronics. We thank Zhiao Huang for helping us with the simulator, and Tim Angert and Sarthak Shetty for helping us with the real-world experiments.