SoftGym: Benchmarking Deep Reinforcement Learning for Deformable Object Manipulation

Xingyu Lin, Yufei Wang, Jake Olkin, David Held

Conference on Robot Learning (CoRL), 2020

Manipulating deformable objects has long been a challenge in robotics due to their high-dimensional state representation and complex dynamics. Recent success in deep reinforcement learning provides a promising direction for learning to manipulate deformable objects with data-driven methods. However, existing reinforcement learning benchmarks only cover tasks with direct state observability and simple low-dimensional dynamics or with relatively simple image-based environments, such as those with rigid objects. In this paper, we present SoftGym, a set of open-source simulated benchmarks for manipulating deformable objects, with a standard OpenAI Gym API and a Python interface for creating new environments. Our benchmark will enable reproducible research in this important area. Further, we evaluate a variety of algorithms on these tasks and highlight challenges for reinforcement learning algorithms, including dealing with a state representation that has a high intrinsic dimensionality and is partially observable. The experiments and analysis indicate the strengths and limitations of existing methods in the context of deformable object manipulation, which can help point the way forward for future method development.

paper link: http://arxiv.org/abs/2011.07215

code link: [SoftGym (contains environments)] [SoftAgent (contains benchmarked algorithms)]

SoftGym Medium

Summary Video

Benchmarked Algorithms

We benchmark several model-free and model-based RL algorithms, including CEM, SAC, CURL-SAC, DrQ and PlaNet. We show that CEM with ground-truth dynamics performs the best in most tasks. We also show that state-based algorithms still outperform image-based methods in many tasks, despite recent progress in data augmentation for image-based RL.
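To illustrate what planning with CEM (the cross-entropy method) and ground-truth dynamics means, here is a minimal sketch on a toy problem. It is not the benchmarked implementation and does not use SoftGym: the 1-D dynamics in `step`, the cost in `rollout_cost`, and all names and hyperparameters are hypothetical, chosen only to keep the example self-contained. The planner samples action sequences, scores them by simulating the true dynamics, refits a Gaussian to the best samples, and executes only the first planned action before replanning.

```python
import random
import statistics


def step(x, a):
    """Toy ground-truth dynamics (hypothetical): a 1-D point moves by 0.1 * a."""
    a = max(-1.0, min(1.0, a))  # clip the action to [-1, 1]
    return x + 0.1 * a


def rollout_cost(x, actions, target):
    """Sum of squared distances to the target along a simulated rollout."""
    cost = 0.0
    for a in actions:
        x = step(x, a)
        cost += (x - target) ** 2
    return cost


def cem_plan(x, target, horizon=5, pop=64, n_elite=8, iters=5, rng=None):
    """Return the mean action sequence found by the cross-entropy method."""
    if rng is None:
        rng = random.Random(0)
    mean = [0.0] * horizon
    std = [1.0] * horizon
    for _ in range(iters):
        # Sample candidate action sequences from the current Gaussian.
        samples = [
            [rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(pop)
        ]
        # Score each candidate with the true dynamics and keep the best ones.
        samples.sort(key=lambda seq: rollout_cost(x, seq, target))
        elites = samples[:n_elite]
        # Refit the Gaussian to the elites, one dimension per time step.
        mean = [statistics.fmean(col) for col in zip(*elites)]
        std = [statistics.pstdev(col) + 1e-3 for col in zip(*elites)]
    return mean


def run_mpc(x0=0.0, target=1.0, steps=30, seed=0):
    """Replan at every step and execute only the first planned action."""
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        plan = cem_plan(x, target, rng=rng)
        x = step(x, plan[0])
    return x


if __name__ == "__main__":
    print("final position:", run_mpc())
```

Because the planner queries the true dynamics directly, it needs no learning, which is why such a method serves as an oracle-style reference point rather than a deployable policy.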

Visualization of the learned policy in SoftGym

1. TransportWater - Transport a cup of water to a target position without spilling any water.

Dynamics Oracle (CEM)

Reduced State Oracle (SAC)

RGB (CURL-SAC)

2. PourWater - Pour all the water into a target cup.

Dynamics Oracle (CEM)

Reduced State Oracle (SAC)

RGB (CURL-SAC)

3. StraightenRope - Straighten the rope from random configurations.

Dynamics Oracle (CEM)

Reduced State Oracle (SAC)

RGB (CURL-SAC)

4. SpreadCloth - Spread the cloth from crumpled configurations.

Dynamics Oracle (CEM)

Reduced State Oracle (SAC)

RGB (CURL-SAC)

5. FoldCloth - Fold the cloth in half.

Dynamics Oracle (CEM)

Reduced State Oracle (SAC)

RGB (CURL-SAC)

6. DropCloth - Drop the cloth such that it is laid spread on the floor.

Dynamics Oracle (CEM)

Reduced State Oracle (SAC)

RGB (CURL-SAC)

Simulator realism on cloth tasks

We perform a total of 5 pick-and-place actions both in the SoftGym cloth environments and in the real world, with the pick and place positions randomly selected. For the real experiments, the Sawyer robot with the Weiss gripper uses a top-down pinch grasp to mimic the picker behaviour. The simulated cloth shows similar behaviour to the real one, with slight differences in detail because the parameters (friction of the table and cloth, mass of the cloth, etc.) do not match perfectly.

Acknowledgement

This work was supported by the United States Air Force and DARPA under Contract No. FA8750-18-C-0092, the National Science Foundation under Grant No. IIS-1849154, and LG Electronics.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation, or United States Air Force and DARPA.