CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning

Abstract

Despite recent successes of reinforcement learning (RL), it remains a common problem that agents fail to transfer their learned skills to related environments. To facilitate research addressing this challenge, we propose CausalWorld, a benchmark for causal structure and transfer learning in a robotic manipulation environment. This environment is a simulation of an open-source robotic platform, hence offering the possibility of sim-to-real transfer. Tasks consist of constructing 3D shapes from a given set of blocks, inspired by how children learn to build complex structures. A key strength of CausalWorld is that it provides a combinatorial family of such tasks with a common causal structure and underlying factors (including, e.g., robot and object masses, colors, and sizes). The user (or even the agent) may intervene on all causal variables, which allows for fine-grained control over how similar different tasks (or task distributions) are. Hence, one can easily define training and evaluation distributions of a desired difficulty level, targeting a desired form of generalization (e.g. only changes in appearance or object mass). Further, this common parametrization facilitates defining curricula by interpolating between an initial and a target task. While users may define their own task distributions, we present eight meaningful distributions as concrete benchmarks, ranging from simple to extremely challenging, all of which require long-horizon planning and precise low-level motor control at the same time. Finally, we provide baseline results for a subset of these tasks on distinct training curricula and corresponding evaluation protocols, verifying the feasibility of the tasks in this benchmark.

About

Do Interventions

  • CausalWorld improves upon previous benchmarks by exposing a large set of high-level variables in the causal generative model of the environments, such as properties of blocks, goals, and robot links, as well as global variables like gravity.

  • The possibility to intervene on any of these environment variables at any point in time allows one to generically set up training environments in a curriculum manner and evaluate agents across different generalization axes using a broad set of evaluation protocols.

  • Furthermore, emphasizing the real-world relevance of this benchmark as opposed to earlier ones, researchers may build their own real-world version of the simulated platform at low cost and transfer their trained policies to the real world.

  • Finally, by releasing this benchmark we hope to facilitate research in causal structure learning, i.e. learning the causal graph or certain aspects of it, as we operate in a complex real-world environment whose dynamics follow the laws of physics which induce causal relations between the variables. Changes to the variables we expose can be considered do-interventions on the underlying structural causal model (SCM). Consequently, we hope that this benchmark offers an exciting opportunity to investigate causality and its connection to RL and robotics.
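To make the notion of a do-intervention concrete, here is a minimal sketch on a toy structural causal model. It does not use the CausalWorld API: the class `ToyPushSCM`, its variable names, and the simplified friction model are all assumptions made for illustration. The point is only that an intervention overrides exposed variables while every downstream quantity is recomputed from the same mechanism.

```python
from dataclasses import dataclass, field


def _default_variables():
    # Hypothetical exposed variables; the names are illustrative only.
    return {
        "block_mass": 0.5,       # kg
        "push_force": 2.0,       # N
        "friction_coeff": 0.3,   # dimensionless
        "gravity": 9.81,         # m/s^2
    }


@dataclass
class ToyPushSCM:
    """Toy SCM (not CausalWorld): exposed variables cause block acceleration."""

    variables: dict = field(default_factory=_default_variables)

    def do(self, interventions):
        """Return a new SCM with the given variables set by intervention."""
        unknown = set(interventions) - set(self.variables)
        if unknown:
            raise KeyError(f"unknown variables: {sorted(unknown)}")
        # The causal mechanism stays fixed; only the intervened values change.
        return ToyPushSCM(variables={**self.variables, **interventions})

    def acceleration(self):
        """Downstream effect: net force (after friction) divided by mass."""
        v = self.variables
        friction = v["friction_coeff"] * v["block_mass"] * v["gravity"]
        net_force = max(v["push_force"] - friction, 0.0)
        return net_force / v["block_mass"]


base = ToyPushSCM()
heavier = base.do({"block_mass": 1.0})       # do(block_mass = 1.0)
weightless = base.do({"gravity": 0.0})       # do(gravity = 0.0)
```

In this sketch the heavier block no longer moves under the same push, while removing gravity removes friction altogether; comparing such variants is the kind of controlled change the exposed variables are meant to allow.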

Task demonstrations: Reaching, Pushing, Picking, Pick And Place, Stacking2, Towers, Stacked Blocks, Creative Stacked Blocks

General demonstrations: Do Interventions, Curriculum Through Interventions, Block Size Interventions, Goal Guided Generation, Disentangling Generalization, Model Selection
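The abstract notes that the common parametrization makes it possible to define curricula by interpolating between an initial and a target task. The following is a minimal sketch of that idea, assuming a task can be summarized as a dictionary of numeric variables. The function names and the variables `block_mass` and `goal_height` are hypothetical and are not taken from the CausalWorld library.

```python
def interpolate_task(initial, target, alpha):
    """Linearly blend two task parametrizations; alpha=0 is initial, alpha=1 is target."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if initial.keys() != target.keys():
        raise ValueError("both tasks must expose the same variables")
    return {
        name: (1.0 - alpha) * initial[name] + alpha * target[name]
        for name in initial
    }


def make_curriculum(initial, target, num_stages):
    """Return num_stages task parametrizations, from initial to target inclusive."""
    if num_stages < 2:
        raise ValueError("a curriculum needs at least two stages")
    return [
        interpolate_task(initial, target, stage / (num_stages - 1))
        for stage in range(num_stages)
    ]


# Hypothetical variables: start with a light block and a low goal,
# end with a heavier block and a higher goal.
easy_task = {"block_mass": 0.1, "goal_height": 0.05}
hard_task = {"block_mass": 0.5, "goal_height": 0.25}
stages = make_curriculum(easy_task, hard_task, num_stages=5)
```

Each stage could then be applied to the environment as a set of interventions before training continues. Linear interpolation is only one possible schedule; non-numeric variables would need a different treatment.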

Baseline Experiments

  • Pushing Baseline (PPO): 2, 8, and 14 million time steps

  • Picking Baseline (PPO): 8, 20, and 60 million time steps

  • Pick And Place Baseline (PPO): 10, 20, and 50 million time steps

  • Stacking2 Baseline (PPO): 10, 20, and 80 million time steps

Install

  • Install the library: pip install causal_world

  • Check out the tutorials and docs at https://github.com/rr-learning/CausalWorld

Authors

ETH Zurich

Max Planck Institute For Intelligent Systems

Max Planck Institute For Intelligent Systems

Max Planck Institute For Intelligent Systems

Max Planck Institute For Intelligent Systems

Max Planck Institute For Intelligent Systems

Cite CausalWorld

@misc{ahmed2020causalworld,
  title={CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning},
  author={Ossama Ahmed and Frederik Träuble and Anirudh Goyal and Alexander Neitz and Manuel Wüthrich and Yoshua Bengio and Bernhard Schölkopf and Stefan Bauer},
  year={2020},
  eprint={2010.04296},
  archivePrefix={arXiv},
  primaryClass={cs.RO}
}