Chain Reaction Tool Environment (CREATE)

Description

CREATE is a 2D physics-based environment in which the agent must reason about how to place tools in real time to move the blue ball to the goal (green star). Each action consists of selecting which tool to place and where to place it. The available tools have varied characteristics; visualizations of them can be found on this page. The tool observations are characteristic data about each tool, used for zero-shot generalization to tools the policy has not used before. These observations are provided by the environment and are used to learn tool representations. The CREATE environment consists of 12 tasks. Results for all tasks and visualizations of the learned tool representations are in the main results section.
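
To make the two-part action concrete, here is a minimal Python sketch of an action that pairs a discrete tool choice with a continuous placement. The names `PlacementAction` and `sample_action`, and the assumption that coordinates are normalized to [-1, 1], are hypothetical and chosen for illustration; they are not taken from the CREATE codebase.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PlacementAction:
    """Hypothetical container for one action: which tool, and where."""
    tool_id: int  # index of the selected tool among those currently offered
    x: float      # horizontal placement, assumed normalized to [-1, 1]
    y: float      # vertical placement, assumed normalized to [-1, 1]


def sample_action(available_tool_ids, rng):
    """Draw a random action: a discrete tool choice plus a continuous position."""
    tool_id = rng.choice(available_tool_ids)
    return PlacementAction(tool_id, rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))


rng = random.Random(0)
action = sample_action([3, 17, 42], rng)  # tool ids here are made up
print(action)
```

A learned policy would replace the random draws with its own tool-selection and placement outputs; the structure of the action stays the same.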

Tasks

Push

Obstacle

Seesaw

Basket

Belt

Buckets

Cannon

Collide

Funnel

Ladder

Moving

Navigate

Available Tools

Below are the various types of tools available to an agent. An agent must select which tool to place and where to place it, all in real time. Our environment consists of 2,111 distinct tools obtained by varying the parameters of tools from each tool class shown below.
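
As a rough illustration of how varying parameters within each tool class yields many distinct tools, the sketch below enumerates every combination of a small parameter grid. The parameter names (`angle_deg`, `length`) and their values are hypothetical placeholders and do not reproduce the actual 2,111 tools or their parameters.

```python
from itertools import product

# Hypothetical parameter grids; the real tool classes and their
# parameter ranges are defined by the environment, not by this sketch.
PARAM_GRID = {
    "ramp": {"angle_deg": [15, 30, 45], "length": [1.0, 2.0]},
    "trampoline": {"angle_deg": [0, 20], "length": [1.0, 2.0, 3.0]},
}


def enumerate_tools(grid):
    """Return one dict per distinct (class, parameter combination)."""
    tools = []
    for tool_class, params in grid.items():
        names = list(params)
        for values in product(*(params[name] for name in names)):
            tools.append({"class": tool_class, **dict(zip(names, values))})
    return tools


tools = enumerate_tools(PARAM_GRID)
print(len(tools), tools[0])
```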

Ramp

Lever

Cannon

Funnel

Polygon

Belt

Trampoline

Seesaw

Fan

Bucket

Bouncy Polygon

Ball

Tool Observations

The tool observations reveal the characteristics of a tool so that the agent can incorporate them into its decision making. They are obtained by probing the tool: a ball is launched at the tool from a random angle, at a random speed, and with a random offset. The tool's characteristics can be inferred from the deflection trajectory of the ball. Because a tool can have diverse functionalities, we launch the probe ball at the tool 1024 times. Each ball trajectory has length 7. In our paper we consider learning both from the ball's (x, y) position trajectory and from raw images.
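
The probing procedure can be sketched as follows. This is a toy stand-in, not the environment's physics: the "tool" is assumed to be an infinite flat surface through the origin, there is no gravity, and the time step, speed range, and offset range are made-up values. Only the number of probes (1024) and the trajectory length (7) come from the description above.

```python
import numpy as np

NUM_PROBES = 1024  # probes per tool, as stated in the text
TRAJ_LEN = 7       # positions recorded per probe, as stated in the text


def probe_tool(tool_angle, rng, dt=0.1):
    """Launch one ball at a toy flat surface and record its (x, y) positions."""
    # Unit normal of the surface, which passes through the origin.
    normal = np.array([-np.sin(tool_angle), np.cos(tool_angle)])

    # Random launch angle, speed, and lateral offset (ranges are assumptions).
    launch_angle = rng.uniform(0.0, 2.0 * np.pi)
    speed = rng.uniform(1.0, 3.0)
    offset = rng.uniform(-0.5, 0.5)

    direction = -np.array([np.cos(launch_angle), np.sin(launch_angle)])
    tangent = np.array([-direction[1], direction[0]])
    pos = -direction + tangent * offset  # start one unit away, shifted sideways
    vel = direction * speed

    traj = np.empty((TRAJ_LEN, 2))
    for t in range(TRAJ_LEN):
        traj[t] = pos
        new_pos = pos + vel * dt
        # If the step would cross the surface, reflect the velocity instead.
        if np.dot(pos, normal) * np.dot(new_pos, normal) < 0:
            vel = vel - 2.0 * np.dot(vel, normal) * normal
            new_pos = pos + vel * dt
        pos = new_pos
    return traj


rng = np.random.default_rng(0)
observations = np.stack([probe_tool(np.pi / 6, rng) for _ in range(NUM_PROBES)])
print(observations.shape)  # (probes, trajectory length, xy)
```

The resulting array holds one short position trajectory per probe, which is the kind of input a tool-representation encoder could consume in the state-based setting.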

Ramp Tool Observation

Cannon Tool Observation

Bucket Tool Observation