Dataset

We have open-sourced three datasets from our project:

Example Raw Videos

Videos from the dataset showing the task from each of the four cameras in our system (one global side camera, one global top camera, and two wrist cameras).

Example Processed Images

wrist45_image

wrist225_image

side_image

top_image

Routing Primitive Offline Dataset

The data we collected for our low-level routing policy includes 1442 expert demonstration trajectories of the routing task with around 20 transitions each. The trajectories start with the cable held in the gripper and end once it has been routed through the corresponding clip. 

Dataset Layout

There is one folder per trajectory. Each trajectory folder contains a .npy file with the trajectory information, including cropped and downsampled 128×128 images, and a videos folder with four full-resolution, uncropped mp4 videos of the trajectory, one from each camera view.

The NumPy file contains a pickled Python dictionary. Load the file with allow_pickle=True, then call .item() on the returned array to retrieve the dictionary.
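The loading step described above can be sketched as follows. This is a minimal example, not our official loader: the helper name `load_trajectory` and the key `example_key` are hypothetical, and the demo writes a small stand-in file to a temporary directory rather than reading a real trajectory.

```python
import os
import tempfile

import numpy as np


def load_trajectory(npy_path):
    """Load one trajectory dictionary from a .npy file (hypothetical helper)."""
    # np.save stores a dict as a 0-d object array, so loading it requires
    # allow_pickle=True, and .item() unwraps the array back into the dict.
    return np.load(npy_path, allow_pickle=True).item()


if __name__ == "__main__":
    # Stand-in data only: "example_key" is a placeholder, not a real dataset key.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.npy")
        np.save(path, {"example_key": np.zeros((20, 128, 128, 3), dtype=np.uint8)})
        trajectory = load_trajectory(path)
        for key, value in trajectory.items():
            print(key, getattr(value, "shape", type(value)))
```

Printing each key with its array shape is a quick way to inspect what a trajectory file contains before writing a training pipeline around it.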

The keys of the dictionary are as follows:

High-Level Primitive Selection Offline Dataset

The offline data we share includes 11915 transitions on which the high-level policy is trained to select the next primitive to execute. This dataset is the result of a data augmentation scheme in which a few transitions before and after the timestep at which a primitive is selected are labeled with the same primitive and history. The trajectories contain state and processed image observations from the 3 different camera views shown above.
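The augmentation scheme can be illustrated with a short sketch. It is only an approximation of the idea under stated assumptions: the window size of 2, the function name `augment_selection`, and the primitive names are all hypothetical, since the exact number of neighboring transitions is not specified here.

```python
def augment_selection(num_steps, selection_step, primitive, history, window=2):
    """Label every timestep within `window` steps of a selection event.

    All names and the default window size are assumptions for illustration.
    Returns a list of (timestep, primitive, history) tuples, clipped to the
    valid range [0, num_steps - 1].
    """
    start = max(0, selection_step - window)
    stop = min(num_steps - 1, selection_step + window)
    # Every neighboring timestep shares the primitive and history of the
    # timestep at which the primitive was actually selected.
    return [(t, primitive, tuple(history)) for t in range(start, stop + 1)]


if __name__ == "__main__":
    # Placeholder primitive names, not the real label set.
    labels = augment_selection(
        num_steps=10,
        selection_step=1,
        primitive="primitive_b",
        history=["primitive_a"],
        window=2,
    )
    for timestep, primitive, history in labels:
        print(timestep, primitive, history)
```

In this sketch, one selection event near the start of a trajectory yields several labeled transitions instead of one, which is how a modest number of demonstrations can produce a much larger number of training transitions.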

Dataset Layout

The data is formatted as a dictionary of the following key-value pairs:

End-to-End Trajectory Dataset

The data we collected for our high-level policy includes 257 expert demonstration trajectories of the multi-stage task, with a varying number of transitions per trajectory. It includes data for complete one-, two-, and three-clip tasks. Actions are commanded and executed at a frequency of 5 Hz in 4 DoF (translation along the x, y, and z axes and rotation about the z-axis, all in the end-effector frame) plus gripper open/close. The trajectories contain state and image observations from the 4 different camera views shown above.
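To make the action description concrete, here is a small sketch of how such an action could be represented and how trajectory length relates to wall-clock time at 5 Hz. The field names, their ordering, and the gripper encoding are assumptions for illustration and may not match the stored arrays.

```python
from dataclasses import dataclass

CONTROL_HZ = 5  # actions are executed at 5 Hz, as stated above


@dataclass
class Action:
    """One 4-DoF end-effector action plus gripper (hypothetical layout)."""

    dx: float       # translation along x, end-effector frame
    dy: float       # translation along y, end-effector frame
    dz: float       # translation along z, end-effector frame
    dyaw: float     # rotation about z, end-effector frame
    gripper: float  # open/close command; the encoding is an assumption

    def as_vector(self):
        """Flatten to a 5-element list in the assumed field order."""
        return [self.dx, self.dy, self.dz, self.dyaw, self.gripper]


def trajectory_duration_s(num_transitions, hz=CONTROL_HZ):
    """Approximate duration in seconds of a trajectory executed at `hz`."""
    return num_transitions / hz


if __name__ == "__main__":
    action = Action(dx=0.01, dy=0.0, dz=-0.02, dyaw=0.1, gripper=1.0)
    print(action.as_vector())
    print(trajectory_duration_s(20))
```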

Dataset Layout

There is one folder per trajectory. Each trajectory folder contains a .npy file with the trajectory information, including cropped and downsampled 128×128 images, and a videos folder with four full-resolution, uncropped mp4 videos of the trajectory, one from each camera view.

The NumPy file contains a pickled Python dictionary. Load the file with allow_pickle=True, then call .item() on the returned array to retrieve the dictionary.
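The folder layout can be traversed with a short script like the one below. It is a sketch under assumptions: the helper name `index_dataset`, the subfolder name `videos`, and the example folder and file names are hypothetical, and the demo builds an empty stand-in directory tree instead of using the real download.

```python
import tempfile
from pathlib import Path


def index_dataset(root):
    """Map each trajectory folder name to its .npy files and mp4 videos.

    Assumes one subfolder per trajectory, each with a "videos" subfolder.
    """
    index = {}
    for traj_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        index[traj_dir.name] = {
            "npy": sorted(traj_dir.glob("*.npy")),
            "videos": sorted((traj_dir / "videos").glob("*.mp4")),
        }
    return index


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a stand-in tree; all names here are placeholders.
        traj = Path(tmp) / "traj_000"
        (traj / "videos").mkdir(parents=True)
        (traj / "traj_000.npy").touch()
        for cam in ("side", "top", "wrist45", "wrist225"):
            (traj / "videos" / f"{cam}.mp4").touch()

        for name, files in index_dataset(tmp).items():
            print(name, len(files["npy"]), "npy file(s),", len(files["videos"]), "video(s)")
```

Building an index first makes it easy to check that every trajectory folder has its .npy file and all four videos before loading any image data.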

The keys of the dictionary are as follows: