We open-sourced three datasets from our project:
Routing Primitive Offline Dataset: 1442 low-level routing trajectories that can be used to train the routing policy
High-Level Primitive Selection Offline Dataset: 11915 transitions with the relevant observations used in our system to train the high-level primitive selection policy
End-to-End Trajectory Dataset: 257 end-to-end task demonstration trajectories that route the cable through all the clips on the board
Example Raw Videos
Videos from the dataset depicting the task from each of the four cameras in our system (1 global side camera, 1 global top camera, and 2 wrist cameras).
Example Processed Images
wrist45_image, wrist225_image, side_image, top_image
Routing Primitive Offline Dataset
The data we collected for our low-level routing policy includes 1442 expert demonstration trajectories of the routing task, with around 20 transitions each. Each trajectory starts with the cable held in the gripper and ends once the cable has been routed through the corresponding clip.
Dataset Layout
There is a folder for each trajectory. Each trajectory folder contains a .npy file with the trajectory info, including cropped and downsampled 128 x 128 images, and a videos folder with four full-resolution, uncropped .mp4 videos of the trajectory from the four camera views.
The NumPy file contains a dictionary. Load it with np.load(..., allow_pickle=True) and call .item() on the result to retrieve the dictionary, as in the sketch below.
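For example, a minimal loading sketch in Python (the folder and file names are hypothetical placeholders for one trajectory):

```python
import numpy as np

# Hypothetical path; substitute the actual trajectory folder / file name.
traj = np.load("trajectory_0000/traj.npy", allow_pickle=True).item()

print(type(traj))   # dict
print(traj.keys())  # the keys listed below
```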
The keys of the dictionary are as follows (see the access sketch after the list):
observations/tcp_pose - This is a list containing the tcp pose for each transition in the trajectory
observations/gripper - This is a list containing the gripper state (1 for closed, 0 for open)
observations/wrist45 - This is a list containing the 128 x 128 images for the wrist45 camera for each transition in the trajectory
observations/wrist225 - This is a list containing the 128 x 128 images for the wrist225 camera for each transition in the trajectory
observations/top - This is a list containing the 128 x 128 images for the top camera for each transition in the trajectory
observations/side - This is a list containing the cropped 128 x 128 images for the side camera for each transition in the trajectory
actions - This is a list containing normalized 4-DoF robot actions as Cartesian-space velocities (xyz translation and z rotation) for each transition in the trajectory
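Putting the keys together, here is a sketch of pairing observations with actions per transition (the file name is hypothetical; converting the lists to arrays and the exact array shapes are assumptions):

```python
import numpy as np

# Hypothetical path; see the loading note above.
traj = np.load("trajectory_0000/traj.npy", allow_pickle=True).item()

poses = np.asarray(traj["observations/tcp_pose"])  # one tcp pose per transition
side = np.asarray(traj["observations/side"])       # assumed (T, 128, 128, 3)
actions = np.asarray(traj["actions"])              # (T, 4): xyz velocity + z rotation

# Index t is one transition: the observation at t paired with the action at t.
for pose, img, act in zip(poses, side, actions):
    pass  # e.g., feed (pose, img) -> act into a BC / offline RL update
```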
High-Level Primitive Selection Offline Dataset
The offline data we share includes 11915 transitions for training the high-level policy, which selects the next primitive to execute. This dataset is the result of a data augmentation scheme in which a couple of transitions before and after the timestep at which a primitive is selected are labeled with the same primitive and history. Each transition contains state and processed image observations from the 3 camera views visualized above.
Dataset Layout
The data is formatted as a dictionary of the following key-value pairs (a loading sketch follows the list):
robot_state: np.ndarray((11915, 7)) Contains the tcp pose of the robot at each transition
wrist45_image: np.ndarray((11915, 128, 128, 3)) Contains the wrist45 view images in NumPy format
wrist225_image: np.ndarray((11915, 128, 128, 3)) Contains the wrist225 view images in NumPy format
side_image: np.ndarray((11915, 128, 128, 3)) Contains the cropped side view images in NumPy format
primitive_sequence: np.ndarray((11915, 6)) Contains a sequence of the past 6 primitives executed on the same clip, with the earliest at the beginning and the most recent at the end. The sequence resets to all zeros once the Go Next primitive has been called. The numbers correspond to the primitives below:
0: Padding
1: Pickup
2: Route
3: Perturb
labels: np.ndarray((11915, )) Contains the primitive to be executed next. The numbers correspond to the primitives below:
0: Pickup
1: Route
2: Perturb
3: Go Next
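A sketch of loading and decoding this dataset, assuming the dictionary is stored in a single .npy file (the file name is a hypothetical placeholder):

```python
import numpy as np

# Integer-to-name mappings from the two lists above.
HISTORY_PRIMITIVES = {0: "Padding", 1: "Pickup", 2: "Route", 3: "Perturb"}
NEXT_PRIMITIVES = {0: "Pickup", 1: "Route", 2: "Perturb", 3: "Go Next"}

# Hypothetical file name for the high-level dataset.
data = np.load("high_level_dataset.npy", allow_pickle=True).item()

assert data["robot_state"].shape == (11915, 7)
assert data["primitive_sequence"].shape == (11915, 6)

# Inspect the first transition: its primitive history and target label.
i = 0
history = [HISTORY_PRIMITIVES[int(p)] for p in data["primitive_sequence"][i]]
print("history (earliest first):", history)
print("next primitive:", NEXT_PRIMITIVES[int(data["labels"][i])])
```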
End-to-End Trajectory Dataset
The data we collected for our high-level policy includes 257 expert demonstration trajectories of the multi-stage task, with a varying number of transitions per trajectory. It includes data for full one-, two-, and three-clip tasks. Actions are commanded and executed at a frequency of 5 Hz in 4 DoF (translation along xyz and rotation about the z-axis in the end-effector frame), plus gripper open/close. The trajectories contain state and image observations from the 4 camera views shown above.
Dataset Layout
There is a folder for each trajectory. Each trajectory folder contains a .npy file with the trajectory info, including cropped and downsampled 128 x 128 images, and a videos folder with four full-resolution, uncropped .mp4 videos of the trajectory from the four camera views.
The NumPy file contains a dictionary. As with the routing dataset above, load it with np.load(..., allow_pickle=True) and call .item() on the result to retrieve the dictionary.
The keys of the dictionary are as follows:
observations/tcp_pose - This is a list containing the tcp pose for each transition in the trajectory
observations/gripper - This is a list containing the gripper state (1 for closed, 0 for open)
observations/wrist45 - This is a list containing the 128 x 128 images for the wrist45 camera for each transition in the trajectory
observations/wrist225 - This is a list containing the 128 x 128 images for the wrist225 camera for each transition in the trajectory
observations/top - This is a list containing the 128 x 128 images for the top camera for each transition in the trajectory
observations/side - This is a list containing the cropped 128 x 128 images for the side camera for each transition in the trajectory
actions - This is a list containing normalized 4-DoF robot actions as Cartesian-space velocities (xyz translation and z rotation) plus a binary gripper action (-1 to close, 1 to open) for each transition in the trajectory
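As a sketch, splitting one trajectory's action array into its components (the file name is hypothetical, and the 5-element ordering with the gripper command last is an assumption based on the description above):

```python
import numpy as np

# Hypothetical path to one end-to-end trajectory's .npy file.
traj = np.load("trajectory_0000/traj.npy", allow_pickle=True).item()
actions = np.asarray(traj["actions"])  # assumed shape (T, 5)

xyz_vel = actions[:, 0:3]    # normalized Cartesian translation velocity
z_rot_vel = actions[:, 3]    # normalized rotation velocity about z
gripper_cmd = actions[:, 4]  # -1 to close, 1 to open
```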
All data is provided under the Creative Commons Attribution 4.0 International License.