Examples from the dataset are shown in the adjacent video. We provide two versions of the VAL dataset: one with low-resolution images (1.4 GB) and one with high-resolution images (162 GB). Both versions contain the same amount of data in the same format; they differ only in the quality of the image observations.
The smaller dataset, with 48x48x3 images, can be used for e.g. offline RL and is available for direct download: https://drive.google.com/file/d/1UuWANkVtWLg4egIK2LB_YCKuF87rMQ1H/view?usp=sharing
The larger dataset, with 480x640x3 images, might be preferred for e.g. representation learning and is available in this Google Drive folder: https://drive.google.com/drive/folders/1kD9kyP7-RlIrSnuN7rpEASAGWp5qnNov?usp=sharing
To download the larger dataset, we suggest using rclone (https://rclone.org/).
Data folders
The data is sorted into the seven folders listed below. There are a total of 300 files and 2500 trajectories. The percentage after each folder is its approximate share of the dataset.
fixed_drawer - Human-controlled demonstration data of opening and closing drawers. (~10%)
fixed_pnp - Human-controlled demonstration data of picking up objects. (~10%)
fixed_pot - Human-controlled demonstration data of interacting with a pot and a lid. (~10%)
fixed_tray - Human-controlled demonstration data of picking up objects and placing them in a tray. (~10%)
general - Further human-controlled demonstration data, collected with the most diversity and variation. (~40%)
onpolicy_eval - Evaluation data collected by an RL policy. (~10%)
onpolicy_expl - Exploration data collected by an RL policy. (~10%)
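After downloading, it can be useful to check that all seven folders are present and to see how many files each one holds. The following is a minimal sketch, not part of the dataset tooling: the helper name `count_files` is hypothetical, and it assumes that the folders sit directly under one root directory and that the data files use the `.npy` extension.

```python
from pathlib import Path

# Folder names as listed above.
FOLDERS = [
    "fixed_drawer",
    "fixed_pnp",
    "fixed_pot",
    "fixed_tray",
    "general",
    "onpolicy_eval",
    "onpolicy_expl",
]


def count_files(root):
    """Return {folder name: number of .npy files} for each expected folder.

    A folder that is missing under `root` is reported with a count of 0.
    """
    root = Path(root)
    counts = {}
    for name in FOLDERS:
        folder = root / name
        # Assumption: data files end in ".npy" and are not nested further.
        counts[name] = len(list(folder.glob("*.npy"))) if folder.is_dir() else 0
    return counts


# Example usage (the path is a placeholder for your download location):
# for name, n in count_files("/path/to/val_dataset").items():
#     print(f"{name}: {n} files")
```

If the download is complete, the counts should sum to the 300 files mentioned above.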
Data format
Each folder contains NumPy files, and each file holds several trajectories. An example code snippet to load the data is provided below.
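A minimal loading sketch follows. It assumes that each file was written with `np.save` as an array of Python objects, one per trajectory, which is why `allow_pickle=True` is passed; the helper name `load_trajectories` and the example path are hypothetical. The internal layout of a trajectory is not specified here, so the usage lines only inspect it instead of assuming particular keys.

```python
import numpy as np


def load_trajectories(path):
    """Load one dataset file and return its trajectories as a Python list."""
    # allow_pickle=True is required when the file stores Python objects
    # (an assumption about this dataset). Only enable it for trusted files.
    data = np.load(path, allow_pickle=True)
    # atleast_1d also covers the case where a single object was saved.
    return list(np.atleast_1d(data))


# Example usage (the path is a placeholder):
# trajs = load_trajectories("/path/to/val_dataset/general/some_file.npy")
# print(len(trajs))        # number of trajectories in this file
# print(type(trajs[0]))    # inspect the structure before relying on it
# if isinstance(trajs[0], dict):
#     print(sorted(trajs[0].keys()))
```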
For the smaller 48x48x3 image dataset, the "image_observation" is transposed and flattened, so it can be reshaped to (3, 48, 48), i.e. channels first, and used as input to convnets in machine learning libraries.
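The reshape described above can be sketched as follows. The helper names `unflatten_image` and `to_channels_last` are hypothetical. The sketch only assumes that a flattened observation has 3 * 48 * 48 = 6912 entries; it makes no assumption about the value range or dtype, and the order of the two spatial axes after reshaping should be checked visually if it matters for your use.

```python
import numpy as np


def unflatten_image(flat, size=48, channels=3):
    """Reshape a flattened image observation to (channels, size, size)."""
    flat = np.asarray(flat)
    expected = channels * size * size
    if flat.size != expected:
        raise ValueError(f"expected {expected} values, got {flat.size}")
    return flat.reshape(channels, size, size)


def to_channels_last(chw):
    """Move the channel axis last, e.g. for plotting: (C, A, B) -> (A, B, C)."""
    return np.transpose(chw, (1, 2, 0))


# Example with synthetic data standing in for one "image_observation":
flat = np.arange(3 * 48 * 48)
image = unflatten_image(flat)
print(image.shape)                    # (3, 48, 48)
print(to_channels_last(image).shape)  # (48, 48, 3)
```

The channels-first array can be passed to convnets that expect that layout, while the channels-last version is the layout most image viewers and plotting functions expect.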