RoboSet
Introduction
Introducing RoboSet, a large-scale real-world multi-task dataset collected across a range of everyday household table-top activities. RoboSet consists of a mix of kinesthetic and teleoperated demonstrations. The dataset covers multi-task activities, with the scene varied at every demonstration to induce visual diversity in the data. RoboSet is organized along two verticals based on the physics backend: simulation or real-world.
1. RoboSet (sim)
RoboSet provides datasets consisting of both expert and human trajectories accompanying a subset of its simulated environments. More details on the sim dataset are provided on the link on the right.
2. RoboSet (real)
In addition to the simulated datasets, RoboHive is also accompanied by a comprehensive collection of real-world datasets. RoboSet (real) was collected with a focus on fueling diversity and generalization in robot learning. Next, we outline various characteristics of RoboSet.
Source of dataset
The dataset has been collected from multiple sources, such as kinesthetic demonstrations and teleoperation. The kinesthetic demonstration data was collected by playing back a demonstration trajectory with a varied scene on every rollout. The teleoperated data was collected using RoboHive's built-in teleoperation support with an Oculus Quest 2 controller.
Camera viewpoints
We collected the dataset over multiple camera views. This ensures variety in the data and means we are not bound to a fixed camera viewpoint. Refer to Figure 2 for examples of the camera setups.
Different tasks
We collected data over a range of tasks and activities, which yields schematic tasks that can be associated with and related to one another. This opens up opportunities to investigate multi-step, multi-task agents, as well as language-guided sequencing and generalization. Refer to Figure 3 for a full table of tasks.
Different Scenes
We collected data over a range of different scene setups and camera viewpoints. We showcase two of our collection scenes below.
Data Schema
Data Access
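import h5py

filename = "path/to/data.h5"  # placeholder: replace with the path to your h5 data file
h5 = h5py.File(filename, 'r')
print(list(h5.keys()))  # trials stored in this h5 file: Trial0, Trial1, Trial2, ...
print(list(h5['Trial0'].keys()))  # groups inside a trial: data, derived, and config
# To extract the data, index with a trial name and a data key,
# where data_key is one of the cells from the data tab:
# h5[trial]['data'][data_key]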
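The access pattern above can be wrapped in a small helper that lists which data keys each trial contains. This is only a sketch under stated assumptions: `summarize_trials` is a hypothetical name, not part of RoboHive or RoboSet, and the demo uses a nested dict with made-up keys as a stand-in for an opened file. It relies on h5py groups behaving like mappings (`keys()`, item lookup, and `in`).

```python
def summarize_trials(h5file):
    """Map each trial name to the sorted keys of its 'data' group.

    Works on any mapping-like object, including an open h5py.File,
    because h5py groups support keys(), item lookup, and `in`.
    """
    summary = {}
    for trial in h5file.keys():
        group = h5file[trial]
        # Skip trials that lack a 'data' group rather than raising.
        if "data" not in group:
            continue
        summary[trial] = sorted(group["data"].keys())
    return summary


# Stand-in for an opened file; the data keys here are made up for illustration.
fake_h5 = {
    "Trial0": {"data": {"key_b": [0, 1], "key_a": [2, 3]}, "derived": {}, "config": {}},
    "Trial1": {"data": {"key_a": [4, 5]}, "derived": {}, "config": {}},
}
print(summarize_trials(fake_h5))
```

With a real file, the same helper should work by passing the object returned by h5py.File(filename, 'r') in place of the stand-in dict. The actual data keys depend on the data schema.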