Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets
Frederik Ebert*, Yanlai Yang*, Karl Schmeckpeper, Bernadette Bucher, Georgios Georgakis, Kostas Daniilidis, Chelsea Finn, Sergey Levine
NOTE: The new website for the Bridge Data can be found here and contains the latest dataset and code.
Main Contributions
A broad and extensible ‘bridge’ dataset, which allows us to bridge the gap between tasks and domains.
Our dataset is collected using a low-cost yet versatile 6-DoF WidowX250 robot arm and contains 7,200 demonstrations of a robot performing 71 kitchen tasks across 10 environments with varying lighting, robot positions, and backgrounds.
A detailed empirical study of how this dataset can boost task and scene generalization.
In experiments on 10 unseen kitchen tasks that were not included in the bridge data, we find that joint training with the bridge dataset roughly doubles task performance: a 50% success rate with joint training, compared to a 22% success rate when training only on single-task, single-domain data. These results suggest that accumulating and reusing diverse offline bridge datasets, including our open-source dataset, may enable future projects to unlock broad generalization without requiring extensive new data collection.
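As a quick check of the headline numbers, the short sketch below recomputes the relative and absolute improvement from the two success rates quoted above. The variable names are illustrative only; the two rates are the only values taken from the text.

```python
# Success rates quoted in the text (fraction of successful rollouts).
joint_training_success = 0.50  # joint training with the bridge dataset
single_task_success = 0.22     # single-task, single-domain training only

# Relative improvement (how many times better) and absolute gain.
ratio = joint_training_success / single_task_success
absolute_gain = joint_training_success - single_task_success

print(f"relative improvement: {ratio:.2f}x")
print(f"absolute gain: {absolute_gain * 100:.0f} percentage points")
```

The ratio comes out slightly above 2, which is why the improvement is described as approximately twofold.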
Example trajectories in the dataset, as well as examples of successful and failed rollouts in the experiments, can be downloaded here.
Our Multi-Task Multi-Domain Dataset
Toy Kitchen 1
Toy Kitchen 2
Toy Kitchen 3
Toy Kitchen 4
Toy Sink 1
Toy Sink 2
Toy Sink 3
Toy Sink 4
Toy Sink 5
Real Kitchen 1
Experiments
Scenario 1: Transfer with Matching Behaviors
In the first scenario, we analyze whether joint training on the bridge data and target domain data improves generalization performance for tasks that occur in both. This closely resembles a common “domain adaptation” setting.
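One way to picture joint training is as batch mixing: each training batch draws some demonstrations from the large bridge dataset and some from the small target-domain dataset. The sketch below is a minimal illustration under that assumption, not the authors' training code. The function `sample_joint_batch` and the `target_fraction` parameter are hypothetical, and the page does not state what mixing ratio was used.

```python
import random

def sample_joint_batch(bridge_data, target_data, batch_size,
                       target_fraction=0.5, rng=None):
    """Draw one batch that mixes bridge and target-domain demonstrations.

    target_fraction is a hypothetical knob (probability that each batch
    element comes from the target domain); it is not a value from the source.
    """
    rng = rng or random.Random()
    batch = []
    for _ in range(batch_size):
        # Pick the source dataset for this element, then a random demo from it.
        source = target_data if rng.random() < target_fraction else bridge_data
        batch.append(rng.choice(source))
    return batch

# Toy stand-ins for demonstrations: (domain label, demo index).
bridge_data = [("bridge", i) for i in range(1000)]
target_data = [("target", i) for i in range(50)]

batch = sample_joint_batch(bridge_data, target_data, batch_size=8,
                           rng=random.Random(0))
print(batch)
```

Sampling per element rather than concatenating the two datasets keeps the small target-domain set from being swamped by the much larger bridge data.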
Task: Turn the Lever
Task: Put Pot into Sink
Scenario 2: Zero-shot Transfer with Target Support
A different transfer scenario arises if the task that we would like to run is *not* in the target domain data, but only in the bridge data.
To facilitate transfer, we assume that we have data from a small number of target domain tasks, but no data for the task we would like to transfer to the target domain.
Task: Put Sweet Potato in Pot
Task: Put Carrot on Plate
Scenario 3: Boosting Generalization of New Tasks
In the last scenario, we test the opposite of the previous scenario: does the bridge data boost generalization for a new task that appears in the target domain data but not in the bridge data? In particular, we test whether we can take a single task in the target domain and jointly train with the bridge data to boost performance.
Task: Put Brush in Pot
Task: Put Pear in Bowl
Task: Wipe Plate with Sponge
Scenarios where training with bridge data does not provide gains
In order to qualitatively demonstrate the usability boundaries of our bridge dataset, here we also show a few tasks where joint training with bridge data did not provide performance gains.
Scenario 1: Transfer with Matching Behaviors
Task: Turn Lever Vertical to Front
Using Bridge Data: 0/10; Not Using Bridge Data: 0/10
Potential reasons for no gain: The faucet in this toy kitchen is very different from the faucet in the other toy kitchens. This is a very delicate task requiring a lot of precision.
Scenario 2: Zero-shot Transfer with Target Support
Task: Put Spoon into Pot
Using Bridge Data: 0/10
Potential reasons for no gain: The spoon only appears in one environment (toy sink 1) in only 50 demos in the prior data. It is also arguably harder to grasp.
Task: Pick up Pan from Stove
Using Bridge Data: 0/10
Potential reasons for no gain: The dark blue background of the stove in the toy sink 1 environment is visually very different from other backgrounds in the dataset.
Scenario 3: Boosting Generalization of New Tasks
Task: Flip Orange Pot Upright
Using Bridge Data: 5/10; Not Using Bridge Data: 6/10
Potential reasons for no gain: There is no orange pot in the prior dataset; the flip pot task in the prior dataset involves a metal pot.
Task: Take Lid off Pot
Using Bridge Data: 6/10; Not Using Bridge Data: 6/10
Potential reasons for no gain: There are only 100 demos involving a lid in the prior dataset. Only 50 of them involve a pot as well.
Task: Open Box Flap
Using Bridge Data: 1/10; Not Using Bridge Data: 1/10
Potential reasons for no gain: Boxes are not contained in the prior data and have very different visual appearances from the other objects present in the dataset. The task is also quite different from the most prevalent pick-and-place tasks in the dataset.
Task: Put Blue Pen into Drawer
Using Bridge Data: 0/10; Not Using Bridge Data: 0/10
Potential reasons for no gain: The tool chest environment we experimented with in this task has a saturated red-colored background, as well as many objects with red parts, which is visually too different from the environments of the prior data.
Table summarizing scenarios of gain and no gain from using the bridge dataset. Scenarios where joint training with the prior data does not improve performance over training on a single task are marked in red font. The examples in this table are intended to give an impression of where transfer gains do and do not occur; they are not an exhaustive list of all the experiments.
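For readers who want the no-gain cases in machine-readable form, the sketch below encodes the success counts listed above (out of 10 trials each) and computes the change in success rate from using the bridge data. The data layout and the `gain` helper are illustrative choices, not part of the released code. The Scenario 2 tasks have no reported baseline, so they are recorded as `None`.

```python
# (scenario, task, successes with bridge data, successes without bridge data),
# copied from the results listed above. None means no number was reported.
results = [
    (1, "Turn Lever Vertical to Front", 0, 0),
    (2, "Put Spoon into Pot", 0, None),
    (2, "Pick up Pan from Stove", 0, None),
    (3, "Flip Orange Pot Upright", 5, 6),
    (3, "Take Lid off Pot", 6, 6),
    (3, "Open Box Flap", 1, 1),
    (3, "Put Blue Pen into Drawer", 0, 0),
]
TRIALS = 10  # each entry above is reported as a count out of 10

def gain(with_bridge, without_bridge):
    """Change in success rate from using bridge data, or None if no baseline."""
    if without_bridge is None:
        return None
    return (with_bridge - without_bridge) / TRIALS

gains = {task: gain(w, wo) for _, task, w, wo in results}
for task, g in gains.items():
    label = "n/a" if g is None else f"{g:+.0%}"
    print(f"{task}: {label}")
```

With only 10 trials per condition, a difference of a single success (as in the orange pot task) is within the noise one would expect, so these entries are best read as "no measurable gain" rather than as evidence of harm.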
Conclusion
We have shown that a multi-task and multi-domain bridge dataset can be leveraged to boost policy generalization and transfer to a target domain (even if it is very different).
Therefore, anyone who uses robot setups within the distribution of the bridge dataset should be able to leverage our dataset to improve policy generalization.
We have confirmed our hypothesis that data from loosely related tasks and environments can help policy generalization, marking a significant advance in data reuse for robotics.