Benchmarking Offline Reinforcement Learning on Real-Robot Hardware
Abstract
Learning policies from previously recorded data is a promising direction for real-world robotics tasks, as online learning is often infeasible. Dexterous manipulation in particular remains an open problem in its general form. The combination of offline reinforcement learning with large diverse datasets, however, has the potential to lead to a breakthrough in this challenging domain, analogous to the rapid progress made in supervised learning in recent years. To coordinate the efforts of the research community toward tackling this problem, we propose a benchmark including: i) a large collection of data for offline learning from a dexterous manipulation platform on two tasks, obtained with capable RL agents trained in simulation; ii) the option to execute learned policies on a real-world robotic system and a simulation for efficient debugging. We evaluate prominent open-sourced offline reinforcement learning algorithms on the datasets and provide a reproducible experimental setup for offline reinforcement learning on real systems.
Two dexterous manipulation tasks on the TriFinger platform
Push task: Footage of the expert policy used during data collection.
Lift task: Footage of the expert policy (trained and deployed with low-pass filtering on the actions) used during data collection.
Training, data collection and benchmarking pipeline
Datasets
The datasets are available via the `trifinger_rl_datasets` package, which also provides a gym environment of the simulated TriFinger platform. As with D4RL, a dataset is downloaded automatically the first time it is requested via the `get_dataset` method:
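```python
import gymnasium as gym
import trifinger_rl_datasets  # importing the package makes the TriFinger environments available

# The environment name selects the task (push), the data source (sim) and the dataset type (expert).
env = gym.make("trifinger-cube-push-sim-expert-v0")
# Downloads the dataset on first use and returns it.
dataset = env.get_dataset()
```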
All datasets are also available as versions with camera images. See the documentation for further details on how to use the 34 datasets.
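Because the loading interface mirrors D4RL, a common first step is to split the flat sequence of transitions back into episodes and look at their returns. The following is a minimal sketch of that step, not part of the package. It assumes a D4RL-style layout in which the dataset holds parallel `rewards` and `timeouts` sequences, with `timeouts` marking the last transition of each episode. These key names and their semantics are assumptions here and should be checked against the package documentation. The helpers `episode_boundaries` and `episode_returns` are hypothetical, and a small synthetic dictionary stands in for a downloaded dataset.

```python
from typing import List, Sequence, Tuple


def episode_boundaries(timeouts: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return one (start, end) index pair per episode, with end exclusive.

    Assumes timeouts[i] is truthy on the last transition of an episode
    (D4RL-style convention; an assumption, not confirmed by the source).
    """
    boundaries = []
    start = 0
    for i, done in enumerate(timeouts):
        if done:
            boundaries.append((start, i + 1))
            start = i + 1
    # Transitions after the last flag form a trailing, unterminated episode.
    if start < len(timeouts):
        boundaries.append((start, len(timeouts)))
    return boundaries


def episode_returns(rewards: Sequence[float], timeouts: Sequence[bool]) -> List[float]:
    """Sum the rewards within each episode (undiscounted return)."""
    return [sum(rewards[s:e]) for s, e in episode_boundaries(timeouts)]


# Synthetic stand-in for a downloaded dataset: two episodes of 3 and 2 steps.
toy_dataset = {
    "rewards": [0.25, 0.5, 0.25, 0.5, 0.25],
    "timeouts": [False, False, True, False, True],
}

returns = episode_returns(toy_dataset["rewards"], toy_dataset["timeouts"])
print(returns)
```

If the assumed keys are present, the same functions could be applied to the output of `env.get_dataset()`, since they only rely on indexing, slicing and truth-testing of the two sequences.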
Get access to a cluster of real TriFinger robots
A cluster of TriFinger platforms is hosted at the Max Planck Institute for Intelligent Systems in Tübingen, Germany, and is available for remote use. To request access for a research project, email the contact person listed on the TriFinger website and describe the type and scope of the experiments you want to run. You will then be provided with an account that lets you submit jobs to the TriFinger cluster in the same way you would submit them to a compute cluster.