TrustDeHands: A Massively Parallel Benchmark For Safe Dexterous Manipulation

Abstract

Safe Reinforcement Learning (Safe RL) aims to maximize expected total reward while satisfying safety constraints. Although a plethora of safety-constrained environments have been developed to evaluate Safe RL methods, most of them focus on navigation tasks, which are rather simple and leave a non-trivial gap to real-world applications. In robotics, dexterous manipulation is becoming ubiquitous; however, safe dexterous manipulation is rarely studied. In this paper, we propose TrustDeHands, a massively parallel benchmark for Safe RL studies on safe dexterous manipulation tasks. TrustDeHands is built on Isaac Gym, a GPU-level parallel simulator that enables highly efficient RL training. To stay close to real-world settings, TrustDeHands offers multi-modal visual inputs, including RGB, RGB-D, and point clouds, and supports a variety of arms and dexterous hands from different brands. Moreover, TrustDeHands provides a solid implementation of eight popular safe policy optimization algorithms, which facilitates trustworthy validation of Safe RL methods beyond navigation tasks. TrustDeHands includes a myriad of challenging tasks that require safety awareness (e.g., Jenga). Results on these tasks show that Safe RL methods can achieve better performance than classical RL algorithms, indicating the effectiveness of Safe RL in safe robot manipulation tasks. To the best of our knowledge, TrustDeHands is the first benchmark targeting safe dexterous manipulation. We expect this benchmark to serve as a reliable evaluation suite for future Safe RL developments and to further promote the integration of the Safe RL and dexterous manipulation lines of research.

Demo

Hand Over Wall

Jenga

Pick Bottles

Jenga with arms