Benchmark

This page describes our benchmark of robotic manipulation tasks in detail. The benchmark comprises eight diverse, industrial-level robotic manipulation tasks. All tasks use the Franka Emika Panda, a robotic arm widely used for manipulation in research and industry. It has seven degrees of freedom (DoFs), a high-resolution torque sensor in each joint, and a control system that allows precise and smooth motion. The selected tasks cover a broad range of scenarios, including rigid-body manipulation, soft/deformable object manipulation, and non-prehensile manipulation. Each task requires its own control strategy and presents a different level of difficulty, from simple to complex, to test the performance and adaptivity of AI controllers.

A detailed introduction to the benchmark follows.

Point Reaching

Cube Stacking

Peg-in-Hole

Ball Balancing

Ball Catching

Ball Pushing

Door Opening

Cloth Placing

Tasks

AI Software Controllers

We employ various deep reinforcement learning (DRL) algorithms as the AI software controller of the robotic manipulator, including Trust Region Policy Optimization (TRPO), Deep Deterministic Policy Gradient (DDPG), Soft Actor-Critic (SAC), Proximal Policy Optimization (PPO), and Twin Delayed Deep Deterministic Policy Gradient (TD3). These algorithms are based on the implementations in the skrl library. The AI controller takes the states of the manipulator and of the object to be manipulated as input and generates a control command for each robot joint. For each task, we design a specific reward function for training the AI controllers. Details of the reward functions can be found in our source code, in the function calculate_metrics() of each corresponding task file (Gym_Envs/Tasks/).
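To make the idea of a task-specific reward concrete, here is a minimal sketch of what a dense reward for a reaching-style task could look like. It is not the reward implemented in calculate_metrics(); the function name reaching_reward, the parameters success_radius and success_bonus, and their default values are all hypothetical and chosen only for illustration.

```python
import math


def reaching_reward(ee_pos, target_pos, success_radius=0.05, success_bonus=1.0):
    """Hypothetical dense reward: negative distance plus a bonus near the target.

    ee_pos and target_pos are (x, y, z) tuples in metres. All names and default
    values here are illustrative assumptions, not taken from the benchmark code.
    """
    distance = math.dist(ee_pos, target_pos)  # Euclidean distance
    reward = -distance                        # closer to the target is better
    if distance < success_radius:
        reward += success_bonus               # extra reward once within tolerance
    return reward


if __name__ == "__main__":
    print(reaching_reward((0.4, 0.0, 0.3), (0.5, 0.1, 0.4)))
```

A shaped reward of this kind gives the learner a gradient toward the goal at every step; the rewards actually used for each task should be read from the corresponding task file.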

Learning Environments

To facilitate training and evaluation, we wrap all tasks within the Omniverse Isaac Gym Reinforcement Learning Environment, which is built on top of the OpenAI Gym framework. This provides better compatibility with Isaac Sim simulations and other DRL libraries and allows for easy extensions in the future.
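The sketch below illustrates the classic Gym interaction contract (reset() returns an observation; step() returns observation, reward, done flag, and info) that such a wrapper exposes to DRL libraries. It is a toy, one-dimensional stand-in written in plain Python; it does not import Gym or Isaac Sim, and ToyReachEnv, run_episode, and proportional_policy are hypothetical names, not part of the benchmark.

```python
class ToyReachEnv:
    """Hypothetical 1-D stand-in for a manipulation task (classic Gym-style API)."""

    def __init__(self, target=0.5, max_steps=50):
        self.target = target
        self.max_steps = max_steps
        self.position = 0.0
        self.steps = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.position = 0.0
        self.steps = 0
        return [self.position, self.target]

    def step(self, action):
        # Clip the command, advance the state, and compute reward and termination.
        action = max(-0.1, min(0.1, action))
        self.position += action
        self.steps += 1
        distance = abs(self.target - self.position)
        reward = -distance
        done = distance < 0.01 or self.steps >= self.max_steps
        return [self.position, self.target], reward, done, {}


def proportional_policy(obs):
    # Trivial hand-written controller standing in for a trained DRL policy.
    position, target = obs
    return target - position


def run_episode(env, policy):
    # Standard rollout loop: observe, act, accumulate reward until done.
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total_reward += reward
    return total_reward, env.steps


if __name__ == "__main__":
    print(run_episode(ToyReachEnv(), proportional_policy))
```

Because the learning algorithm only sees reset() and step(), the same training loop can be reused across tasks, which is the compatibility benefit described above.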

Initial Configurations

By treating each manipulation task as an independent entity, we consider the initial configuration of either the object to be manipulated or the target object as the input signal to the system. During training of the AI controller, we vary the initial configuration to test its performance. The allowable range of initial configurations for each task used in this paper is specified in the following table. All position values are Cartesian distances relative to the base of the manipulator.
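One simple way to vary the initial configuration within an allowable range is uniform sampling per axis, sketched below. The ranges in HYPOTHETICAL_RANGES are placeholders and are not the values from the benchmark's table; the helper names sample_initial_position and is_within_ranges are likewise assumptions made for this example, and the benchmark may sample differently.

```python
import random

# Placeholder ranges in metres, relative to the manipulator base.
# These are NOT the benchmark's values; substitute the ranges from the table.
HYPOTHETICAL_RANGES = {
    "x": (0.3, 0.7),
    "y": (-0.3, 0.3),
    "z": (0.1, 0.5),
}


def sample_initial_position(ranges, rng):
    """Draw one position uniformly at random within the given per-axis ranges."""
    return {axis: rng.uniform(low, high) for axis, (low, high) in ranges.items()}


def is_within_ranges(position, ranges):
    """Check that every coordinate lies inside its allowable range."""
    return all(ranges[axis][0] <= position[axis] <= ranges[axis][1] for axis in ranges)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sampled configurations are reproducible
    for _ in range(3):
        position = sample_initial_position(HYPOTHETICAL_RANGES, rng)
        print(position, is_within_ranges(position, HYPOTHETICAL_RANGES))
```

Passing an explicit, seeded random generator keeps the set of initial configurations reproducible between training and evaluation runs.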