ai-CPS-Robotics - Benchmark

Benchmark

This page describes our benchmark of robotic manipulation tasks in detail. The benchmark comprises eight representative and diverse industrial-level robotic manipulation tasks. All tasks use the Franka Emika Panda, a state-of-the-art robotic arm widely used for manipulation in research and industry. It has seven degrees of freedom (DoFs), a high-resolution torque sensor in each joint, and an advanced control system that allows precise and smooth motion. The selected tasks cover a broad range of scenarios, including rigid-body manipulation, soft/deformable object manipulation, and non-prehensile manipulation. Each task requires its own control strategy and presents a different level of difficulty, from simple to complex, to test the performance and adaptivity of AI controllers.

A detailed introduction to the presented benchmark is given as follows.

Point Reaching

Cube Stacking

Peg-in-Hole

Ball Balancing

Ball Catching

Ball Pushing

Door Opening

Cloth Placing

Tasks

Point Reaching (PR): The robot needs to reach a specific point in 3D Cartesian space by using its end-effector. This task is a fundamental and essential functionality that a manipulator should possess.
Cube Stacking (CS): The robot needs to grasp a cube and accurately place it on top of a second cube that serves as the target. This kind of pick-and-place task is common in industrial settings, e.g., assembly lines.
Peg-in-Hole (PH): The robot needs to accurately insert a cylindrical object into a corresponding hole. This task is commonly used in manufacturing processes, such as circuit board assembly.
Ball Balancing (BB): The objective of this task is to balance a ball at the center of a tray that is held by the robot's end-effector. This task is useful for applications such as stabilizing a moving platform.
Ball Catching (BC): The robot needs to use a tool to catch a ball that is thrown toward it. This task demands the capability to handle moving objects.
Ball Pushing (BP): The robot is required to push a ball towards a target hole on a table. As a typical non-prehensile manipulation, this task is important in many industrial applications, such as material handling and conveyor systems.
Door Opening (DO): The robot attempts to open a door using its gripper. This is a more advanced manipulation task and requires a multi-stage control process.
Cloth Placing (CP): The robot has to move a piece of cloth and place it onto a target table. As a soft object manipulation task, it often requires a more complex controller than rigid-body manipulation.

AI Software Controllers

We employ various deep reinforcement learning (DRL) algorithms as the AI software controller of the robotic manipulator: Trust Region Policy Optimization (TRPO), Deep Deterministic Policy Gradient (DDPG), Soft Actor-Critic (SAC), Proximal Policy Optimization (PPO), and Twin Delayed Deep Deterministic Policy Gradient (TD3). These algorithms are based on the implementations provided by the skrl library. The AI controller takes the states of the manipulator and of the object to be manipulated as input and generates a control command for each robot joint. For each task, we design a specific reward function for training the AI controllers. The details of the reward functions can be found in our source code, in the function calculate_metrics() of each corresponding task file (Gym_Envs/Tasks/).
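To illustrate what a task-specific reward function can look like, here is a minimal sketch for Point Reaching. It is not the benchmark's actual reward, which lives in calculate_metrics() of the task file. The function name point_reaching_reward, the tolerance success_tol, and the bonus success_bonus are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def point_reaching_reward(
    ee_pos: Sequence[float],
    target_pos: Sequence[float],
    success_tol: float = 0.02,    # hypothetical success radius in meters
    success_bonus: float = 1.0,   # hypothetical bonus for reaching the target
) -> float:
    """Dense reward: negative end-effector distance plus a success bonus.

    This is an illustrative shaping scheme, not the reward used in the
    benchmark's calculate_metrics() functions.
    """
    # Euclidean distance between the end-effector and the target point.
    distance = math.dist(ee_pos, target_pos)
    # Closer is better, so the reward grows as the distance shrinks.
    reward = -distance
    # Add a fixed bonus once the end-effector is within the tolerance.
    if distance < success_tol:
        reward += success_bonus
    return reward


if __name__ == "__main__":
    print(point_reaching_reward((0.4, 0.0, 0.3), (0.5, 0.1, 0.3)))
```

A dense distance term of this kind gives the learning algorithm a signal at every step, while the bonus marks the success condition; the real reward functions may combine different terms.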

Learning Environments

To facilitate training and evaluation, we wrap all tasks within the Omniverse Isaac Gym Reinforcement Learning Environment, which is built on top of the OpenAI Gym framework. This provides better compatibility with Isaac Sim simulations and other DRL libraries and allows for easy extensions in the future.
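The Gym-style interface that such a wrapper exposes can be sketched with a toy environment. The example below uses only the Python standard library and does not call Isaac Sim, Omniverse Isaac Gym, or OpenAI Gym; the class ToyPointReachingEnv, its kinematic "move the end-effector by the action" dynamics, and the target range are all assumptions made for illustration. It follows the classic reset() / step() convention, in which step() returns an observation, a reward, a done flag, and an info dictionary.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple


class ToyPointReachingEnv:
    """Toy stand-in for a Gym-style task environment (not the real wrapper)."""

    def __init__(self, max_steps: int = 50, tol: float = 0.02, seed: int = 0):
        self.max_steps = max_steps      # episode length limit
        self.tol = tol                  # hypothetical success radius
        self._rng = random.Random(seed)
        self._ee: List[float] = [0.0, 0.0, 0.0]
        self._target: List[float] = [0.0, 0.0, 0.0]
        self._steps = 0

    def _observation(self) -> List[float]:
        # Observation = end-effector position followed by target position.
        return self._ee + self._target

    def reset(self) -> List[float]:
        # Start a new episode with a randomly placed target (assumed range).
        self._ee = [0.0, 0.0, 0.0]
        self._target = [self._rng.uniform(-0.5, 0.5) for _ in range(3)]
        self._steps = 0
        return self._observation()

    def step(
        self, action: Sequence[float]
    ) -> Tuple[List[float], float, bool, Dict[str, float]]:
        # Toy dynamics: the action is a Cartesian displacement of the end-effector.
        self._ee = [p + a for p, a in zip(self._ee, action)]
        self._steps += 1
        distance = math.dist(self._ee, self._target)
        reward = -distance
        done = distance < self.tol or self._steps >= self.max_steps
        return self._observation(), reward, done, {"distance": distance}


if __name__ == "__main__":
    env = ToyPointReachingEnv(seed=1)
    obs = env.reset()
    done = False
    while not done:
        # Simple proportional controller in place of a trained DRL policy.
        action = [0.5 * (t - p) for p, t in zip(obs[:3], obs[3:])]
        obs, reward, done, info = env.step(action)
    print(info["distance"])
```

Because every task exposes the same reset() and step() calls, a DRL library can train on any of them without task-specific glue code, which is the compatibility benefit described above.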

Initial Configurations

By treating each manipulation task as an independent system, we consider the initial configuration of either the object to be manipulated or the target object as the input signal to that system. During the training process of the AI controller, we vary the initial configuration to evaluate and test its performance. The allowable range of initial configurations for each task used in this paper is specified in the following table. All position values represent the relative Cartesian distances to the base of the manipulator.
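Varying the initial configuration within an allowable range can be sketched as uniform sampling per coordinate. The helper sample_initial_configuration and the ranges in EXAMPLE_RANGES below are hypothetical placeholders, not the benchmark's actual values; positions are taken to be in meters relative to the manipulator base, matching the convention stated above.

```python
import random
from typing import Dict, Tuple

Range = Tuple[float, float]

# Hypothetical per-axis ranges (meters, relative to the manipulator base).
# These are placeholders, not the values used in the benchmark.
EXAMPLE_RANGES: Dict[str, Range] = {
    "x": (0.3, 0.6),
    "y": (-0.2, 0.2),
    "z": (0.1, 0.4),
}


def sample_initial_configuration(
    ranges: Dict[str, Range], rng: random.Random
) -> Dict[str, float]:
    """Draw one initial configuration uniformly from the given ranges."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the sampled configurations repeat
    for _ in range(3):
        print(sample_initial_configuration(EXAMPLE_RANGES, rng))
```

Passing an explicit random.Random instance keeps the sampling reproducible, which helps when the same set of initial configurations must be replayed across different controllers.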
