2022 CoG RoboMaster EP AI Challenge

With the development of intelligent technology, robots are gradually expanding from traditional industrial settings to a much wider range of application scenarios, where accurate navigation in complex, unstructured, and dynamically changing environments is becoming increasingly important. Recently, Deep Learning (DL) has greatly enhanced the perception abilities of robotic systems, and Deep Reinforcement Learning (DRL) and other learning-based methods have brought new opportunities for tackling these problems. However, most learning-based methods depend on a simulation environment, so transferring algorithms trained in simulation to a physical system (sim2real) has become the key problem to solve.

The goal of the competition is to find good sim2real methods. In the competition task, the robot must find 5 targets whose positions are randomly generated in a maze and activate them in a specified order. Within 3 minutes, the agent that finds and activates all 5 targets fastest while moving safely wins. The competition is divided into two stages. In the first stage, agents only need to complete the task in the simulation environment; the highest-scoring agents advance to the second stage. In the second stage, we deploy the participating agents on a physical robot and test them in a real environment. At this stage, the agent must cope with the differences between the simulated and physical actuators, as well as the state differences caused by the changed environment. Agents may adjust their models based on the feedback data and results.

For example, in the figure on the left, the blue rectangle represents a robot spawned in any free space of the field. Its goal is to find five targets (the five differently colored stars in the figure) and activate them in order from 1 to 5. A distinct visual tag is pasted on each target to identify the activation order. The red rectangle represents the hostile agent, which is initially dormant. Once all 5 targets have been correctly activated, the hostile agent wakes up, searches for, and shoots at the robot. The white rectangles and diamonds in the field are obstacles; the robot should avoid collisions as much as possible.

Evaluation Metric

Score = 60 × N + 0.5 × A × (D + H) − T − 10 × K

N is the number of targets successfully activated; A indicates whether the enemy robot was activated (1 if so, 0 otherwise); D is the damage dealt to the hostile agent; H is the robot's remaining HP; T is the time taken (in seconds); and K is the number of collisions. For a continuous collision, K = 2 × T_k, where T_k is the duration of the continuous collision (in seconds).
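The metric above can be written as a small function. This is a minimal sketch for self-checking one's own runs, not an official scoring script; the parameter names are illustrative, and it assumes A is a 0/1 indicator as described above.

```python
def match_score(n_targets: int, enemy_activated: bool, damage: float,
                remaining_hp: float, time_s: float, collisions: float) -> float:
    """Compute Score = 60*N + 0.5*A*(D + H) - T - 10*K.

    For a continuous collision, pass collisions = 2 * T_k,
    where T_k is the continuous collision time in seconds.
    """
    a = 1 if enemy_activated else 0
    return (60 * n_targets
            + 0.5 * a * (damage + remaining_hp)
            - time_s
            - 10 * collisions)


# Example: all 5 targets activated, enemy dealt 100 damage, 200 HP left,
# 120 s elapsed, no collisions.
print(match_score(5, True, 100, 200, 120, 0))  # 330.0
```

Note that the collision penalty dominates quickly: two seconds of continuous collision already costs 40 points, more than half a target's worth.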

Tracks

  • Track 1: complete-information track

The team robot can obtain camera images, LiDAR points, its pose in the map, the remaining HP and bullets of both the self-robot and the enemy robot, the goal positions and activation flags, and collision information. Participants develop an algorithm that outputs the speed command and the shooting command for the team robot.


  • Track 2: image-based track

The team robot obtains only the image at the current time, the goal positions and activation flags, the remaining HP and bullets of both the self-robot and the enemy robot, and collision information. Participants develop an algorithm that outputs the speed command and the shooting command for the team robot.
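In both tracks the agent's output is the same: a speed command plus a shooting command. The sketch below shows one plausible shape for that interface; the class and field names (`Command`, `vx`, `vy`, `wz`, `shoot`, `act`) are assumptions for illustration, not the competition's actual API.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Command:
    """One control step for the team robot (hypothetical field names)."""
    vx: float      # forward speed
    vy: float      # lateral speed
    wz: float      # rotational speed
    shoot: bool    # whether to fire this step


class ConstantAgent:
    """Toy baseline policy: creep forward, never shoot."""

    def act(self, observation: Any) -> Command:
        # A real Track 1 agent would use images, LiDAR, and pose here;
        # a Track 2 agent would see only the current image plus the
        # auxiliary state listed above.
        return Command(vx=0.3, vy=0.0, wz=0.0, shoot=False)


cmd = ConstantAgent().act(observation=None)
print(cmd)  # Command(vx=0.3, vy=0.0, wz=0.0, shoot=False)
```

Keeping the action interface identical across tracks means the same deployment code can wrap either agent; only the observation side differs.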

Organizers

Haoran Li, Institute of Automation, Chinese Academy of Sciences

Yaran Chen, Institute of Automation, Chinese Academy of Sciences

Shasha Liu, Institute of Automation, Chinese Academy of Sciences

Lingze Zeng, School of Engineering and Applied Science, University of Virginia

Bopei Zheng, College of Robotics, Beijing Union University

Dongbin Zhao, Institute of Automation, Chinese Academy of Sciences

Qianli Ma, SZ DJI Technology Co., Ltd