Maverick: Multi-Drone Collaboration

Zhehui Huang, Iris (Chui Yi) Liu, Yijing Yang, Yaqi Han

University of Southern California


[EDD] [Paper] [Github]

This website contains supplementary materials for the USC CS527 final project.

We train reinforcement learning algorithms to solve collaborative problems with a drone swarm.

Drone collaboration is especially important in search & rescue, surveillance & mapping missions, and many other real-world tasks. However, most previous research on autonomous drone agents relies on control-based methods, which may not perform robustly in complicated environments and can be computationally expensive.

Motivated by the robustness and efficient training of model-free reinforcement learning algorithms, we train an APPO policy with a mean-embedding method on 6 different collaborative tasks.

Same Goal & Circular Configuration Scenarios

The following two scenarios are designed to evaluate collision avoidance among drone agents. The left video shows the same-goal scenario, where all drone agents aim to get as close as possible to the same goal. The right video illustrates the circular configuration scenario, where goals are arranged in a circle and each agent must reach its designated goal.
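For illustration, here is a minimal sketch of how goals for a circular configuration can be generated; the radius, altitude, and number of drones below are assumed values rather than the exact parameters of our environment.

```python
import numpy as np

def circular_goals(num_drones: int, radius: float = 2.0, altitude: float = 2.0) -> np.ndarray:
    """Place one goal per drone, evenly spaced on a circle at a fixed altitude."""
    angles = np.linspace(0.0, 2.0 * np.pi, num_drones, endpoint=False)
    goals = np.stack(
        [radius * np.cos(angles), radius * np.sin(angles), np.full(num_drones, altitude)],
        axis=1,
    )
    return goals  # shape: (num_drones, 3), goal i assigned to drone i

# Example: 8 drones, each assigned the goal with the matching index.
print(circular_goals(8))
```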

Static Obstacle Scenarios

Static obstacles are added to the simulation environment to increase its complexity. The goal is to examine whether the policy can avoid collisions between drone agents as well as collisions with the obstacle(s).

Digit Pattern Scenarios

The videos below demonstrate multiple drone agents cohesively composing a digit pattern, both horizontally and vertically.
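As a rough sketch of how a digit pattern can be converted into per-drone goals, the snippet below maps the lit cells of a bitmap to 3D positions; the bitmap, spacing, and heights are illustrative assumptions, not our actual formation definitions.

```python
import numpy as np

# Illustrative 5x3 bitmap of the digit "1"; the patterns used in our videos may differ.
DIGIT_ONE = np.array([
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [1, 1, 1],
])

def digit_goals(bitmap: np.ndarray, spacing: float = 0.5, vertical: bool = True) -> np.ndarray:
    """Map each lit cell of the bitmap to a 3D goal position.

    vertical=True places the pattern in the x-z plane (readable from the side);
    vertical=False lays it flat in the x-y plane (readable from above).
    """
    rows, cols = np.nonzero(bitmap)
    x = cols * spacing
    span = (bitmap.shape[0] - 1 - rows) * spacing  # flip so row 0 ends up on top
    if vertical:
        return np.stack([x, np.zeros_like(x, dtype=float), span + 1.0], axis=1)
    return np.stack([x, span, np.full_like(x, 2.0, dtype=float)], axis=1)

print(digit_goals(DIGIT_ONE, vertical=True))  # one goal per lit cell
```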

Collision Avoidance Scenarios (pursuit & evasion, moving obstacle)

The following videos illustrate our trained PPO policy in three different scenarios with obstacles present in the environment. The top-left video shows agents pursuing a goal that moves along a fixed 3D Lissajous trajectory. The top-right video shows the same scenario from a local camera view. The bottom-right video shows the scenario where agents attempt to get as close to the goal as possible while avoiding a moving obstacle. The bottom-left video illustrates the same scenario from a local camera view.
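For reference, a 3D Lissajous trajectory can be parameterized as sinusoids with different frequencies along each axis, as sketched below; the amplitudes, frequencies, phase, and center are illustrative assumptions rather than the exact parameters of the moving goal in our videos.

```python
import numpy as np

def lissajous_3d(t: np.ndarray,
                 amp=(2.0, 2.0, 1.0),
                 freq=(1.0, 2.0, 3.0),
                 phase=(np.pi / 2, 0.0, 0.0),
                 center=(0.0, 0.0, 2.0)) -> np.ndarray:
    """Evaluate a 3D Lissajous curve: x_i(t) = c_i + A_i * sin(w_i * t + phi_i)."""
    t = np.asarray(t)
    coords = [c + a * np.sin(w * t + p)
              for a, w, p, c in zip(amp, freq, phase, center)]
    return np.stack(coords, axis=-1)  # shape: (..., 3)

# Sample the moving goal's position over 10 seconds at 100 Hz.
trajectory = lissajous_3d(np.linspace(0.0, 10.0, 1000))
print(trajectory.shape)  # (1000, 3)
```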

Algorithm

Mean embedding is implemented to extend the observation space of each drone agent with an aggregated encoding of its neighbors' observations.

Architecture of Mean Embedding.
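A minimal PyTorch-style sketch of the mean-embedding idea is shown below: each neighbor's observation is passed through a shared encoder, the resulting embeddings are averaged, and the mean is concatenated to the agent's own observation before the policy network. The layer sizes and observation dimensions are illustrative assumptions, not our exact network configuration.

```python
import torch
import torch.nn as nn

class MeanEmbedding(nn.Module):
    """Encode each neighbor's observation with a shared MLP and average the embeddings."""

    def __init__(self, neighbor_obs_dim: int, embed_dim: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(neighbor_obs_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
            nn.ReLU(),
        )

    def forward(self, neighbor_obs: torch.Tensor) -> torch.Tensor:
        # neighbor_obs: (batch, num_neighbors, neighbor_obs_dim)
        embeddings = self.encoder(neighbor_obs)  # (batch, num_neighbors, embed_dim)
        return embeddings.mean(dim=1)            # (batch, embed_dim), order-invariant

# Example: 6 drones -> 5 neighbors, each described by an assumed 9-D relative observation.
self_obs = torch.randn(32, 18)                   # agent's own state (assumed 18-D)
neighbor_obs = torch.randn(32, 5, 9)
mean_embed = MeanEmbedding(neighbor_obs_dim=9)(neighbor_obs)
policy_input = torch.cat([self_obs, mean_embed], dim=-1)  # fed to the policy network
print(policy_input.shape)  # torch.Size([32, 82])
```

Because the embeddings are averaged, the resulting representation is invariant to neighbor ordering and works for a variable number of neighbors.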