An experiment on a simulated drone racing track that uses deep RL algorithms to match and surpass the performance of classical path planning algorithms.
Long-Term Planning with Deep Reinforcement Learning on Autonomous Drones
FPV (first-person view) drone racing is a sport in which participants fly camera-equipped drones while wearing head-mounted displays that show the live camera feed from the drones. As in full-size air racing, the goal is to complete a set course as quickly as possible.
arXiv link to the paper. Submitted to the Association for the Advancement of Artificial Intelligence (AAAI) 2020 Fall Symposium Series.
Abstract
In this paper, we study a long-term planning scenario that is based on drone
racing competitions held in real life. We conducted this experiment on a
framework created for "Game of Drones: Drone Racing Competition" at NeurIPS
2019. The racing environment was created using Microsoft's AirSim Drone Racing
Lab. A reinforcement learning agent, a simulated quadrotor in our case, was
trained with the Proximal Policy Optimization (PPO) algorithm and was able to
successfully compete against another simulated quadrotor that was running a
classical path planning algorithm. Agent observations consist of data from IMU
sensors, GPS coordinates of the drone obtained through simulation, and opponent
drone GPS information. Using opponent drone GPS information during training
helps the agent deal with complex state spaces; it serves as expert guidance
and allows for an efficient and stable training process. All experiments
performed in this paper can be found and reproduced with code at our GitHub
repository.
PPO Algorithm
PPO was once the state of the art and remains a powerful deep RL algorithm. It performs especially well on continuous control problems.
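PPO's core idea is a clipped surrogate objective that limits how far a single update can move the policy away from the one that collected the data. The block below is a minimal pure-Python sketch of that objective, not the training code used in this project. The function name `ppo_clip_objective`, its argument names, and the default `clip_eps=0.2` (a commonly used value) are assumptions chosen for illustration.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for the same actions.
    All names and the default clip_eps are illustrative assumptions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic choice: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage. When the ratio leaves the clipping range in the direction that would increase the objective, the clipped term takes over and the gradient incentive to move further disappears.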
Multi-Gate Checkpoints
Checking multiple gates in sequence required a specific design when building a Gym-like RL environment.
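One way to picture the multi-gate design is an environment that tracks which gate comes next and only rewards passing gates in order. The block below is a toy sketch under stated assumptions, not the project's AirSim environment: the class name `MultiGateEnv`, the point-mass dynamics, the gate radius, the reward values, and the observation layout are all hypothetical. It mirrors the Gym `reset`/`step` interface without importing Gym.

```python
import math


class MultiGateEnv:
    """Toy Gym-style environment: a point 'drone' must pass gates in order.

    Everything here (dynamics, rewards, observation layout) is a
    hypothetical illustration, not the project's actual environment.
    """

    def __init__(self, gates, gate_radius=1.0, max_steps=200):
        self.gates = [tuple(float(c) for c in g) for g in gates]
        self.gate_radius = gate_radius
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        self.position = (0.0, 0.0, 0.0)
        self.next_gate = 0  # index of the checkpoint to pass next
        self.steps = 0
        return self._observation()

    def _observation(self):
        # Drone position plus the offset to the next gate
        # (the last gate is reused once the course is finished).
        idx = min(self.next_gate, len(self.gates) - 1)
        gate = self.gates[idx]
        offset = tuple(g - p for g, p in zip(gate, self.position))
        return self.position + offset

    def step(self, action):
        # The action is treated as a direct displacement (dx, dy, dz).
        self.position = tuple(p + a for p, a in zip(self.position, action))
        self.steps += 1
        reward = -0.01  # small time penalty per step
        if self.next_gate < len(self.gates):
            target = self.gates[self.next_gate]
            if math.dist(self.position, target) <= self.gate_radius:
                self.next_gate += 1  # advance to the following gate
                reward += 1.0
        finished = self.next_gate == len(self.gates)
        done = finished or self.steps >= self.max_steps
        info = {"gates_passed": self.next_gate}
        return self._observation(), reward, done, info
```

For example, `MultiGateEnv([(1, 0, 0), (2, 0, 0)], gate_radius=0.5)` followed by two calls to `step((1, 0, 0))` passes both gates and ends the episode on the second step. Keeping a single `next_gate` index means a gate passed out of order earns nothing, which is the ordering constraint a racing course imposes.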
Questions?
Contact ugurkanates97@gmail.com for more information on the project.