Machine Learning Architecture for Aircraft Maneuvering

[2019-ongoing]

The goal of this project is to design and develop a controller that enables aerobatic maneuvering of a fixed-wing aircraft using reinforcement learning techniques. Recently, there have been significant advances in refining reinforcement learning based controllers to operate over continuous state and action spaces. However, the task at hand has a large, continuous, multidimensional domain, which makes it difficult for the learning agent (the fixed-wing aircraft) to select an efficient action in a given state.

This project investigates several reinforcement learning algorithms for this purpose: Deep Q-Networks, Deep Deterministic Policy Gradient, and Normalized Advantage Function. After comparing and contrasting these approaches, we selected the Normalized Advantage Function approach as the best available option to overcome this difficulty. Additionally, the Normalized Advantage Function algorithm has been integrated with an Inverse Reinforcement Learning technique to extract the reward function and policy of an expert executing four different maneuvers. Incorporating Inverse Reinforcement Learning automatically derives a reward function from expert pilot flight data, which makes the algorithm generic and provides a framework that can be extended to other maneuvers, given expert data. This project documents the investigative process of selecting a reinforcement learning technique, as well as the development life cycle and integration of the Normalized Advantage Function algorithm with a Max-Min inverse reinforcement learning formulation, which led to successful demonstrations of the Slow Roll and Knife-Edge maneuvers in simulation.
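
To illustrate why the Normalized Advantage Function suits continuous action spaces, the following is a minimal sketch of its standard Q-value decomposition, Q(s, a) = V(s) + A(s, a), where the advantage is a quadratic that peaks at the policy action mu(s). It is not the project's implementation: the function name, the two-dimensional action, and all numeric values are hypothetical, and in practice V, mu, and the lower-triangular matrix L would be outputs of a neural network for a given state.

```python
from typing import Sequence


def naf_q_value(
    value: float,
    mu: Sequence[float],
    L: Sequence[Sequence[float]],
    action: Sequence[float],
) -> float:
    """Return Q(s, a) = V(s) + A(s, a) for one state.

    A(s, a) = -0.5 * (a - mu)^T P (a - mu), with P = L L^T.
    Building P from a lower-triangular L keeps it positive semi-definite,
    so the advantage is never positive and Q is maximized at a = mu.
    """
    diff = [a - m for a, m in zip(action, mu)]
    n = len(diff)
    # y = L^T (a - mu), so (a - mu)^T L L^T (a - mu) equals ||y||^2.
    y = [sum(L[i][j] * diff[i] for i in range(n)) for j in range(n)]
    advantage = -0.5 * sum(v * v for v in y)
    return value + advantage


# Hypothetical values for a single state with a 2-D action
# (for example, two control-surface commands).
state_value = 1.0
greedy_action = [0.2, -0.1]
L_matrix = [[1.0, 0.0],
            [0.5, 2.0]]

q_at_mu = naf_q_value(state_value, greedy_action, L_matrix, greedy_action)
q_off_mu = naf_q_value(state_value, greedy_action, L_matrix, [1.2, -0.1])
print(q_at_mu, q_off_mu)
```

Because the maximizing action is available in closed form as mu(s), the greedy action can be read off directly without searching the continuous action space, which is the property that distinguishes this formulation from a Deep Q-Network over discretized actions.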

Principal Investigators list

Related Publications

Sponsors