My AFRL research focused on developing the insights, technology, and methods required to enable verifiable, safe, and effective collaboration among networked air, ground, and information systems with humans in the loop. I am interested in the realization of vehicles with autonomous decision-making ability cooperatively performing complex tasks with human operators in an uncertain environment. For the past decade, I have been working on realizing autonomous UAVs at the Control Science Center of Excellence / Autonomous Controls branch of the Air Force Research Laboratory (AFRL). To illustrate my research, I will touch upon different aspects of my work and their relevance to my general area of interest: stochastic decision making and combinatorial resource-allocation optimization. My work also relates to other areas of AI research, in particular automated planning and scheduling (multi-UAV task scheduling and persistent ISR work) and decision support systems (optimal multi-UAV/operator collaboration for target identification). I have also worked on sequential decision making under information uncertainty (dynamic weapon-target assignment), where the decision maker (DM) must make optimal decisions with incomplete information on the outcomes of past decisions. This work generalizes to other problem areas, e.g., the selective maintenance of a series-parallel system, where scarce resources (parts, labor, budget) are shared over a finite horizon. The work focuses on establishing key monotonicity results for the optimal allocation policy, which in turn enable the development of scalable heuristic (sub-optimal) policies.
In this work, we investigate a dynamic variant of the classical Weapon Target Assignment (WTA) problem, wherein targets are sequentially visited by a bomber carrying homogeneous weapons. A weapon launched by the bomber destroys a target with a known probability, and upon successful elimination, the bomber earns a random positive reward. The reward value is drawn from a fixed, known distribution; moreover, it is observed, and hence known, prior to engagement. We employ a shoot-look-shoot policy: upon deploying a weapon, the bomber is notified whether or not it successfully destroyed the target, whereupon it decides whether to move on to the next target in the sequence or re-engage the current one. We show that a weapon is dropped on a target iff the observed reward is no less than a stage- and state-dependent threshold value. We also show that the threshold is monotonically decreasing in the number of weapons and monotonically non-decreasing in the number of targets. Under some mild assumptions, we have extended these results to the case of an error-prone Battle Damage Assessment subject to both type-I and type-II errors, and to a partial-information game scenario where the target can shoot back at the bomber.
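To make the underlying recursion concrete, the sketch below computes the value function and threshold of the shoot-look-shoot dynamic program for a discretized reward distribution. The kill probability, reward support, and all function names are illustrative assumptions, not values from the study itself.

```python
# Minimal sketch of the shoot-look-shoot dynamic program (illustrative values).
from functools import lru_cache

KILL_PROB = 0.7                     # assumed P(weapon destroys target)
REWARDS = [1.0, 2.0, 5.0, 10.0]     # assumed discretized reward support
PROBS   = [0.4, 0.3, 0.2, 0.1]      # corresponding probabilities

@lru_cache(maxsize=None)
def V(n, m):
    """Expected future reward with n weapons and m unvisited targets,
    before the next target's reward is observed."""
    if n == 0 or m == 0:
        return 0.0
    return sum(p * W(n, m, r) for r, p in zip(REWARDS, PROBS))

@lru_cache(maxsize=None)
def W(n, m, r):
    """Value at the current target once its reward r is observed: either
    skip it, or shoot and, on a miss, face the same choice with n-1 weapons."""
    skip = V(n, m - 1)
    if n == 0:
        return skip
    shoot = KILL_PROB * (r + V(n - 1, m - 1)) + (1 - KILL_PROB) * W(n - 1, m, r)
    return max(skip, shoot)

def threshold(n, m):
    """Smallest reward in the support for which shooting beats skipping."""
    for r in sorted(REWARDS):
        if W(n, m, r) > V(n, m - 1):
            return r
    return float('inf')
```

Tabulating threshold(n, m) over a grid of n and m gives a quick numerical check of the monotonicity properties established analytically.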
This work focused on the optimization of perimeter patrol operations undertaken by UAVs operating in a stochastic environment. In this base defense scenario, camera-equipped UAVs patrol a defended perimeter, with incursions detected by Unattended Ground Sensors (UGSs) placed along the perimeter. The UAVs are tasked with patrolling the perimeter and with servicing the alarms that go off when a UGS detects an incursion. The UAVs have no sensing or target recognition capability of their own: sensing is done by the UGSs, and remotely located operators perform target recognition on video feeds sent back by the UAVs. The UAVs autonomously decide which portion of the perimeter to patrol next and/or which alarm to service next and for how long, based on their current locations and the information gleaned from the UGSs. The problem is posed as a large-scale Markov Decision Process (MDP), for which existing exact solution methods do not scale. My collaborators and I devised a state aggregation-based approximation strategy that yields provably good sub-optimal solutions to this problem. During the course of our study, we also established tractable upper- and lower-bounding LP methods for more general reward maximization problems. This work resulted in one of six papers selected for a semi-plenary presentation at the 2012 ASME Dynamic Systems & Control Conference. Moreover, autonomous patrol with two UAVs was successfully flight tested as part of AFRL’s Intelligent Control & Evaluation of Teams (ICE-T) program at Vandenberg AFB, CA.
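As a toy illustration of the aggregation idea (not the bounding scheme from the paper itself), the sketch below hard-aggregates an MDP according to a given partition of the state space, solves the small aggregated model by value iteration, and lifts the result back to the original states. All names and sizes are assumptions.

```python
# Toy hard-aggregation of an MDP (illustrative; not the paper's bounding scheme).
import numpy as np

def aggregated_values(P, R, partition, gamma=0.95, iters=500):
    """P: (A, S, S) transition tensor; R: (A, S) rewards;
    partition: length-S integer array mapping each state to a cluster id.
    Returns an approximate value for every original state."""
    A, S, _ = P.shape
    K = partition.max() + 1
    M = np.zeros((S, K))
    M[np.arange(S), partition] = 1.0        # membership matrix
    Wgt = M / M.sum(axis=0)                 # uniform weights within clusters
    P_agg = np.einsum('sk,ast,tj->akj', Wgt, P, M)   # (A, K, K)
    R_agg = R @ Wgt                                  # (A, K)
    V = np.zeros(K)
    for _ in range(iters):                  # value iteration on the small MDP
        V = (R_agg + gamma * (P_agg @ V)).max(axis=0)
    return M @ V                            # each state inherits its cluster value
```

In the patrol setting, one natural choice would be to group states by the UAV's position on the perimeter, collapsing the alarm-related state components that blow up the state space.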
This work involves a pursuit-evasion (differential game) scenario in which an intruder (a ground target) traveling on a road network is pursued by UAVs. The novelty (and added complexity) stems from the UAVs relying entirely on Unattended Ground Sensors (UGSs) for target location information. Moreover, the UGSs are not connected, so the information is both local and delayed, i.e., it becomes available only when a UAV visits the vicinity of a UGS. My collaborators and I devised minimum-time capture algorithms for a scenario where the intruder is restricted to a Directed Acyclic Graph (DAG) and provided scalable sub-optimal solution methods. This work has been successfully flight tested on multiple occasions at Wilmington Air Park, OH. A key component of the work is the realization of multiple UAVs cooperating in the isolation task. We recently completed a two-UAV demonstration at Camp Atterbury, IN, where the UAVs successfully tracked down a ground target (an SUV) and video-captured it via on-board cameras. This was a first-of-its-kind demonstration, wherein two UAVs made autonomous decisions on-board, communicated with ground-based sensors, and cooperatively captured a moving ground target with zero human intervention.
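The crux is the delayed, local information: a UAV that reads a UGS learns only where the intruder was some time ago. Below is a minimal sketch of the bookkeeping this forces, assuming unit edge-traversal times on the DAG; the flight-tested algorithms handle far richer timing and cooperative UAV motion, and the names here are illustrative.

```python
# Forward-propagate where a delayed UGS observation allows the intruder to be.
def reachable_set(dag, start, elapsed):
    """dag: dict mapping node -> list of successor nodes (acyclic).
    Returns the nodes an intruder that passed `start` `elapsed` time steps
    ago could occupy now, assuming it traverses one edge per step and
    stays put at a dead end."""
    frontier = {start}
    for _ in range(elapsed):
        nxt = set()
        for u in frontier:
            succ = dag.get(u, [])
            nxt.update(succ if succ else [u])
        frontier = nxt
    return frontier
```

A pursuer can then steer toward nodes that cut off every path leaving this set, which is roughly the role the cooperating UAVs play in the isolation task.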
This work incorporates a novel mixed-initiative control system for Intelligence, Surveillance and Reconnaissance (ISR) operations and involves optimal human-machine teaming. The scenario entails a camera-equipped UAV sequentially overflying geo-located objects of interest, each of which must be classified as either a true or false target by a human operator. The vehicle is allowed a pre-specified number of revisits, so that an object can be looked at a second time under better viewing conditions. The overarching goal is to correctly classify the objects while minimizing the false-alarm and missed-detection rates. We designed a stochastic controller that computes if and when a revisit is necessary, as well as the optimal revisit state. The novelty here is that the critical task of detection/pattern recognition is assigned to the human operator, whereas optimal decision making is entrusted to the machine, i.e., the automation can overrule the operator. The stochastic dynamic programming-based decision maker is informed about the performance of the human operator via an empirical human classifier model. Our work involved extensive experiments (a rarity in human factors studies) that helped generate the human model and validate the performance of the closed-loop classification system, which is shown to be significantly better than the operator-only system.
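As a simplified illustration of how such an operator model can drive the revisit decision (the fielded controller solves a stochastic dynamic program; the confusion-matrix entries and threshold band below are assumptions for illustration):

```python
# Bayes update with an empirical operator model, plus a revisit rule (illustrative).
P_SAY_TRUE = {True: 0.85, False: 0.20}   # assumed P(operator calls "true" | truth)

def posterior_true(prior, says_true):
    """Posterior P(object is a true target) after one operator call."""
    like_t = P_SAY_TRUE[True] if says_true else 1.0 - P_SAY_TRUE[True]
    like_f = P_SAY_TRUE[False] if says_true else 1.0 - P_SAY_TRUE[False]
    num = like_t * prior
    return num / (num + like_f * (1.0 - prior))

def decide(prior, says_true, revisits_left, lo=0.2, hi=0.8):
    """Declare a label when the posterior is decisive; otherwise spend a
    revisit. Note the automation may overrule the operator's raw call."""
    p = posterior_true(prior, says_true)
    if revisits_left > 0 and lo < p < hi:
        return 'revisit', p
    return ('true_target' if p >= 0.5 else 'false_target'), p
```

In the actual controller, such thresholds are not fixed constants; the dynamic program weighs the remaining revisit budget against the expected gain from a second, better-conditioned look.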
At GE Global Research, I was involved in two multi-million-dollar optimization projects: one on optimal train control (auto-pilot) for heavy-haul freight trains, and the other on optimal positioning (micro-siting) of wind turbines to maximize energy yield. The work resulted in three patents (two granted, one pending) and two successful commercial products: GE Transportation's game-changing Trip Optimizer™ and GE Energy's WindLAYOUT℠ optimization service. I am glad to note that the Trip Optimizer product has generated multi-billion-dollar revenue for GE and saved more than a billion gallons of fuel to date (news article).
This work relates to combinatorial optimization problems arising from persistent ISR operations, where multiple autonomous agents must service multiple tasks with different priorities. The problem involves both task scheduling (which agent does what task) and route planning (which agent goes to what location). We have implemented novel, scalable methods based on Mixed Integer Linear Programming (MILP) and stochastic dynamic programming in real-world scenarios involving multiple autonomous UAVs.
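A minimal sketch of the assignment core of such a formulation, written with the PuLP modeling library for illustration (the agents, costs, priorities, and single-assignment structure are assumptions, not the fielded formulation):

```python
# Toy MILP: assign prioritized ISR tasks to agents (illustrative data).
import pulp

agents = ['uav1', 'uav2']
tasks = ['t1', 't2', 't3']
priority = {'t1': 3.0, 't2': 1.0, 't3': 2.0}          # assumed task priorities
cost = {(a, t): 1.0 for a in agents for t in tasks}   # assumed travel costs

prob = pulp.LpProblem('isr_assignment', pulp.LpMaximize)
x = pulp.LpVariable.dicts('assign', (agents, tasks), cat='Binary')

# Objective: priority-weighted service reward net of travel cost.
prob += pulp.lpSum((priority[t] - cost[a, t]) * x[a][t]
                   for a in agents for t in tasks)
for t in tasks:                      # each task served at most once
    prob += pulp.lpSum(x[a][t] for a in agents) <= 1
for a in agents:                     # simple per-agent workload cap
    prob += pulp.lpSum(x[a][t] for t in tasks) <= 2

prob.solve()
assignment = [(a, t) for a in agents for t in tasks if x[a][t].value() > 0.5]
```

The real formulations couple such assignment variables with routing and timing constraints, which is where the combinatorial difficulty, and hence the need for scalable methods, comes from.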