We consider a group of Unmanned Aerial Vehicles (UAVs) flying over a target region to provide communication services to ground users (left figure above). Instead of assuming a fixed set of UAVs, we consider a dynamic UAV crew and a time-varying user distribution, and investigate proactive self-regulation of the UAV-based communication network under such dynamics. We target joint UAV trajectory design and radio resource management strategies that identify an upcoming change and proactively manage the UAVs whenever the UAV crew or the user distribution is about to change, rather than reacting passively after the change. The problem is formulated as a constrained non-convex optimization problem and tackled in a deep reinforcement learning (DRL) framework using the state-of-the-art deep deterministic policy gradient (DDPG) technique. To improve learning convergence, an asynchronous parallel training structure is exploited to increase exploration and reduce the correlation among sampled experiences within the same batch (right figure above).
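For concreteness, below is a minimal sketch of the core DDPG update in PyTorch, assuming a flat state vector and a bounded continuous action (e.g., a trajectory heading). The dimensions, network sizes, and hyperparameters are illustrative assumptions, not the project's actual settings.

```python
# Minimal DDPG update sketch (PyTorch). All sizes/hyperparameters are
# illustrative assumptions, not the values used in the project.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, GAMMA, TAU = 8, 2, 0.99, 0.005

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

actor, critic = mlp(STATE_DIM, ACTION_DIM), mlp(STATE_DIM + ACTION_DIM, 1)
actor_t, critic_t = mlp(STATE_DIM, ACTION_DIM), mlp(STATE_DIM + ACTION_DIM, 1)
actor_t.load_state_dict(actor.state_dict())
critic_t.load_state_dict(critic.state_dict())
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s2, done):
    """One update on a sampled mini-batch; r and done have shape (batch, 1)."""
    with torch.no_grad():  # bootstrap the TD target from the target networks
        q_next = critic_t(torch.cat([s2, torch.tanh(actor_t(s2))], dim=1))
        y = r + GAMMA * (1 - done) * q_next
    q = critic(torch.cat([s, a], dim=1))
    critic_loss = nn.functional.mse_loss(q, y)      # critic regression toward y
    opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

    # Actor ascends the critic's valuation of its own (deterministic) actions.
    actor_loss = -critic(torch.cat([s, torch.tanh(actor(s))], dim=1)).mean()
    opt_a.zero_grad(); actor_loss.backward(); opt_a.step()

    for net, tgt in ((actor, actor_t), (critic, critic_t)):  # soft target update
        for p, pt in zip(net.parameters(), tgt.parameters()):
            pt.data.mul_(1 - TAU).add_(TAU * p.data)
```

In the asynchronous parallel structure, several such workers would explore the environment simultaneously and feed a shared replay buffer, which is what decorrelates the experiences sampled within a batch.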
The "proactive" self-regulation strategy in the last project is by nature a passive strategy as the network cannot control the change in UAVs or users, but passively accept the upcoming change and make every effort to minimize the loss or maximize the benefit from the change. In this project, we consider a UAV communication network where UAVs are equipped with solar charging capabilities. Leveraging on the time-variabilities of the user traffic demand, certain UAVs can be deliberately selected in one time slot to elevate high in the sky for high-efficiency solar charging when the user traffic is light, even if they are not in bad need for charging. The charging UAVs can be called back later when necessary. In this way, a self-sustainable UAV network can be established, targeting various tasks subject to the task requirements and sustainability constraints.
Collaborator: Dr. Miao Wang, Miami University, OH
In this project, we aim to build a prototype multi-drone platform as illustrated above. The platform is composed of a group of drones, each equipped with multiple sensors (e.g., camera, GPS, ultrasonic sensor) and the capability to communicate with other drones. More importantly, each drone has a programmable on-board unit for autonomous decision making. The drones are expected to make decisions based on their local observations and the information exchanged with other drones, so as to perform distributed yet coordinated tasks (e.g., maximizing ground user coverage). Simple yet effective distributed algorithms will be designed, among which multi-agent Q learning is one promising approach (see the sketch below). The project will serve as a valuable complement to the above simulation-based projects. Demos and posters are provided below to show our current progress.
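As a sketch of how multi-agent Q learning could run on the on-board units, the snippet below gives each drone an independent tabular Q table over its local observations; the observation/action encodings and hyperparameters are illustrative assumptions.

```python
# Minimal independent multi-agent Q learning sketch: each drone keeps its own
# tabular Q over (local observation, action) pairs and learns from its own
# reward signal. Encodings and hyperparameters are illustrative assumptions.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

class DroneAgent:
    def __init__(self, actions):
        self.actions = actions              # e.g., move N/S/E/W or hover
        self.q = defaultdict(float)         # Q[(obs, action)], default 0.0

    def act(self, obs):
        if random.random() < EPSILON:       # epsilon-greedy exploration
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(obs, a)])

    def learn(self, obs, action, reward, next_obs):
        best_next = max(self.q[(next_obs, a)] for a in self.actions)
        td_error = reward + GAMMA * best_next - self.q[(obs, action)]
        self.q[(obs, action)] += ALPHA * td_error
```

Coordination then comes from the reward design and from the exchanged information folded into each drone's observation, rather than from any central controller.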
[Demo 1: The Hovering of a Self-built Drone] Project Done by Yang Xu and Rui Yang
[Demo 2: Tello Drone Face Tracking] Project Done by Jie Feng
[Demo 3: Tello Drone Gesture Recognition] Project Done by Jie Feng
[Demo 4: Tello Drone Posture Recognition] Project Done by Jie Feng
[Check out our LATEST Video! Drone Hovering with Multiple Modes] Project Done by Chandra Siddhartha Geddam and Mohammad Hasan
Collaborator: Dr. Xiaopeng Zhao, University of Tennessee, Knoxville
Worldwide, approximately 50 million people lived with dementia in 2018. Reminiscence therapy (RT), the most popular therapeutic intervention for persons with dementia (PwDs), exploits the PwDs' early memories and experiences, usually with memory triggers familiar to the PwDs (e.g., photographs, music, or videos), to evoke memory and stimulate conversation, so that the decline of their cognitive capabilities can be effectively slowed or deferred. A physically embodied social robot capable of non-verbal interaction is believed to enable more intuitive, effective, and engaging memory triggers during RT, thus stimulating more memory recall and conversation. In addition, robot-assisted RT is a promising way to cope with the increasing number of PwDs and relieve the stress on caregivers, thanks to the consistent execution and indefatigable repeatability of robots.
In this project, we first established a pervasively applicable simulation model for PwDs to i) integrate a comprehensive list of the major factors impacting a PwD's behaviors during RT, and ii) accurately characterize the probabilistic transitions between the PwD's mental states under different robotic actions. Based on the model, a revised Q learning algorithm was developed to obtain the optimal conversation strategy for the robot: stimulate the PwDs to talk as much as possible while keeping them in a generally positive mental state. In addition, the strategy offers the patients open choices to continue, stop, or change the topic if they fall into consecutive negative states. This gives the patients a sense of control over the RT procedure, thus mitigating their mental stress. The Q learning is revised to incorporate the impact of the patients' choices on the previous state-action valuations (see the sketch below).
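A hedged sketch of that revision is given below: a standard tabular Q update, plus a term that folds the patient's choice (offered after consecutive negative states) back into the previous state-action value. The reward adjustments and names are illustrative assumptions, not the project's actual values.

```python
# Sketch of the revised tabular Q update: the patient's choice re-values the
# previous state-action pair. CHOICE_ADJUST values are illustrative assumptions.
ALPHA, GAMMA = 0.1, 0.9
CHOICE_ADJUST = {"continue": +1.0, "change_topic": 0.0, "stop": -1.0}

def revised_q_update(q, s, a, reward, s_next, actions, choice=None):
    """q: {(state, action): value}; choice is None unless the PwD was offered one."""
    if choice is not None:
        reward += CHOICE_ADJUST[choice]     # choice feeds back into the prior (s, a)
    best_next = max(q.get((s_next, a_n), 0.0) for a_n in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + ALPHA * (reward + GAMMA * best_next - old)
```

Under this scheme, a patient who elects to continue reinforces the robot's last conversational action, while a request to stop penalizes it, so the learned strategy gradually respects the patient's preferences.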
Collaborator: Dr. Zhijiang Ye, Miami University, OH