Controllable swarm animation using deep reinforcement learning with a rule-based action generator
Zong-Sheng Wang* Chang Geun Song* Jung Lee* Jong-Hyun Kim** Sun-Jeong Kim*
(* : Hallym University, ** : Kangnam University)
IEEE Access 2022
Abstract : Swarm behavior in nature is a fascinating and complex phenomenon that has been studied extensively for decades. State-of-the-art rule-based methods can produce visually natural swarm animation; however, they still suffer from low control accuracy and unstable swarm behavior quality when the swarm is controlled by the user. In this study, we propose a deep reinforcement learning (DRL) based approach that generates high-quality swarm animation that reacts to real-time user control. A rule-based action generator (RAG) adapted to the actor-critic DRL method is presented to enhance DRL’s action exploration strategy. Various practical dynamic reward functions are also designed to train agents by rewarding swarm behaviors and penalizing misbehavior. The user controls the swarm by interacting with the swarm’s leader agent, for example by directly changing its speed or orientation, or by specifying a path consisting of waypoints. The second aim of this study is to improve the scalability of the trained policy. We introduce a new state observation for DRL, called the embedded features of swarm (EFS), which allows the trained policy to scale to a larger system than the one it was trained on. In the experiments, four different scenarios are designed to evaluate the control accuracy and the quality of the generated swarm behavior through metrics and visualization. Additionally, the experiments compare the performance of the proposed dynamic reward functions with that of fixed reward functions. Experimental results show that the proposed approach outperforms state-of-the-art methods in terms of swarm behavior quality and control accuracy.