Hyper-SAMARL: Hypergraph-based Coordinating Task Allocation and Socially-aware Navigation in Multi-Human Multi-Robot System
Weizheng Wang, Aniket Bera, and Byung-Cheol Min
Submitted to ICRA 2025
Abstract
For a team of robots to work seamlessly and safely in human-filled public environments, it needs adaptive task allocation and socially-aware navigation that account for dynamic human behavior. Current approaches struggle with highly dynamic pedestrian movement and the need for flexible task allocation. We propose Hyper-SAMARL, a hypergraph-based system for multi-robot task allocation and socially-aware navigation, leveraging multi-agent reinforcement learning (MARL). Hyper-SAMARL models the environmental dynamics between robots, humans, and points of interest (POIs) using a hypergraph, enabling adaptive task assignment and socially-compliant navigation through a hypergraph diffusion mechanism. Our framework, trained with MARL, effectively captures interactions between robots and humans, adapting tasks based on real-time changes in human activity. Experimental results demonstrate that Hyper-SAMARL outperforms baseline models in terms of social navigation, task completion efficiency, and adaptability in various simulated scenarios.
Architecture of Hyper-SAMARL
The framework of Hyper-SAMARL is composed of four blocks. First, task information related to points of interest (POIs) and robot observations are encoded by the positional embedding encoder of the transformer as spatial-temporal input. Next, the hypergraph for multi-robot task allocation and socially-aware navigation (MR-TASN) is initialized using attention-based vertex features and Euclidean-based hyperedge features. The hypergraph diffusion mechanism is then employed to propagate vertex features across the hypergraph structure, ensuring balanced hypergraph dynamics. Finally, the diffused hypergraph features are decoded by the robot policy to generate macro-actions (MAs) and local actions (LAs), which are trained using the MAPPO algorithm.
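The second and third blocks can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each hyperedge groups a vertex with its nearest neighbours by Euclidean distance, and it replaces the learned diffusion mechanism with plain two-stage averaging (vertex to hyperedge, then hyperedge back to vertex). The names `knn_hyperedges` and `diffuse`, and the parameter `k`, are hypothetical.

```python
import math

def knn_hyperedges(positions, k):
    """Build one hyperedge per vertex: the vertex plus its k-1 nearest
    neighbours by Euclidean distance (assumes distinct positions)."""
    edges = []
    for p in positions:
        order = sorted(range(len(positions)),
                       key=lambda j: math.dist(p, positions[j]))
        edges.append(order[:k])
    return edges

def diffuse(features, edges, steps=1):
    """Propagate vertex features over the hypergraph by averaging.

    Assumption: unweighted mean aggregation stands in for the learned
    hypergraph diffusion described in the text."""
    n, dim = len(features), len(features[0])
    # Precompute which hyperedges each vertex belongs to.
    incident = [[] for _ in range(n)]
    for ei, members in enumerate(edges):
        for v in members:
            incident[v].append(ei)
    for _ in range(steps):
        # Stage 1: each hyperedge takes the mean of its member vertices.
        edge_feats = [
            [sum(features[v][d] for v in members) / len(members)
             for d in range(dim)]
            for members in edges
        ]
        # Stage 2: each vertex takes the mean of its incident hyperedges.
        features = [
            [sum(edge_feats[ei][d] for ei in inc) / len(inc)
             for d in range(dim)]
            for inc in incident
        ]
    return features

# Two well-separated pairs of agents; features mix only within each pair.
positions = [(0, 0), (1, 0), (10, 0), (11, 0)]
edges = knn_hyperedges(positions, k=2)
smoothed = diffuse([[1.0], [0.0], [4.0], [2.0]], edges)
```

In this toy example the two distant pairs form separate hyperedges, so features are shared within each pair and never leak across pairs. In the actual framework the vertex features would come from the attention encoder instead of hand-picked numbers.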
Framework of Spatial-Temporal Transformer and Multi-Modal Transformer [1]
Spatial-Temporal Transformer and Multi-Modal Transformer neural network framework: (a) the Spatial Transformer leverages a multi-head attention layer and a graph convolution network along the time dimension to represent spatial attention features and spatial relational features; (b) the Temporal Transformer utilizes multi-head attention layers to capture each individual agent’s long-term temporal attention dependencies; and (c) the Multi-Modal Transformer fuses heterogeneous spatial and temporal features via a multi-head cross-modal transformer block [2] and a self-transformer block [3] to abstract the uncertainty of multimodal crowd movements.
Hyper-SAMARL Algorithm
We present Hyper-SAMARL, a framework for adjustable task allocation and coordinated socially-aware navigation for multi-robot teams using MARL and hypergraph neural networks. Hyper-SAMARL leverages a hypergraph neural network to optimize both a dynamically adaptable task allocation (MA) strategy and a social navigation (LA) planner, both trained with MAPPO. Results from our simulation tests affirm its effectiveness, advancing multi-robot navigation.
Simulation Scenario
To address the MR-TASN (multi-robot task allocation and socially-aware navigation) task, we designed several simulated scenarios in which a group of robots is assigned to search for individual target POIs in a human-filled environment. Formally, the MR-TASN task is modeled as a Dec-POSMDP, which couples the execution of temporally extended macro-actions (MAs) for task allocation with the generation of primitive local actions (LAs) for social navigation.
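The coupling of macro-actions and local actions can be sketched for a single robot as a two-level control loop. This is a simplified illustration under stated assumptions, not the Dec-POSMDP formulation itself: there are no humans, no partial observability, and no other robots, and both learned policies are replaced by hand-written placeholders (nearest-POI selection for the MA, a straight-line step for the LA). All function names and parameters are hypothetical.

```python
import math

def choose_macro_action(robot_pos, open_pois):
    """Placeholder for the learned MA policy: pick the nearest open POI."""
    return min(open_pois, key=lambda name: math.dist(robot_pos, open_pois[name]))

def local_action(robot_pos, goal, max_step=0.5):
    """Placeholder for the learned LA policy: step straight toward the goal."""
    dx, dy = goal[0] - robot_pos[0], goal[1] - robot_pos[1]
    d = math.hypot(dx, dy)
    if d <= max_step:
        return (dx, dy)
    return (dx / d * max_step, dy / d * max_step)

def run_episode(robot_pos, pois, reach_radius=0.1, max_steps=200):
    """A macro-action persists over many time steps until its POI is reached;
    a local action is generated at every time step."""
    open_pois = dict(pois)
    visited, current = [], None
    for _ in range(max_steps):
        if not open_pois:
            break
        if current is None:  # previous macro-action terminated
            current = choose_macro_action(robot_pos, open_pois)
        goal = open_pois[current]
        ax, ay = local_action(robot_pos, goal)
        robot_pos = (robot_pos[0] + ax, robot_pos[1] + ay)
        if math.dist(robot_pos, goal) <= reach_radius:
            visited.append(current)
            del open_pois[current]
            current = None  # triggers a new macro-action next step
    return visited, robot_pos

visited, final_pos = run_episode((0.0, 0.0), {"a": (1.0, 0.0), "b": (3.0, 0.0)})
```

The point of the sketch is the timing: the macro-action is chosen only when the previous one terminates, while the local action is recomputed at every step.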
Learning Curve
We compare Hyper-SAMARL against two ablation models. RTA-SAMARL (random task allocation) replaces the MA actor-critic network in Hyper-SAMARL with the RTA strategy, while keeping the same training parameters for the socially-aware navigation planner, i.e., the LA actor-critic network, as in Hyper-SAMARL.
MLP-SAMARL removes the hypergraph neural network from Hyper-SAMARL and employs a multi-layer perceptron (MLP)-based actor-critic network for both task allocation and path planning.
More Test Case Visualizations
Dynamic Task Allocation Adaptation
In this configuration, points of interest (POIs) are assigned to agents in real time based on current conditions, availability, and system demands. The allocation is adaptive and can change during task execution, allowing the system to respond to varying workloads, agent performance, and environmental changes. Dynamic task allocation for a multi-robot system is complex because each assignment is a decision at the level of the whole system rather than of a single robot. This approach is common in dynamic or unpredictable environments where flexibility is crucial, especially environments with highly unpredictable pedestrian movement.
We illustrate that the task allocation capability, together with human-robot interaction (HRI) inference, not only avoids areas of higher potential collision risk but also improves the time efficiency of navigation.
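To make the idea of re-allocating POIs as pedestrians move more concrete, here is a small hand-written sketch. It is not the learned MA policy: it assumes a hypothetical cost equal to the straight-line travel distance plus a fixed penalty for every human standing close to that straight path, and it searches all assignments by brute force, which is only practical for a handful of robots. The names `allocate`, `weight`, and `radius` are illustrative.

```python
import itertools
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg2 = dx * dx + dy * dy
    if seg2 == 0:
        return math.dist(p, a)
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg2
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))

def pair_cost(robot, poi, humans, weight, radius):
    """Hypothetical cost: path length plus a penalty per nearby human."""
    crowd = sum(1 for h in humans
                if point_segment_distance(h, robot, poi) <= radius)
    return math.dist(robot, poi) + weight * crowd

def allocate(robots, pois, humans, weight=5.0, radius=1.0):
    """Return {robot index: POI index} minimising the total cost.
    Assumes there are at least as many POIs as robots."""
    best, best_cost = None, math.inf
    for perm in itertools.permutations(range(len(pois)), len(robots)):
        cost = sum(pair_cost(robots[r], pois[p], humans, weight, radius)
                   for r, p in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return dict(enumerate(best))

robots = [(0.0, 0.0), (0.0, 4.0)]
pois = [(10.0, 0.0), (10.0, 4.0)]
empty_scene = allocate(robots, pois, humans=[])
crowded_scene = allocate(robots, pois, humans=[(5.0, 0.0), (5.0, 4.0)])
```

With no humans present, each robot takes the POI directly ahead of it. Once a pedestrian stands on each direct path, the assignment swaps, trading a slightly longer route for lower exposure to the crowd. Re-running `allocate` whenever the human positions change gives a crude form of dynamic re-allocation.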
POI Generation Illustration
Video
References
[3] Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems 30 (2017).