System architecture for the MH-MR surveillance task
We developed a team-based surveillance system involving two human operators and six multi-robot platforms, as shown in the above image. The hardware configuration for the two operators in this user experiment is identical and mainly comprises the APM, which serves as the main interface for reading physiological signals and behavioral features and for producing the objective measurement of cognitive workload (CWL). The subjective and objective CWL measurements are then used as inputs to the AWAC to reallocate workload. In addition to this main interface, the sub-interfaces outlined below were used to conduct the team-based surveillance mission during the user experiment.
In the AMT, raw physiological signals from wearable biosensors are collected at a 100 Hz sampling rate via the ros2-foxy-wearable-biosensors package [1], while behavioral features from the facial camera view are extracted at a 30 Hz sampling rate.
For the physiological signals, we used two off-the-shelf wearable biosensors, the Empatica E4 and the Emotiv Insight:
Emotiv Insight: readings of 5-channel EEGs, power spectrum (theta, alpha, beta, and gamma), performance metrics, and motion data.
Empatica E4: readings of blood volume pulse (BVP), galvanic skin response (GSR), heart rate (HR), inter-beat interval (IBI), skin temperature (ST), and motion data.
For the behavioral data, we used a camera to record the operator's facial view:
Intel RealSense: extracting various features from the facial camera views, such as eye aspect ratio (EAR), facial action units, and facial expressions.
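As a rough illustration of how the AMT could consume these streams, the sketch below subscribes to two hypothetical ROS 2 topics; the topic names and message types are placeholders, not the actual interface of the ros2-foxy-wearable-biosensors package [1].

```python
# Minimal sketch of an AMT-style collector node.
# Assumptions: topic names and message types are placeholders, not the
# real interface of ros2-foxy-wearable-biosensors.
import rclpy
from rclpy.node import Node
from std_msgs.msg import Float32MultiArray


class AMTCollector(Node):
    """Collects raw physiological streams for downstream CWL prediction."""

    def __init__(self):
        super().__init__('amt_collector')
        # Hypothetical topics for the Empatica E4 and Emotiv Insight streams.
        self.create_subscription(Float32MultiArray, '/e4/bvp', self.on_bvp, 100)
        self.create_subscription(Float32MultiArray, '/insight/eeg', self.on_eeg, 100)

    def on_bvp(self, msg):
        self.get_logger().debug(f'BVP sample: {msg.data}')

    def on_eeg(self, msg):
        self.get_logger().debug(f'EEG sample: {msg.data}')


def main():
    rclpy.init()
    rclpy.spin(AMTCollector())


if __name__ == '__main__':
    main()
```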
The APM predicts cognitive load from the objective measurements of physiological and behavioral signals, ranging from low to medium to high. To build the APM, we adopt Husformer [2], an end-to-end multimodal transformer framework for multimodal human cognitive load recognition. The APM takes the multimodal bio-signals collected through the AMT as input and outputs the objective cognitive load level, i.e., low, medium, or high, at 100 Hz.
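A minimal sketch of the APM's prediction step is given below; the placeholder classifier stands in for the trained Husformer model [2], whose actual interface and input format differ.

```python
# Sketch of the APM's prediction step. The real APM uses Husformer [2];
# here a placeholder classifier stands in for it so the flow is runnable.
import numpy as np

CWL_LEVELS = ['low', 'medium', 'high']


class PlaceholderHusformer:
    """Stand-in for the trained Husformer model (not the real API)."""

    def __call__(self, fused_inputs):
        # Return dummy logits over the three CWL classes.
        return np.random.rand(len(CWL_LEVELS))


def predict_cwl(model, eeg_window, e4_window, face_window):
    """Map one window of multimodal signals to a CWL label (low/medium/high)."""
    logits = model([eeg_window, e4_window, face_window])
    return CWL_LEVELS[int(np.argmax(logits))]


if __name__ == '__main__':
    model = PlaceholderHusformer()
    # 1-second windows: 100 Hz bio-signals, 30 Hz facial features (shapes illustrative).
    print(predict_cwl(model, np.zeros((100, 5)), np.zeros((100, 4)), np.zeros((30, 20))))
```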
The CCTV GUI program is the direct interface between the participant and the multi-robot system for conducting the surveillance task, as shown in the right image below. It displays multiple camera-view windows and the team score while the task is performed. The GUI then enters the main experiment step, in which the participant monitors the multiple camera views simultaneously (the CCTV monitoring task) to find abnormal objects on the screens. Only the team scores are displayed on the GUI, to reduce peer and time pressure.
The human operator performs the CCTV monitoring task of detecting abnormal objects in the streaming video from the multi-robots displayed on the GUI program. The ODS checks whether the human operator has detected abnormal or normal objects and provides audio feedback to the operator based on the result.
The mission score server (MSS) manages rewards (+1 point) and penalties (-3 points) based on the participant's performance in the CCTV monitoring task. The MSS works in conjunction with the ODS and the GUI program during task execution.
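The scoring rule can be sketched as follows; the mapping of +1 to a correct detection and -3 to an incorrect one (miss or false alarm), as well as the function names, are illustrative assumptions.

```python
# Sketch of the MSS scoring rule (+1 reward, -3 penalty).
# The correct/incorrect mapping and names are illustrative assumptions.
REWARD = 1
PENALTY = -3


def update_score(score, operator_flagged_abnormal, object_is_abnormal):
    """Return the updated team score after one detection event."""
    correct = operator_flagged_abnormal == object_is_abnormal
    return score + (REWARD if correct else PENALTY)


if __name__ == '__main__':
    score = 0
    score = update_score(score, True, True)    # correct detection: +1
    score = update_score(score, True, False)   # false alarm: -3
    print(score)  # -2
```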
The MRS consists of six ROS2-based multi-robot platforms, SMARTmBOT [4], an open-source and low-cost mobile robot platform. The robots' locations are tracked by a Vicon motion capture system using reflective markers attached to the top of each robot. To perform the surveillance mission, a pure-pursuit control algorithm was employed, allowing each robot to travel repeatedly between its start and goal positions.
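A rough sketch of the pure-pursuit step each robot could run is shown below; the lookahead distance, speed, and goal-switching logic are illustrative assumptions rather than the parameters used in the experiment.

```python
# Rough pure-pursuit sketch for waypoint following between start and goal
# positions. The lookahead distance, speed, and thresholds are illustrative
# assumptions, not the experiment's actual parameters.
import math


def pure_pursuit_cmd(pose, goal, lookahead=0.3, linear_speed=0.15):
    """Compute (v, w) toward a goal point from the robot pose (x, y, yaw)."""
    x, y, yaw = pose
    dx, dy = goal[0] - x, goal[1] - y
    # Transform the goal into the robot frame.
    local_x = math.cos(-yaw) * dx - math.sin(-yaw) * dy
    local_y = math.sin(-yaw) * dx + math.cos(-yaw) * dy
    if math.hypot(local_x, local_y) < 0.05:
        return 0.0, 0.0  # goal reached; the mission script swaps start and goal
    # Pure-pursuit curvature toward the lookahead point.
    curvature = 2.0 * local_y / (lookahead ** 2)
    return linear_speed, linear_speed * curvature


if __name__ == '__main__':
    print(pure_pursuit_cmd((0.0, 0.0, 0.0), (1.0, 0.5)))
```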
Here is a supplementary video to introduce the SMARTmBOT: https://youtu.be/uniaTWcCeDM
A diagram of the software architecture used in the SMARTmBOT
Potential applications using the SMARTmBOT: (a) a color-changing SMARTmBOT case using RGB LED strips, (b) a miniature robot arm mounted on the SMARTmBOT, (c) an example of a rendezvous algorithm for swarm research, and (d) an example of a leader-follower algorithm for HRI research. A demo video showing each of these applications is available in the supplementary video.
The AWAC estimates human performance from the subjective and objective cognitive loads, finds the best human performance based on the Yerkes-Dodson law, and allocates workloads based on the human operator's performance.
The relationship between CWL and performance is modeled following the Yerkes-Dodson law, using two relationships: one between the ISA (subjective CWL) and the mission score, i.e., performance (red line), and one between the predicted cognitive load and the mission score (green line). Human performance is then predicted using both relationships.
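One simple way to realize these relationships, sketched under the assumption that an inverted-U (quadratic) fit is adequate, is shown below; the sample data are placeholders, not experimental results.

```python
# Sketch: fit inverted-U (quadratic) curves, in the spirit of the
# Yerkes-Dodson law, relating workload measures to the mission score.
# The sample data below are placeholders, not experimental results.
import numpy as np


def fit_inverted_u(workload, score):
    """Return quadratic coefficients (a, b, c) of score ~ a*w^2 + b*w + c."""
    return np.polyfit(workload, score, deg=2)


def predict_score(coeffs, workload):
    return np.polyval(coeffs, workload)


if __name__ == '__main__':
    isa = np.array([1, 2, 3, 4, 5], dtype=float)     # subjective CWL (ISA)
    score = np.array([2.0, 4.5, 5.2, 4.0, 1.8])      # placeholder mission scores
    red_curve = fit_inverted_u(isa, score)            # ISA vs. score ("red line")
    print(predict_score(red_curve, 3.0))
```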
The next CWL is predicted based on workload transitions. From the results of Pearson's correlation analysis (right image), both the ISA and NASA-TLX ratings show a moderate positive correlation with the number of assigned cameras (#cam). The next ISA and prediction results are therefore predicted from the change in workload (∆c, the change in #cam), and the next performance is predicted from the changed ISA and prediction results.
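A minimal sketch of this transition step is given below, assuming a linear ISA-vs-#cam model; the data and coefficients are placeholders.

```python
# Sketch of the CWL-transition step: a linear model maps the change in the
# number of cameras (delta_cam) to a change in ISA, consistent with the
# reported positive ISA-#cam correlation. All data here are placeholders.
import numpy as np


def fit_isa_vs_cam(num_cams, isa):
    """Fit ISA ~ slope * #cam + intercept."""
    slope, intercept = np.polyfit(num_cams, isa, deg=1)
    return slope, intercept


def predict_next_isa(current_isa, delta_cam, slope):
    """Shift the current ISA by the workload change (delta_cam) scaled by the fitted slope."""
    return current_isa + slope * delta_cam


if __name__ == '__main__':
    cams = np.array([1, 2, 3, 4, 5, 6], dtype=float)
    isa = np.array([1.2, 1.8, 2.6, 3.1, 3.9, 4.4])   # placeholder ratings
    slope, _ = fit_isa_vs_cam(cams, isa)
    print(predict_next_isa(current_isa=3.0, delta_cam=-2, slope=slope))
```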
The workload reallocation is formulated as a deep reinforcement learning (DRL) problem with a state space S, an action space A, and a reward R. Using Proximal Policy Optimization (PPO), we trained the DRL model in an OpenAI Gym environment to obtain the optimal policy π.
Algorithm of the DRL learning model
The policy was trained with PPO for a total of 1M samples, and the reward converged to about 0.94.
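The training setup can be sketched roughly as below, assuming a toy Gym environment and the stable-baselines3 PPO implementation; the state, action, and reward definitions here are illustrative and are not the exact formulation used for the AWAC.

```python
# Sketch of the AWAC training setup: a toy Gym environment whose state holds
# the operators' CWL estimates and whose action reallocates cameras, trained
# with PPO. The environment details and the use of stable-baselines3 (with the
# classic gym API) are illustrative assumptions, not the experiment's setup.
import gym
import numpy as np
from gym import spaces
from stable_baselines3 import PPO


class WorkloadAllocEnv(gym.Env):
    """Toy environment: keep both operators' CWL near a moderate level."""

    def __init__(self):
        self.observation_space = spaces.Box(0.0, 1.0, shape=(2,), dtype=np.float32)
        self.action_space = spaces.Discrete(3)  # move a camera to op1 / keep / move to op2
        self.state = np.array([0.5, 0.5], dtype=np.float32)

    def reset(self):
        self.state = np.random.uniform(0.2, 0.8, size=2).astype(np.float32)
        return self.state

    def step(self, action):
        shift = {0: (-0.1, 0.1), 1: (0.0, 0.0), 2: (0.1, -0.1)}[int(action)]
        self.state = np.clip(self.state + shift, 0.0, 1.0).astype(np.float32)
        # Reward is highest when both CWLs sit near a moderate level (0.5).
        reward = float(1.0 - np.abs(self.state - 0.5).mean())
        return self.state, reward, False, {}


if __name__ == '__main__':
    model = PPO('MlpPolicy', WorkloadAllocEnv(), verbose=0)
    model.learn(total_timesteps=10_000)  # the full model was trained on ~1M samples
```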
[1] Wonse Jo, Robert Wilson, Jaeeun Kim, Steve McGuire, and Byung-Cheol Min, "Toward a Wearable Biosensor Ecosystem on ROS 2 for Real-time Human-Robot Interaction Systems", 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Workshop on HMRS 2021: Cognitive and Social Aspects of Human Multi-Robot Interaction, Prague, Czech Republic, Sep 27 – Oct 1, 2021. (PDF, Video, GitHub) [Excellent Paper Award]
[2] Ruiqi Wang*, Wonse Jo*, Dezhong Zhao, Weizheng Wang, Baijian Yang, Guohua Chen, and Byung-Cheol Min (* equal contribution), "Husformer: A Multi-Modal Transformer for Multi-Modal Human State Recognition", IEEE Transactions on Cognitive and Developmental Systems, Early Access, 2024. (PDF, GitHub)
[3] Wonse Jo, Go-Eum Cha, Dan Foti, and Byung-Cheol Min, "SMART-TeleLoad: A New Graphic User Interface to Generate Affective Loads for Teleoperation", SoftwareX, Vol. 26, 101757, May 2024. (PDF, Video, GitHub)
[4] Wonse Jo, Jaeeun Kim, Ruiqi Wang, Jeremy Pan, Revanth Krishna Senthilkumaran, and Byung-Cheol Min, "SMARTmBOT: A ROS2-based Low-cost and Open-Source Mobile Robot Platform", arXiv preprint, arXiv:2203.08903, 2022. (PDF, Video, GitHub)