AI-MSF Benchmark

This website provides supplementary materials for the paper "Benchmarking Robustness of AI-enabled Multi-sensor Fusion Systems: Challenges and Opportunities". It presents the detailed research workflow, qualitative visualizations of the corruptions, and experimental results omitted from the paper due to the page limit.

The website is organized as follows: 

Abstract

Multi-Sensor Fusion (MSF) based perception systems have become foundational to many industrial applications and domains, such as self-driving cars, robotic arms, and unmanned aerial vehicles. Over the past few years, rapid progress in data-driven artificial intelligence (AI) has driven a fast-growing trend of empowering MSF systems with deep learning techniques to further improve performance, especially for intelligent systems and their perception components. Although quite a few AI-enabled MSF perception systems and techniques have been proposed, few benchmarks focusing on MSF perception are publicly available to date. Given that many intelligent systems, such as self-driving cars, operate in safety-critical contexts where perception plays a central role, there is an urgent need for a deeper understanding of the performance and reliability of these MSF systems.

To bridge this gap, we take an early step in this direction and construct a public benchmark of AI-enabled MSF-based perception systems covering three commonly adopted tasks (i.e., object detection, object tracking, and depth completion). Based on this, to comprehensively understand the robustness and reliability of MSF systems, we design 14 common and realistic corruption patterns to synthesize large-scale corrupted datasets. We then systematically evaluate these systems at scale and identify the following key findings:

(1) existing AI-enabled MSF systems are not robust enough against corrupted sensor signals;
(2) small synchronization and calibration errors can lead to a crash of AI-enabled MSF systems;
(3) existing AI-enabled MSF systems are usually tightly coupled, such that bugs/errors from an individual sensor can crash the whole system;
(4) the robustness of MSF systems can be enhanced by improving fusion mechanisms.

Our results reveal the vulnerability of the current AI-enabled MSF perception systems, calling for researchers and practitioners to take robustness and reliability into account when designing AI-enabled MSF.

Research Workflow

Workflow summary of AI-enabled MSF benchmark construction, and high-level empirical study design 

As illustrated in the figure above, our empirical study follows three steps: (1) subject MSF collection, (2) corruption pattern design, and (3) evaluation and benchmarking. In the first step, we collect and filter the systems from a large number of papers. The details of sources and selection criteria are available in Empirical Study, and the introduction to system architectures and application tasks is provided in Benchmarks.

In the second step, we leverage fourteen common corruption patterns to synthesize realistic corrupted data to evaluate MSF systems' robustness. These corruption patterns can be naturally grouped into three categories: weather corruption, sensor corruption, and sensor misalignment. The design details and visualization are available in Corruption Patterns.
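As an illustration of how a sensor-corruption pattern can be synthesized, the sketch below adds Gaussian noise to a camera image and jitters the coordinates of a LiDAR point cloud. This is a minimal example in the spirit of the benchmark, not the paper's exact implementation; the severity-to-sigma mappings are assumptions chosen for illustration.

```python
import numpy as np

def corrupt_image_gaussian(image: np.ndarray, severity: int = 3) -> np.ndarray:
    """Add zero-mean Gaussian noise to an HxWx3 uint8 camera image.

    The sigma values per severity level are illustrative assumptions.
    """
    sigma = [4, 8, 12, 18, 26][severity - 1]
    noisy = image.astype(np.float64) + np.random.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def corrupt_lidar_gaussian(points: np.ndarray, severity: int = 3) -> np.ndarray:
    """Jitter the xyz coordinates of an Nx4 point cloud (x, y, z, intensity).

    Only the spatial coordinates are perturbed; intensity is left untouched.
    """
    sigma = [0.02, 0.04, 0.06, 0.08, 0.10][severity - 1]
    jittered = points.copy()
    jittered[:, :3] += np.random.normal(0.0, sigma, (points.shape[0], 3))
    return jittered
```

Higher severity levels widen the noise distribution, letting one sweep a system's accuracy degradation curve across the same scenes.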

In the third step, we evaluate the robustness of AI-enabled MSF from three aspects: corrupted signals, sensor misalignment, and system coupling. First, we investigate the potential risks AI-enabled MSF systems face under corrupted signals. Next, we evaluate the sensitivity of AI-enabled MSF to spatial and temporal misalignment. Then, we investigate how AI-enabled MSF systems behave when partially or completely losing a source of signals. Finally, based on our findings from RQ1-3, we investigate the unique advantages of each fusion mechanism and potential opportunities for improving the robustness of AI-enabled MSF systems. Detailed experimental results and findings are listed in Research Questions.
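To make the misalignment settings concrete, the sketch below shows one hypothetical way to inject them: spatial misalignment as a small yaw perturbation of a 4x4 LiDAR-to-camera extrinsic matrix, and temporal misalignment as a fixed frame delay between the two sensor streams. Both the function names and the perturbation scheme are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def rotate_extrinsic(T: np.ndarray, yaw_deg: float) -> np.ndarray:
    """Simulate spatial (calibration) misalignment by applying a small yaw
    rotation to a 4x4 LiDAR-to-camera extrinsic transform."""
    t = np.deg2rad(yaw_deg)
    Rz = np.array([[np.cos(t), -np.sin(t), 0.0, 0.0],
                   [np.sin(t),  np.cos(t), 0.0, 0.0],
                   [0.0,        0.0,       1.0, 0.0],
                   [0.0,        0.0,       0.0, 1.0]])
    return Rz @ T

def delay_frames(frames: list, delay: int) -> list:
    """Simulate temporal misalignment: pair each camera frame with the LiDAR
    sweep `delay` steps earlier, repeating the first sweep at the start."""
    return frames[:1] * delay + frames[:len(frames) - delay]
```

Sweeping the rotation angle (or the delay) then exposes how quickly a fusion system's accuracy collapses as calibration or synchronization drifts.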