MoDitector: Module-Directed Testing for Autonomous Driving Systems
This website provides the supplementary materials for the paper "MoDitector: Module-Directed Testing for Autonomous Driving Systems". The source code is publicly available in an anonymous repository: https://github.com/Shanicky-RenzhiWang/MoDitector
Testing Autonomous Driving Systems (ADS) is crucial for ensuring their safety, reliability, and performance. Although numerous testing methods can generate diverse and challenging scenarios to uncover potential vulnerabilities, they typically treat the ADS as a black box, focusing on identifying system-level failures such as collisions or near-misses without pinpointing the specific modules responsible for them. Understanding the root causes of failures is essential for effective debugging and subsequent system repair. We also observed that existing methods fall short in generating diverse failures that adequately exercise the distinct modules of an ADS, such as perception, prediction, planning, and control.
To bridge this gap, we introduce MoDitector, the first root-cause-aware testing method for ADS. Unlike previous approaches, MoDitector not only generates scenarios that lead to collisions but also reveals which specific module triggered each failure. It targets given modules, creating test scenarios that expose their weaknesses. Specifically, our approach designs module-specific oracles to ascertain module failures and employs a module-directed testing strategy combining module-specific feedback, adaptive seed selection, and mutation, which guides test generation toward module-specific failures. We evaluated MoDitector across four critical ADS modules and four testing scenarios. The results demonstrate that our method effectively and efficiently generates scenarios in which errors in the targeted modules are responsible for ADS failures: it generates 216.7 expected scenarios in total, while the best-performing baseline detects only 79.0.
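To make the workflow concrete, the sketch below outlines one plausible shape of such a module-directed fuzzing loop. It is illustrative only: the names `run_simulation`, `mutate_scenario`, `Trace`, and `FAILURE_THRESHOLD` are hypothetical placeholders, not MoDitector's actual API; please refer to the repository above for the real implementation.

```python
import random
from dataclasses import dataclass

# Hypothetical placeholders: the real simulator hooks, mutation operators,
# and module oracles are more involved; these stubs only illustrate the loop.

@dataclass
class Trace:
    collision: bool      # did the scenario end in a system-level failure?
    module_error: float  # oracle score: severity of the target module's misbehavior

def run_simulation(scenario) -> Trace:
    """Placeholder: execute the scenario against the ADS in a simulator."""
    raise NotImplementedError

def mutate_scenario(scenario):
    """Placeholder: module-directed mutation of scenario parameters."""
    raise NotImplementedError

FAILURE_THRESHOLD = 0.5  # illustrative cutoff for the module-specific oracle

def module_directed_fuzzing(seeds, budget=100):
    """Search for scenarios whose ADS failure is rooted in the target module."""
    corpus = [(s, 0.0) for s in seeds]  # (scenario, oracle feedback score)
    failures = []
    for _ in range(budget):
        # Adaptive seed selection: favor seeds whose previous runs pushed
        # the target module closer to failure.
        corpus.sort(key=lambda pair: pair[1], reverse=True)
        seed, _ = random.choice(corpus[: max(1, len(corpus) // 4)])

        candidate = mutate_scenario(seed)
        trace = run_simulation(candidate)

        # Module-specific oracle feedback guides subsequent iterations.
        corpus.append((candidate, trace.module_error))

        # Root-cause-aware check: keep only scenarios where the target
        # module's error is severe enough to explain the collision.
        if trace.collision and trace.module_error > FAILURE_THRESHOLD:
            failures.append(candidate)
    return failures
```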
Our approach represents a significant innovation in ADS testing by focusing on identifying and rectifying module-specific errors within the system, moving beyond conventional black-box failure detection.