Session VII (May 16, 1:30pm-3:00pm): Experimental Design in Transportation Studies, organized by Feng Guo
Title: Kernel Method Based Case Sampling for Safety Verification on Cyber-Physical System
Speaker: Chen Qian, Virginia Tech
Abstract: Testing the safety of Cyber-Physical Systems (CPS), such as automated driving systems, is crucial for their widespread adoption. However, safety tests are costly to run, so the cases to be tested must be selected carefully. This paper investigates how to select such cases so that they satisfy two essential criteria simultaneously: representativeness and criticality. Representativeness measures how well the selected cases fit the distribution of real-world situations, while criticality measures whether the selected cases cover rarely observed safety-critical scenarios. To meet both objectives, we introduce a novel sub-sampling method called Kernel Case Sub-sampling (KTCS). To serve the different testing targets of different stakeholders, we propose two versions of the algorithm: deterministic sampling (KTCS-D), which is well suited to reproducible standard tests, and stochastic sampling (KTCS-C), which is designed so that each round of safety tests comprises different cases that nevertheless share similar characteristics. Numerical simulations show that our methods outperform alternative approaches in both representativeness and criticality. We also apply our method to data from the Second Strategic Highway Research Program Naturalistic Driving Study, where the evaluation metrics consistently confirm its advantage in selecting cases. Our proposed methods address a critical engineering challenge: how to conduct appropriate tests on CPS. They have the potential to accelerate the development of safe CPS. Our work also has broader implications for case sub-sampling, particularly for ensuring that the selected points not only cover the entire data space but also remain representative of the original dataset.