AI-Enabled CPS Benchmark

This website provides the supplementary materials for the paper "When Cyber-Physical Systems Meet AI: A Benchmark, an Evaluation, and a Way Forward". It presents the detailed research workflow and experimental results that were omitted from the paper due to the page limit.

The website is organized as follows:

  • Home page: Explains why an industry-level AI-enabled CPS benchmark is urgently needed, followed by an illustration and an introduction of our research workflow.

  • Empirical Study: We present the subject CPS collection process, namely, how we search for, filter, and select candidate subject CPS from multiple sources.

  • AI-CPS Construction: In this section, we introduce how we use the Reinforcement Learning Toolbox in MATLAB to construct RL controllers as substitutes for traditional controllers. We also present the training procedures of Deep Reinforcement Learning (DRL) controllers, including environment setup, reward function design, termination logic, agent selection, and training configurations.

  • Benchmark: This section contains all subject CPS we collected for benchmarking with their environment introductions, functionality descriptions, control system explanations, specifications, and image illustrations.

  • RQ1: AI Performance: In this section, we conduct a comprehensive performance evaluation of both traditional and AI-based controllers to better understand their respective advantages and limitations. This section also describes the universal evaluation metrics we used and the detailed experimental results.

  • RQ2: Falsification: This section examines how different falsification tools perform when applied to AI-enabled CPS. The effectiveness of the falsification tools is analyzed based on the experimental results.

  • RQ3: Hybrid Control: In this section, we introduce the hybridization of controllers through multiple combination methods. This early exploration investigates whether hybrid control systems are promising in the context of AI-enabled CPS.

  • Concrete Example: One system (Adaptive Cruise Control) has been selected as an example to show the detailed design, training, evaluation, and hybridization workflow.

  • Summary: We summarize the discussions, challenges, and opportunities of this work.


Cyber-physical systems (CPS) have been broadly deployed in safety-critical domains such as automotive systems, avionics, and medical devices. In recent years, Artificial Intelligence (AI) has been increasingly adopted to control CPS. Despite the popularity of AI-enabled CPS, few benchmarks are publicly available. There is also a lack of deep understanding of the performance and reliability of AI-enabled CPS across different industrial domains. To bridge this gap, we create a public benchmark of industry-level CPS in seven domains and build AI controllers for them via state-of-the-art deep reinforcement learning (DRL) methods. Based on that, we further perform a systematic evaluation of these AI-enabled systems against their traditional counterparts to identify the current challenges and explore future opportunities.

Our key findings are that (1) AI controllers do not always outperform traditional controllers, (2) existing CPS testing techniques (falsification, specifically) fall short in analyzing AI-enabled CPS, and (3) building a hybrid system that strategically combines and switches between AI controllers and traditional controllers can achieve better performance across different domains. Our results highlight the need for new testing techniques for AI-enabled CPS and for more investigation into hybrid CPS to achieve optimal performance and reliability.

Research Workflow

Workflow summary of AI-enabled CPS dataset and benchmark construction, and high-level empirical study design

As illustrated in the figure above, our empirical study follows three steps: (1) subject CPS collection, (2) AI-CPS construction, and (3) evaluation and benchmarking. In the first step, we collect a large number of candidate subject CPS and filter them down to the selected systems. The details of the sources and selection criteria are available in Empirical Study, and the introduction to the system dynamics, controllers, functionalities, and requirements is provided in Benchmark.

In the second step, after analyzing the system requirements and structures, we deploy DRL controllers on each system with multiple agent types. This step covers the design of the reward function, the DNN configurations, the agent settings, and the training setup. The explanation and an example are available in AI-CPS Construction and Concrete Example.
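To give a rough sense of what reward function design and termination logic involve, the following is a minimal Python sketch for a hypothetical car-following task. It is not the benchmark's implementation (the controllers here are built with MATLAB's Reinforcement Learning Toolbox); the `StepInfo` fields, the weights, and the thresholds are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """One simulation step of a hypothetical car-following task."""
    speed_error: float  # target speed minus ego speed (m/s)
    gap: float          # distance to the lead vehicle (m)
    accel_cmd: float    # commanded acceleration (m/s^2)


# Illustrative constants; assumed values, not taken from the benchmark.
SAFE_GAP = 10.0
W_TRACK = 1.0
W_EFFORT = 0.1
CRASH_PENALTY = 100.0


def is_done(step: StepInfo) -> bool:
    """Termination logic: end the episode once the gap is lost."""
    return step.gap <= 0.0


def reward(step: StepInfo) -> float:
    """Penalize tracking error and control effort; punish unsafe gaps."""
    r = -W_TRACK * step.speed_error ** 2 - W_EFFORT * step.accel_cmd ** 2
    if step.gap < SAFE_GAP:
        r -= SAFE_GAP - step.gap  # shaped penalty inside the unsafe zone
    if is_done(step):
        r -= CRASH_PENALTY        # large terminal penalty
    return r
```

In a sketch like this, the relative size of the weights decides whether the trained agent favors tight tracking or conservative spacing, which is one reason the reward design is treated as a separate step.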

In the third step, we evaluate the AI-enabled CPS from two aspects: essential property evaluation and falsification. This step provides a comprehensive evaluation of system functionality, reliability, and reachability. Detailed experimental results and findings are listed in RQ1. Following the property evaluation, we also investigate the effectiveness of existing falsification tools on these AI-enabled CPS, which is the subject of RQ2. Finally, building on this understanding of the controller properties, we take an early step in exploring hybrid control systems, in which we combine AI controllers and traditional controllers to achieve better performance. The hybridization process and results are detailed in RQ3.
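As an illustration of the hybrid control idea, the sketch below shows one simple way to combine two controllers: a rule-based switch that hands control to the traditional controller whenever a condition on the state holds. This is only one possible combination method and is not claimed to be the one used in RQ3; `make_hybrid`, the toy controllers, and the 10 m threshold are hypothetical names and values.

```python
from typing import Callable, Sequence

# A controller maps a state vector to a single control command.
Controller = Callable[[Sequence[float]], float]


def make_hybrid(ai: Controller,
                traditional: Controller,
                use_traditional: Callable[[Sequence[float]], bool]) -> Controller:
    """Build a controller that switches between `ai` and `traditional`."""
    def hybrid(state: Sequence[float]) -> float:
        if use_traditional(state):
            return traditional(state)
        return ai(state)
    return hybrid


# Toy stand-ins (assumptions): state = (gap in meters, relative speed in m/s).
def toy_ai(state: Sequence[float]) -> float:
    return 1.0   # placeholder for a trained DRL policy's output


def toy_traditional(state: Sequence[float]) -> float:
    return -1.0  # placeholder for a conventional controller's output


# Fall back to the traditional controller when the gap is small.
hybrid = make_hybrid(toy_ai, toy_traditional, lambda s: s[0] < 10.0)
```

Calling `hybrid((5.0, 0.0))` routes to the traditional controller, while `hybrid((50.0, 0.0))` routes to the AI controller. The switching predicate is where domain knowledge about each controller's strengths would enter.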