In this section, we discuss the challenges in our experiments and opportunities for future work.
In our work, we deploy AI controllers on 9 CPS to construct AI-enabled CPS, where the design and training of DRL-based AI controllers are non-trivial steps in the overall construction workflow. A well-defined reward function is crucial to accurately describe the controller functionalities and system requirements. That is, the reward function should not only give positive feedback while the agent stays on the right track to minimize errors, but also penalize the agent when specific values exceed their thresholds. A well-designed reward function guides the agent to maximize the expected long-term reward. If a reward function incorporates multiple signals, we must consider the relative magnitudes of the signals and scale their contributions to the overall reward accordingly.
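A minimal sketch of such a reward function is given below. The signal names (a tracking error and a monitored value `v`), the threshold `V_MAX`, and the weights are illustrative assumptions for exposition, not values taken from our benchmark systems.

```python
V_MAX = 10.0                 # illustrative safety threshold on a monitored signal v
W_TRACK, W_SAFE = 1.0, 5.0   # weights scaling the two signals' contributions

def reward(error: float, v: float) -> float:
    # Feedback that increases toward 0 as the tracking error shrinks.
    track_term = -W_TRACK * abs(error)
    # Penalty applied only when the monitored value exceeds its threshold.
    safety_term = -W_SAFE * max(0.0, abs(v) - V_MAX)
    return track_term + safety_term
```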
If we design a reward function taken directly from the system requirements, we obtain a sparse function in which the rewards only take the values 0 and 1 over certain regions. Such a sparse function is easy to specify but hard for agents to learn from, whereas the opposite holds for non-sparse functions. Thus, some strategies are needed to make the problem easier to solve; specifically, we can reshape a sparse reward function into a non-sparse one, although the techniques and methodologies for such shaping remain challenging.
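The following sketch contrasts the two forms, assuming a hypothetical requirement of the shape |error| <= TOL; the shaping shown is only one of several possibilities.

```python
TOL = 0.05   # hypothetical tolerance from a requirement |error| <= TOL

def sparse_reward(error: float) -> float:
    # Directly encodes the requirement: 1 inside the tolerance band, 0 outside.
    return 1.0 if abs(error) <= TOL else 0.0

def shaped_reward(error: float) -> float:
    # One possible reshaping: keep the bonus inside the band, but replace the
    # flat 0 outside with a dense signal that guides the agent toward the band.
    return 1.0 if abs(error) <= TOL else -abs(error)
```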
In addition, from the experimental results on WT and APV, we find that when many system requirements must be satisfied simultaneously, or when a standalone AI controller has multiple outputs, the AI controller is unlikely to deliver uniformly good performance. In such cases, the relative magnitudes of the reward signals must be weighed carefully, and it is necessary to trade off between goals based on the environmental context.
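One way to realize such a trade-off, sketched below under the same illustrative signal names and threshold as above, is to make the weights depend on the operating context; the switching condition and weight values here are assumptions, not settings from our experiments.

```python
V_MAX = 10.0   # same illustrative threshold as above

def context_dependent_reward(error: float, v: float) -> float:
    # Illustrative trade-off: when the monitored value approaches its
    # threshold, the safety goal dominates; otherwise tracking dominates.
    near_limit = abs(v) > 0.8 * V_MAX
    w_track, w_safe = (0.2, 5.0) if near_limit else (1.0, 0.5)
    return -w_track * abs(error) - w_safe * max(0.0, abs(v) - V_MAX)
```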
In RQ2, we apply multiple existing falsification algorithms to the AI-enabled CPS, and we notice that none of them consistently performs well on all systems. Thus, falsification that treats the system as a black box and is guided by logical semantics may not work effectively for AI-enabled CPS. AI controllers have their own specific structure and decision-making logic, which is quite different from their traditional counterparts. Specifically, falsification traces the system outputs to estimate the behaviour of the controller and searches for inputs that lead to violations of the requirements. However, the DNNs inside our DRL controllers do not share the acting strategies of traditional controllers, so the existing falsification algorithms may not remain effective on AI controllers with embedded DNNs.
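For reference, the sketch below shows the black-box scheme these tools share, assuming two hypothetical helpers: `simulate(u)` runs the system under an input signal `u`, and `robustness(trace)` computes the quantitative semantics of a requirement, where a negative value indicates a violation.

```python
import random

def falsify(input_ranges, budget=1000):
    # Minimize the robustness value over sampled input signals; the sampling
    # strategy here is plain random search purely for illustration.
    best_u, best_rob = None, float("inf")
    for _ in range(budget):
        u = [random.uniform(lo, hi) for lo, hi in input_ranges]
        rob = robustness(simulate(u))
        if rob < best_rob:
            best_u, best_rob = u, rob
        if best_rob < 0:        # a counterexample has been found
            return best_u, best_rob
    return best_u, best_rob     # no violation found within the budget
```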
We take an early-stage exploration of hybrid control systems in RQ3, where we combine traditional controllers and AI controllers with relatively fundamental methods. An ideal combination method would take into account the characteristics of the controllers and switch control authority among them based on the environmental conditions. As with the design of reward functions, systematic combination methodologies are necessary to achieve superior performance.
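A fundamental switching scheme of this kind is sketched below with hypothetical components: `pid_control` stands for a conventional feedback law, `ai_control` queries the trained DRL policy, and `in_nominal_region` is an assumed predicate over the system state; none of these names comes from our benchmark.

```python
def hybrid_control(state, error):
    # Give control authority to the AI controller in the nominal operating
    # region, and fall back to the traditional controller elsewhere.
    if in_nominal_region(state):
        return ai_control(state)
    return pid_control(error)
```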
Based on the understanding and challenges gained from our work, we propose several opportunities that may contribute to the development of AI-enabled CPS.
As we establish the first AI-enabled CPS benchmark, we encourage more research efforts on benchmarks and empirical studies in this direction. A large number of industrial-level examples can help guide research directions, reflect industrial standards, and provide in-depth understanding of the characteristics of AI-enabled CPS. Furthermore, systematic methodologies for constructing AI controllers are needed, in which advanced learning algorithms and reward function strategies can help develop AI-enabled CPS.
The existing falsification tools are not fully effective in detecting system requirement violations. This indicates that new methodologies are needed to exploit the structure of the DNN inside AI controllers to achieve better testing performance. Specifically, new testing tools should be capable of opening up the black box of the DNN, extracting effective information from it, and using that information to guide the testing process toward detecting defects.
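One conceivable direction, sketched below, is gradient-guided input search that exploits the differentiability of the controller's DNN. The sketch assumes the policy network is available as a `torch.nn.Module` and that `violation_score(x, a)` is a hypothetical differentiable surrogate that grows as action `a` drives the system toward a requirement violation from state `x`; this is an assumption for illustration, not an existing tool.

```python
import torch

def gradient_guided_search(policy, x0, steps=100, lr=0.01):
    # Ascend the violation surrogate by backpropagating through the DNN,
    # in contrast to black-box falsification, which cannot see inside it.
    x = x0.clone().requires_grad_(True)
    for _ in range(steps):
        a = policy(x)                  # exploit the DNN's structure
        score = violation_score(x, a)  # scalar surrogate of requirement violation
        score.backward()               # gradients flow through the DNN
        with torch.no_grad():
            x += lr * x.grad           # move the input toward a violation
            x.grad.zero_()
    return x.detach()
```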
Moreover, analysis techniques are needed as a further step beyond testing, to help figure out how and why the violations occur from the system perspective. Finally, once the testing and analysis steps reveal where, why, and how a fault arises during operation, we can try to repair the system specifically for the detected defects. The repair process should preserve the intrinsic characteristics of AI controllers while correcting the improper behaviours in specific situations.