In this section, we describe in detail the 5 AI-enabled CPS used in our experiment, introducing the system environment, tasks, controller type, I/O, requirements, and functionality of each. We aim to provide a better understanding of, and insight into, the CPS that we collected, modified, and evaluated.
As an important driving assistance function, adaptive cruise control (ACC) has become widespread in the automotive field after decades of development. The ACC system used in this paper is originally from MathWorks [1]; it aims to maintain a safe distance between an ego car and a lead car by adjusting the acceleration of the ego car. When the relative distance is larger than the safe distance, the ego car instead maintains the driver-set velocity. To this end, the whole system takes the acceleration of the lead car a_lead as input and outputs the velocity and the travelled distance of both the lead car and the ego car.
Traditional Controller
The traditional controller used in this system is model predictive control (MPC). The MPC controller computes an optimal acceleration a_ego for the ego car by solving an optimization problem at each step over a finite time horizon, while ensuring the safe distance between the two cars. Specifically, MPC takes as input the driver-set velocity v_set, the time gap t_gap between the two cars (the driver reaction time), the relative distance d_rel to the lead car, the relative velocity v_rel to the lead car, and the velocity v_ego of the ego car, and outputs the acceleration a_ego of the ego car. The prediction horizon used by MPC is 5 seconds.
DRL Controller
The DRL controller in ACC collects the system environment information to generate observation states. By evaluating the current state and computing a corresponding state value, the DRL controller outputs an acceleration command to the ego car. Unlike MPC, the DRL controller uses a reward function to evaluate the agent's performance from two aspects: velocity and distance. While the safe distance d_safe is maintained, the ego car should approach the cruise velocity v_set; otherwise, it follows the lead car velocity v_lead to avoid a collision. The reward function penalizes the agent for violating the safe distance requirement and rewards the agent based on how close the ego car velocity v_ego is to the target speed.
The figure below shows the Simulink model of ACC with the DRL controller. The Simulink model consists of three parts. The first part represents the lead car dynamics; it takes an acceleration signal from the external environment and computes the actual position and speed of the lead vehicle. The second part contains the DRL control system, where an observation block collects variables from the environment to generate the state information for the agent, a reward function block calculates the reward at the current time step based on the environment information and system requirements, and a termination block stops the simulation at the proper time. The agent block receives the state information, the reward value, and the termination command, and outputs an acceleration command to the cascaded ego car dynamics. The third part is similar to the first, but it represents the dynamics of the ego car. The ego car block takes the command from the controller and outputs the actual position and speed as feedback. The relative distance and velocity between the lead car and the ego car can be measured in this feedback loop.
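The reward logic described above can be sketched as follows. This is only an illustration under stated assumptions, not the reward used in the benchmark: the function name acc_reward, the weight w_vel, and the value of safety_penalty are hypothetical.

```python
def acc_reward(d_rel, d_safe, v_ego, v_set, v_lead,
               w_vel=1.0, safety_penalty=10.0):
    """Hypothetical ACC reward; all weights are illustrative assumptions."""
    # While the safe distance holds, track the cruise velocity v_set;
    # otherwise follow the lead car velocity v_lead.
    safe = d_rel >= d_safe
    v_target = v_set if safe else v_lead
    # Velocity term: closer to zero when v_ego is near the target speed.
    reward = -w_vel * abs(v_ego - v_target)
    # Distance term: fixed penalty when the safe distance is violated.
    if not safe:
        reward -= safety_penalty
    return reward
```

With these assumed weights, an ego car cruising exactly at v_set with a sufficient gap receives the maximum reward of zero, while any safe-distance violation costs at least safety_penalty.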
Evaluation Metrics
S1: Hard Safety. The hard safety of ACC is formulated as follows: during the simulation, the relative distance d_rel between the two cars should never be smaller than the safe distance d_safe.
S1 = □[0,50](d_rel ≥ d_safe)
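On a sampled simulation trace, a requirement of this "always" form can be checked by taking the minimum margin over all samples. The sketch below assumes that d_rel and d_safe are lists sampled at the same time points over [0, 50]; the function name always_safe is hypothetical.

```python
def always_safe(d_rel, d_safe):
    """Check an 'always d_rel >= d_safe' requirement on sampled traces.

    Returns (satisfied, robustness), where robustness is the smallest
    margin d_rel - d_safe seen on the trace (negative means violation).
    """
    margins = [dr - ds for dr, ds in zip(d_rel, d_safe)]
    robustness = min(margins)
    return robustness >= 0, robustness


# Illustrative values only: the third sample violates the safe distance.
print(always_safe([40.0, 35.0, 31.0], [30.0, 30.5, 31.5]))
```

The minimum margin also indicates how close a passing trace came to a violation, which a plain true/false check would hide.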
S2: Soft Safety. The soft safety of ACC is defined from two aspects: 1) MAE: the mean value of the average absolute error by which the ego speed v_ego exceeds the set speed v_set. 2) MAXERR: the mean value of the maximum absolute error by which the ego speed v_ego exceeds the set speed v_set.
S2 = □[0, 50](v_ego ≤ v_set)
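One possible reading of MAE and MAXERR is sketched below. It assumes that only overshoot above v_set counts as error and that the outer mean is taken over several simulation runs; both points are interpretations rather than definitions given in the text, and the name soft_safety is hypothetical.

```python
def soft_safety(runs, v_set):
    """runs: list of sampled v_ego traces, one list per simulation run.

    Returns (MAE, MAXERR): the mean over runs of the average and of the
    maximum amount by which v_ego exceeds v_set.
    """
    avg_errs, max_errs = [], []
    for v_ego in runs:
        # Only speed above v_set is counted as an error (assumption).
        excess = [max(v - v_set, 0.0) for v in v_ego]
        avg_errs.append(sum(excess) / len(excess))
        max_errs.append(max(excess))
    mae = sum(avg_errs) / len(avg_errs)
    maxerr = sum(max_errs) / len(max_errs)
    return mae, maxerr
```

For example, two runs with per-sample excess [0, 1, 2] and [0, 0, 4] give per-run averages 1 and 4/3 and per-run maxima 2 and 4, hence MAE = 7/6 and MAXERR = 3.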
S3: Steady State. S3 measures the stability of the system. If the system keeps the relative distance d_rel no smaller than the sum of the safe distance d_safe and a tolerance value μ for 40 consecutive secs within the 50-sec simulation, it is regarded as satisfying S3. Here we set the tolerance value μ to 0.2.
S3 = ◇[0, 50](□[0, 40](d_rel ≥ d_safe + 0.2))
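A requirement of this "eventually always" shape can be checked on a uniformly sampled trace by searching for a sufficiently long run of consecutive satisfying samples. The sketch assumes a fixed sampling step dt and requires the window to lie entirely inside the trace; this finite-trace convention and the name eventually_always are assumptions.

```python
def eventually_always(ok, dt, window):
    """ok: list of booleans, the inner condition sampled every dt seconds.

    Returns True if the condition holds at every sample of some time
    window of length `window` seconds inside the trace.
    """
    needed = int(round(window / dt)) + 1  # samples spanning the window
    run = 0
    for flag in ok:
        run = run + 1 if flag else 0  # length of the current True run
        if run >= needed:
            return True
    return False
```

For S3, ok would hold the value of d_rel ≥ d_safe + 0.2 at each sample, and window would be 40.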
S4: Resilience. This requires that the system quickly return to the steady-state condition of S3 after violating it. Specifically, if the system cannot recover within 1 second, we say that it violates S4.
S4 = □[0, 50]((d_rel < d_safe + 0.2) → ◇[0, 1](d_rel ≥ d_safe + 0.2))
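The recovery requirement can be checked sample by sample: whenever the steady-state condition fails, some sample within the next second must satisfy it again. The sketch assumes a uniform sampling step dt and treats a violation near the end of the trace with no recovery sample as a failure; both choices, and the name resilient, are assumptions.

```python
def resilient(ok, dt, recovery=1.0):
    """ok: list of booleans, the steady-state condition sampled every dt s.

    Returns True if every violating sample is followed by a satisfying
    sample within `recovery` seconds.
    """
    steps = int(round(recovery / dt))
    for k, flag in enumerate(ok):
        # Look at the violating sample and the next `steps` samples.
        if not flag and not any(ok[k:k + steps + 1]):
            return False
    return True
```

For S4, ok would again hold the value of d_rel ≥ d_safe + 0.2 at each sample.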
S5: Liveness. If the ego car remains motionless, the previous four evaluation metrics will not be violated; however, this is contrary to the needs of drivers and passengers. The velocity of the ego car v_ego should therefore be positive. Here we set the threshold to 1.
S5 = □[0, 50](v_ego ≥ 1)
With increasing attention paid to automobile safety, the automobile industry has invested heavily in the research and development of lane-keeping assist (LKA), an important auxiliary function for improving driving safety. The LKA system in this benchmark is collected from MathWorks [2]. A vehicle equipped with an LKA system has a sensor, such as a camera, that measures the lateral deviation and relative yaw angle between the centerline of a lane and the vehicle. The sensor also measures the current lane curvature and curvature derivative. Depending on the curve length that the sensor can view, the curvature in front of the car can be calculated from the current curvature and curvature derivative. The LKA system keeps the car traveling along the centerline of the lane by adjusting the front steering angle of the ego car. The whole system takes the longitudinal velocity Vx, the initial lateral deviation e1_initial, and the initial yaw angle e2_initial as inputs and outputs the real-time lateral deviation of the ego car error_1 in meters and the relative yaw angle error_2 (the angle of the longitudinal axis from the centerline of the lane) in radians. The goal of lane-keeping control is to drive both the lateral deviation error_1 and the relative yaw angle error_2 close to zero.
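As an illustration of how the curvature ahead of the car might be obtained from the current curvature and its derivative, the sketch below uses a first-order extrapolation along the viewing distance. The linear model, the assumption that the derivative is taken with respect to distance, and the names preview_curvature, lookahead, and n_points are all hypothetical and not taken from the benchmark.

```python
def preview_curvature(kappa, dkappa, lookahead, n_points):
    """First-order preview: kappa(s) = kappa + dkappa * s.

    kappa:     current lane curvature (1/m)
    dkappa:    curvature derivative with respect to distance (assumed)
    lookahead: curve length the sensor can view (m)
    n_points:  number of preview points (at least 2)
    """
    step = lookahead / (n_points - 1)
    return [kappa + dkappa * (i * step) for i in range(n_points)]
```

A longer viewing distance simply extends the same extrapolation further ahead of the car.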
Traditional Controller
The traditional controller used in LKA is model predictive control (MPC). The MPC controller computes the optimal steering angle from the curvature, the longitudinal velocity Vx, the lateral deviation error_1, and the relative yaw angle error_2. The prediction horizon used by MPC is 1 second.
DRL Controller
The DRL controllers in LKA take the current errors in lateral deviation and yaw angle, together with their derivatives, to generate the observation states. The reward function guides the agent to minimize the errors as quickly as possible. The error in lateral deviation dominates the reward, since it is the key parameter that determines whether the vehicle has drifted into the adjacent lane. A termination logic stops the simulation if the errors exceed their thresholds, so that agents avoid wasting time and computational resources on meaningless states.
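A reward in which the lateral deviation dominates, combined with a threshold-based termination flag, could look like the following sketch. The quadratic form, the weights, the thresholds, and the name lka_step are hypothetical; the benchmark's actual values are not given here.

```python
def lka_step(e1, e2, w_lat=10.0, w_yaw=1.0, e1_max=1.0, e2_max=0.5):
    """Hypothetical LKA reward and termination flag.

    e1: lateral deviation (m), e2: relative yaw angle (rad).
    """
    # The lateral term carries the larger weight, so it dominates.
    reward = -(w_lat * e1 ** 2 + w_yaw * e2 ** 2)
    # Stop the episode once either error leaves its allowed range.
    done = abs(e1) > e1_max or abs(e2) > e2_max
    return reward, done
```

With these assumed weights, a lateral error of 0.1 is penalized ten times more than a yaw error of 0.1.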
Evaluation Metrics
S1: Hard Safety. The hard safety of LKA requires the lateral deviation error_1 to stay within 0.85 m over the time interval [0, 15].
S1 = □[0, 15](|error_1| ≤ 0.85)
S2: Soft Safety. The soft safety of LKA includes two aspects: 1) MAE: the mean value of the average absolute error of the lateral deviation error_1 relative to zero. 2) MAXERR: the mean value of the maximum absolute error of the relative yaw angle error_2 relative to zero.
S2 = □[0, 15]((|error_1| = 0) ∧ (|error_2| = 0))
S3: Steady State. The steady state of LKA is designed to evaluate the stability of the system. Concretely, if the absolute value of the lateral deviation error_1 does not stay within a predefined value μ for 10 consecutive secs within the 15-sec simulation, the system is considered to violate S3. Here we set the predefined value μ to 0.5.
S3 = ◇[0, 15](□[0, 10](|error_1| ≤ 0.5))
S4: Resilience. Once the steady state is violated, how quickly the system returns to it becomes a necessary metric for evaluating the resilience of the system. The time allowed to return to the steady state is set to 1 second.
S4 = □[0, 15]((|error_1| > 0.5) → ◇[0, 1](|error_1| ≤ 0.5))
S5: Liveness. The car should travel along the centerline of the lane and move forward at a default speed, so the velocity of the car v should be positive throughout the whole process. Here we set the predefined threshold to 1.
S5 = □[0, 15](|v| ≥ 1)
As a representative assisted-driving system, the automatic parking system has enormous potential in the global automobile market. The system used here, referred to as APV, is collected from MathWorks [3]. It shows how to generate a reference trajectory and track that trajectory for a parking valet using nonlinear MPC. The parking garage in this example contains a vehicle and eight static obstacles: six parked vehicles, a reserved parking area, and the garage border. The system takes the initial position of the ego vehicle (pos_x, pos_y) as input and outputs the speed v and the steering angle angle of the ego vehicle in real time. The goal of the ego vehicle is to park at a target pose without colliding with any of the obstacles. The reference point of the ego pose is located at the center of the rear axle.
Traditional Controller
As in traditional linear MPC, nonlinear MPC calculates control actions at each control interval using a combination of model-based prediction and constrained optimization. Unlike linear MPC, the prediction model can be nonlinear with time-varying parameters, and the equality and inequality constraints can also be nonlinear. The traditional controller takes the current position, the reference position, the current velocity cur_v, and the current steering angle cur_angle as inputs and generates the speed v and the steering angle angle in real time.
DRL Controller
In APV, the DRL controllers need to output the speed and steering commands to the ego vehicle to keep it moving along the reference trajectory. The observation generation block takes the current and reference positions of the vehicle and the current speed and steering angle to form the state information. The reward function evaluates the vehicle performance based on the deviation in the x and y positions and the yaw angle. Similar to LKA, the position errors carry more weight when computing the reward at each time step. The DRL controllers aim to minimize the position and yaw angle errors during the entire simulation, so as to park the ego car at the target spot without any collisions.
Evaluation Metrics
S1: Hard Safety. The hard safety of APV is to maintain the lateral deviation from the reference trajectory error_1 within 1 m over the time interval [0, 12].
S1 = □[0, 12](|error_1| ≤ 1)
S2: Soft Safety. In an optimal situation, the lateral deviation from the reference trajectory error_1 and the yaw angle error error_2 should always be zero. As in LKA, we measure both MAE and MAXERR to evaluate the performance of the different controllers.
S2 = □[0, 12]((|error_1| = 0) ∧ (|error_2| = 0))
S3: Steady State. S3 stipulates that APV keeps the absolute value of error_1 no greater than 0.5 for 10 consecutive seconds. If it fails to do so, S3 is violated.
S3 = ◇[0, 12](□[0, 10](|error_1| ≤ 0.5))
S4: Resilience. Once S3 is not satisfied, S4 measures the ability of the system to return to the steady state. Here the recovery time is set to 1 second.
S4 = □[0, 12]((|error_1| > 0.5) → ◇[0, 1](|error_1| ≤ 0.5))
S5: Liveness. Similarly to LKA, the velocity of the ego vehicle v needs to be positive. The minimum value used is 0.1.
S5 = □[0, 12](|v| ≥ 0.1)
AFC is a complex air-fuel control system released by Toyota [6]. The whole system takes two input signals from the outside environment, PedalAngle and EngineSpeed, and outputs μ = |AF - AF_ref|/AF_ref, which is the deviation of the air-to-fuel ratio AF from a reference value AF_ref. As PedalAngle and EngineSpeed change, the fuel controller should adjust the intake gas rate to the cylinder to maintain the optimal air-to-fuel ratio. The goal of this system is to keep the deviation μ no greater than a predefined threshold.
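The output μ can be computed directly from its definition. The sketch below uses AF_ref = 14.7, the constant stated for the hard safety requirement of AFC; the function name af_deviation is hypothetical.

```python
def af_deviation(af, af_ref=14.7):
    """Relative deviation of the air-to-fuel ratio from its reference."""
    return abs(af - af_ref) / af_ref


# An air-to-fuel ratio 10% above the reference gives mu of about 0.1.
mu = af_deviation(16.17)
print(round(mu, 3), mu <= 0.2)
```

Because μ is normalized by AF_ref, a threshold such as 0.2 corresponds to a 20% deviation in either direction.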
Traditional Controller
The original control system consists of two parts: (1) a PI controller, and (2) a feed-forward controller. The former regulates the air-to-fuel ratio AF in a closed loop, using the measured AF to compute the fuel command. The latter estimates the rate of airflow into the cylinder by measuring the inlet air mass flow rate.
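For readers unfamiliar with PI control, a generic discrete-time PI loop is sketched below. It is not the controller of the benchmark: the class name, the gains kp and ki, and the sampling step dt are hypothetical, and the feed-forward part is omitted.

```python
class PIController:
    """Generic discrete-time PI controller (illustrative only)."""

    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0  # accumulated error over time

    def update(self, reference, measurement):
        error = reference - measurement
        self.integral += error * self.dt
        # Proportional term reacts to the current error, integral term
        # removes the remaining steady-state offset.
        return self.kp * error + self.ki * self.integral
```

In a closed loop, update would be called once per control step with the reference AF_ref and the measured AF, and its output would be used as a correction to the fuel command.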
DRL Controller
The DRL controller in AFC gathers information about the engine dynamics and outputs a fuel command to achieve the reference AF ratio. A termination unit is embedded in the training process to prevent the agent from exploring pointless states. The controller uses a reward function to guide the agent to reduce the deviation μ. Specifically, a positive reward is given based on how small μ is, and negative feedback is generated if μ exceeds a certain threshold. A small penalty based on the DRL action value from the last time step is added to obtain a stable control output.
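The three ingredients of this reward can be sketched as follows. The linear shape of the positive term, the threshold mu_max, the penalty values, and the name afc_reward are hypothetical and chosen only for illustration.

```python
def afc_reward(mu, last_action, mu_max=0.2, w_action=0.01, penalty=10.0):
    """Hypothetical AFC reward; all constants are assumptions."""
    if mu <= mu_max:
        # Positive reward that grows as the deviation mu shrinks.
        reward = 1.0 - mu / mu_max
    else:
        # Negative feedback once mu exceeds the threshold.
        reward = -penalty
    # Small penalty on the previous action to favor a stable output.
    reward -= w_action * last_action ** 2
    return reward
```

The action penalty is kept small relative to the other terms so that it smooths the control output without overriding the goal of reducing μ.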
Evaluation Metrics
S1: Hard Safety. The formula of S1 is as follows, where μ, the deviation of the air-to-fuel ratio AF from a reference value AF_ref, should never be greater than 0.2. Here AF_ref is a constant 14.7.
S1 = □[0, 30](μ ≤ 0.2)
S2: Soft Safety. The soft safety of AFC gives a further evaluation of the average and peak performance of the controllers with MAE and MAXERR, respectively.
S2 = □[0, 30](μ = 0)
S3: Steady State. S3 is a key indicator of the behavior of AFC for most of the time: μ should be no greater than 0.1 for 20 consecutive secs within the 30-sec simulation.
S3 = ◇[0, 30](□[0, 20](μ ≤ 0.1))
S4: Resilience. S4 is formulated as follows, where the system should return to the steady state within 1 sec once S3 is violated.
S4 = □[0, 30]((μ > 0.1) → ◇[0, 1](μ ≤ 0.1))
S5: Liveness. N/A
[1] MathWorks. 2021. Adaptive Cruise Control System Using Model Predictive Control. https://www.mathworks.com/help/mpc/ug/adaptive-cruise-control-using-model-predictive-controller.html
[2] MathWorks. 2021. Lane Keeping Assist System Using Model Predictive Control. https://www.mathworks.com/help/mpc/ug/lane-keeping-assist-system-using-model-predictive-control.html
[3] MathWorks. 2021. Parking Valet Using Nonlinear Model Predictive Control. https://www.mathworks.com/help/mpc/ug/parking-valet-using-nonlinear-model-predictive-control.html
[4] MathWorks. 2021. Nonlinear Model Predictive Control of an Exothermic Chemical Reactor. https://www.mathworks.com/help/mpc/ug/nonlinear-model-predictive-control-of-exothermic-chemical-reactor.html
[5] MathWorks. 2021. Land a Rocket Using Multistage Nonlinear MPC. https://www.mathworks.com/help/mpc/ug/landing-rocket-with-mpc-example.html
[6] X. Jin, J. V. Deshmukh, J. Kapinski, et al. 2014. Powertrain control verification benchmark. In Proceedings of the 17th International Conference on Hybrid Systems: Computation and Control. 253–262.
[7] Simone Schuler, Fabiano Daher Adegas, and Adolfo Anta. 2017. Hybrid modelling of a wind turbine. In Goran Frehse and Matthias Althoff, editors, ARCH16. 3rd International Workshop on Applied Verification for Continuous and Hybrid Systems, volume 43 of EPiC Series in Computing. EasyChair, 18–26.
[8] S. Yaghoubi and G. Fainekos. 2019. Gray-box adversarial testing for control systems with machine learning components. In Proceedings of the 22nd ACM International Conference on Hybrid Systems: Computation and Control. 179–184.
[9] MathWorks. 2021. watertank Simulink Model. https://www.mathworks.com/help/slcontrol/gs/watertank-simulink-model.html