In the crashes automatically found by MDPFuzz, the RL model-controlled vehicles collide with buildings or other vehicles. The initial states contain no collisions, the model-controlled vehicle's initial speed is 0, and all initial states have passed CARLA's validation.
Figure 1: RL, CARLA case 1.
Figure 2: RL, CARLA case 2.
Figure 3: RL, CARLA case 3.
Figure 4: RL, CARLA normal case (no crash).
In the crashes automatically found by MDPFuzz, the IL model-controlled vehicles collide with buildings or other vehicles. The initial states contain no collisions, the model-controlled vehicle's initial speed is 0, and all initial states have passed CARLA's validation.
Figure 9: IL, CARLA case 1.
Figure 10: IL, CARLA case 2.
Figure 11: IL, CARLA case 3.
Figure 12: IL, CARLA normal case (no crash).
In the figures below, the agents are purple circles and the landmarks are black circles. The goal of the MARL-controlled agents is to reach the landmarks without colliding with each other. In the crashes automatically found by MDPFuzz, the MARL model-controlled agents (the purple circles) collide with each other. The initial states contain no collisions, and the initial speeds of all model-controlled agents are 0.
Figure 13: MARL, Coop Navi case 1.
Figure 14: MARL, Coop Navi case 2.
Figure 15: MARL, Coop Navi case 3.
Figure 16: MARL, Coop Navi normal case.
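The crash condition described above (two circular agents touching) can be sketched as a pairwise distance check. This is only an illustrative sketch: the function name `colliding_pairs`, the shared `radius` parameter, and the 2D tuple representation of positions are assumptions, not taken from MDPFuzz or the Coop Navi environment.

```python
import math
from itertools import combinations


def colliding_pairs(positions, radius):
    """Return index pairs (i, j) of agents whose circles overlap.

    positions: list of (x, y) agent centers (hypothetical representation).
    radius:    common agent radius (assumed identical for all agents).
    """
    pairs = []
    for (i, p), (j, q) in combinations(enumerate(positions), 2):
        # Two circles of equal radius overlap when their centers are
        # closer than twice the radius.
        if math.dist(p, q) < 2 * radius:
            pairs.append((i, j))
    return pairs


# An initial state would be acceptable under this check only if no pair collides.
initial = [(0.0, 0.0), (1.0, 1.0), (-1.0, 0.5)]
print(colliding_pairs(initial, radius=0.15))
```

An empty result for the initial state corresponds to the "no collisions in the initial states" condition; a non-empty result at a later step would count as a crash.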
In the crashes automatically found by MDPFuzz, the RL model-controlled BipedalWalker falls (its head touches the ground). The terrain begins with multiple frames of flat ground so that the model-controlled agent does not fall at the start. There are also multiple frames of flat ground between any two hurdles, so the agent can pass the obstacles by taking optimal actions.
Figure 17: RL for BipedalWalker case 1.
Figure 18: RL for BipedalWalker case 2.
Figure 19: RL for BipedalWalker case 3.
Figure 20: RL for BipedalWalker normal case.
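The two terrain constraints described above (flat ground at the start, and flat ground between any two hurdles) can be sketched as a simple validity check on a ground-type sequence. The encoding (`FLAT = 0`, any other value is a hurdle), the function name `is_valid_ground`, and the thresholds `lead_flat` and `gap_flat` are hypothetical; the actual values used by MDPFuzz are not stated here.

```python
FLAT = 0  # hypothetical encoding: 0 is flat ground, anything else is a hurdle


def is_valid_ground(seq, lead_flat=3, gap_flat=3):
    """Check a ground-type sequence against the two constraints.

    lead_flat: required number of flat frames at the start (assumed value).
    gap_flat:  required number of flat frames between two hurdles (assumed value).
    """
    # Constraint 1: the sequence must start with enough flat frames.
    if len(seq) < lead_flat or any(g != FLAT for g in seq[:lead_flat]):
        return False

    # Constraint 2: consecutive hurdles must be separated by enough flat frames.
    last_hurdle = None
    for i, g in enumerate(seq):
        if g == FLAT:
            continue
        if last_hurdle is not None and i - last_hurdle - 1 < gap_flat:
            return False
        last_hurdle = i
    return True


print(is_valid_ground([0, 0, 0, 1, 0, 0, 0, 1]))  # hurdles separated by 3 flat frames
print(is_valid_ground([0, 0, 0, 1, 0, 1]))        # hurdles separated by 1 flat frame
```

A fuzzer that mutates ground sequences could use such a check to discard mutated initial states in which a fall would be unavoidable.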
In ACAS Xu, the DNN model-controlled airplane collides with the intruder airplane in the crashes automatically found by MDPFuzz. The initial states contain no collisions, and the collision can be avoided by taking optimal actions. We show that the crashes triggered in the models before repair are avoided by the models after repair. (See Section 7.4 in our paper for details on model repair.)
Figure 13 (a): DNN, before repair, ACAS Xu case 1.
Figure 13 (b): DNN, after repair, ACAS Xu case 1.
Figure 14 (a): DNN, before repair, ACAS Xu case 2.
Figure 14 (b): DNN, after repair, ACAS Xu case 2.
Figure 15 (a): DNN, before repair, ACAS Xu case 3.
Figure 15 (b): DNN, after repair, ACAS Xu case 3.
To make the effect of AEs easier to observe, we replace five consecutive frames with AEs generated by DNN testing. The agent turns right on these generated inputs.
Figure 21: Normal driving situation.
Figure 22: Inconsistent behaviors found by DNN testing.
Aligned with Fig. 8 in our paper, we present below the state sequence coverage visualizations for the various models solving MDPs. These results are obtained by running MDPFuzz for one hour, without and with state sequence density guidance. To compare the distributions fairly, we project the X and Y axes to the same range.
From the results below, we observe that the state sequence distribution becomes wider under the guidance of state sequence density, indicating that MDPFuzz can efficiently cover diverse state sequences with state sequence density guidance.
Figure 23: State sequence coverage of fuzzing RL model for CARLA.
Figure 25: State sequence coverage of fuzzing IL model for CARLA.
Figure 27: State sequence coverage of fuzzing MARL model for Coop Navi.
Figure 24: State sequence coverage of fuzzing ACAS Xu.
Figure 26: State sequence coverage of fuzzing RL model for BipedalWalker.
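Projecting the X and Y coordinates of several plots to the same range, as mentioned above, can be done in more than one way. The sketch below shows one plausible reading: a joint min-max rescaling computed over all point sets together, so that a wider distribution still appears wider after scaling. The function name `project_to_common_range` and the target range `[0, 1]` are assumptions and are not taken from the MDPFuzz code.

```python
def project_to_common_range(*point_sets, lo=0.0, hi=1.0):
    """Rescale several 2D point sets into one shared [lo, hi] range per axis.

    The minimum and maximum are computed jointly over all sets, so relative
    spread between the sets is preserved (an assumed design choice).
    """
    xs = [x for pts in point_sets for x, _ in pts]
    ys = [y for pts in point_sets for _, y in pts]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    def scale(v, v_min, v_max):
        # Degenerate axis (all values equal): map everything to lo.
        if v_max == v_min:
            return lo
        return lo + (v - v_min) * (hi - lo) / (v_max - v_min)

    return [
        [(scale(x, x_min, x_max), scale(y, y_min, y_max)) for x, y in pts]
        for pts in point_sets
    ]


# Hypothetical 2D embeddings of state sequences, without and with guidance.
without_guidance = [(0.0, 0.0), (2.0, 4.0)]
with_guidance = [(4.0, 8.0)]
scaled_without, scaled_with = project_to_common_range(without_guidance, with_guidance)
print(scaled_without, scaled_with)
```

Scaling each plot independently would instead stretch every distribution to fill its own axes, which would hide the difference in spread that the comparison is meant to show.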