Model Reliability

To further investigate the reliability of migrated/quantized models, beyond the publicly available MNIST and CIFAR-10 test sets, we combine the existing tools TensorFuzz [1] and DeepTest [2] to generate large-scale testing data that captures the differential behaviors between the PC model and the migrated/quantized model. We create 25,000 testing inputs each for LeNet-1 and LeNet-5, and 28,000 each for ResNet-20 and VGG-16, yielding 106,000 generated testing inputs in total for the four DL models used on both the mobile and web platforms. A minimal sketch of the underlying differential-testing idea is shown below.
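
The following is a minimal sketch of that differential-testing step, not the exact TensorFuzz/DeepTest pipeline used to build the released dataset: it mutates MNIST seed images with simple DeepTest-style photometric transformations and keeps the inputs on which the PC (Keras) model and the quantized TFLite model disagree. The file names (`lenet5_pc.h5`, `lenet5_quantized.tflite`), the mutation set, and the assumption of a float32-input quantized model are all illustrative.

```python
# Minimal sketch of differential testing between a PC (Keras) model and its
# quantized TFLite counterpart. Paths, the mutation set, and the seed count are
# illustrative assumptions, not the exact TensorFuzz/DeepTest configuration.
import numpy as np
import tensorflow as tf

def tflite_predict(interpreter, x):
    """Run one batch through a TFLite interpreter (assumes a float32-input model)."""
    in_det = interpreter.get_input_details()[0]
    out_det = interpreter.get_output_details()[0]
    interpreter.set_tensor(in_det["index"], x.astype(in_det["dtype"]))
    interpreter.invoke()
    return interpreter.get_tensor(out_det["index"])

def find_differential_inputs(pc_model, interpreter, seeds, n_mutants=5):
    """Apply simple DeepTest-style photometric mutations to each seed image and
    collect the mutants on which the two models predict different labels."""
    diffs = []
    for seed in seeds:
        for _ in range(n_mutants):
            alpha = np.random.uniform(0.8, 1.2)   # contrast change
            beta = np.random.uniform(-0.1, 0.1)   # brightness shift
            mutant = np.clip(alpha * seed + beta, 0.0, 1.0).astype(np.float32)
            x = mutant[np.newaxis, ...]
            pc_label = int(np.argmax(pc_model.predict(x, verbose=0)))
            q_label = int(np.argmax(tflite_predict(interpreter, x)))
            if pc_label != q_label:               # differential behavior detected
                diffs.append((mutant, pc_label, q_label))
    return diffs

if __name__ == "__main__":
    # Hypothetical file names; substitute the released PC and quantized models.
    pc_model = tf.keras.models.load_model("lenet5_pc.h5")
    interpreter = tf.lite.Interpreter(model_path="lenet5_quantized.tflite")
    interpreter.allocate_tensors()
    _, (x_test, _) = tf.keras.datasets.mnist.load_data()
    seeds = (x_test[:100].astype(np.float32) / 255.0)[..., np.newaxis]  # (100, 28, 28, 1)
    diffs = find_differential_inputs(pc_model, interpreter, seeds)
    print(f"Collected {len(diffs)} difference-inducing inputs")
```

The full generation process additionally combines TensorFuzz's coverage-guided fuzzing with a richer set of DeepTest transformations; the sketch above only illustrates how disagreement-inducing inputs are collected.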

The entire generated dataset for all four models can be downloaded from Generated Data for Four Models.

Examples of generated images for each model are as follows:

  • LeNet-1
  • LeNet-5
  • ResNet-20
  • VGG-16

References

  1. Augustus Odena and Ian Goodfellow. 2018. TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing. arXiv preprint arXiv:1807.10875 (2018).
  2. Yuchi Tian, Kexin Pei, Suman Jana, and Baishakhi Ray. 2018. DeepTest: Automated Testing of Deep-Neural-Network-Driven Autonomous Cars. In Proceedings of the 40th International Conference on Software Engineering. ACM, 303–314.