To further investigate the reliability of migrated/quantized models, we go beyond the publicly available test sets of MNIST and CIFAR-10: we combine the existing tools TensorFuzz [1] and DeepTest [2] to generate large-scale test data that captures the differential behaviors between the PC model and the migrated/quantized model. We create 25,000 test inputs for each of the LeNet-1 and LeNet-5 models, and 28,000 test inputs for each of ResNet-20 and VGG-16. In total, this gives 106,000 generated test inputs (2 × 25,000 + 2 × 28,000) for the four DL models used on both mobile and web platforms.
The entire generated dataset for all four models can be downloaded from Generated Data for Four Models.
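As a rough illustration of what "differential behavior" means here, the sketch below flags inputs on which two models predict different labels. It is not the TensorFuzz/DeepTest generation pipeline itself, only a minimal check under assumptions: the function name `differential_indices` and the toy output arrays are hypothetical, and each model is assumed to return one row of class scores per input.

```python
import numpy as np


def differential_indices(pc_scores, migrated_scores):
    """Return indices of inputs whose predicted label differs between two models.

    Both arguments are assumed to be arrays of shape (num_inputs, num_classes).
    The name and signature are hypothetical, chosen for this illustration only.
    """
    pc_labels = np.argmax(pc_scores, axis=1)
    migrated_labels = np.argmax(migrated_scores, axis=1)
    return np.flatnonzero(pc_labels != migrated_labels)


# Made-up class scores for three inputs and three classes.
pc_out = np.array([[0.90, 0.05, 0.05],
                   [0.20, 0.70, 0.10],
                   [0.40, 0.35, 0.25]])
migrated_out = np.array([[0.80, 0.10, 0.10],
                         [0.20, 0.60, 0.20],
                         [0.30, 0.45, 0.25]])

diff = differential_indices(pc_out, migrated_out)
print(diff.tolist())  # indices of inputs where the two models disagree
```

In this toy case only the third input changes its predicted class after migration, so it is the only one that would count as a differential input.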
Examples of generated images for each model are as follows: