According to Definitions 1-3, we calculate the fitness value of each generated malicious app. The fitness value guides feature selection in the next generation. We then inspect all apps in the new generation: if no new feature combination has been produced, the evolution process has converged and is terminated (line 21 of Algorithm 1). Finally, we obtain the collection of newly generated malware, which serves as the benchmark for auditing AMTs.
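
To make the termination criterion concrete, the loop described above can be sketched as follows. This is a minimal illustration, not a reproduction of Algorithm 1: the function name `evolve`, the representation of an app as a set of feature names, the crossover and mutation operators, and the "fitter half become parents" selection rule are all assumptions made for the example. The `fitness` callable stands in for the fitness value of Definitions 1-3, which is not reproduced here.

```python
import random
from typing import Callable, FrozenSet, Iterable, List, Set

# Hypothetical representation: an app is the set of features it carries.
FeatureSet = FrozenSet[str]


def evolve(
    initial: Iterable[FeatureSet],
    fitness: Callable[[FeatureSet], float],  # placeholder for Definitions 1-3
    all_features: Iterable[str],
    max_generations: int = 100,
    seed: int = 0,
) -> Set[FeatureSet]:
    """Evolve feature combinations until a generation yields nothing new."""
    rng = random.Random(seed)
    pool = sorted(all_features)
    population: List[FeatureSet] = list(initial)
    seen: Set[FeatureSet] = set(population)

    for _ in range(max_generations):
        # Fitness guides selection: the fitter half become parents (assumed rule).
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(1, len(ranked) // 2)]

        offspring: List[FeatureSet] = []
        for _ in range(len(population)):
            a, b = rng.choice(parents), rng.choice(parents)
            # Crossover: keep each parental feature with probability 0.5.
            child = {f for f in sorted(a | b) if rng.random() < 0.5}
            # Mutation: toggle one feature drawn from the whole pool.
            child ^= {rng.choice(pool)}
            if child:  # skip apps that would carry no feature at all
                offspring.append(frozenset(child))

        # Convergence check: stop when no new feature combination appears.
        new = [c for c in offspring if c not in seen]
        if not new:
            break
        seen.update(new)
        population = offspring

    # Every combination produced during the run forms the benchmark set.
    return seen
```

A call might look like `evolve([frozenset({"SMS"}), frozenset({"NET", "CAM"})], fitness=len, all_features=["SMS", "NET", "CAM"])`, where `len` is only a stand-in fitness and the feature names are invented for illustration. Because the feature pool is finite, the check on `new` must eventually succeed; `max_generations` is an extra safeguard that the original description does not mention.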