We used the following two datasets in the paper:
Benchmark (BM) dataset
It contains 5,560 malware apps from Drebin and 5,000 benign apps from Google Play.
The Drebin dataset can be downloaded from: https://www.sec.cs.tu-bs.de/~danarp/drebin/index.html
In-the-wild (ITW) dataset
It contains 42,910 malware apps and 44,347 benign apps from the AndroZoo dataset.
The AndroZoo dataset can be downloaded from: https://androzoo.uni.lu/
The names of the malware and benign apps (i.e., MD5 hashes) are available in the following two files:
Malware app names: https://drive.google.com/open?id=0B5UPYObHDu7iMFBZM3ZNWHFvZTA
Benign app names: https://drive.google.com/open?id=0B5UPYObHDu7ibVk3SjJVZk1peHc
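
Once the name lists and the APKs have been downloaded, the apps used in the paper can be picked out by matching MD5 hashes. The following is a minimal sketch, not part of the original tooling. It assumes each list file is plain text with one MD5 hash per line (the actual file format may differ) and that the APKs sit in a single local directory. The function names `load_hashes`, `md5_of_file`, and `select_apps` are hypothetical.

```python
import hashlib
from pathlib import Path


def load_hashes(list_path):
    """Read MD5 hashes from a text file.

    Assumption: one hash per line; blank lines are skipped and
    hashes are lowercased so comparisons are case-insensitive.
    """
    hashes = set()
    for line in Path(list_path).read_text().splitlines():
        token = line.strip().lower()
        if token:
            hashes.add(token)
    return hashes


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def select_apps(apk_dir, wanted_hashes):
    """Return the sorted paths of files in apk_dir whose MD5 is in wanted_hashes."""
    return sorted(
        p
        for p in Path(apk_dir).iterdir()
        if p.is_file() and md5_of_file(p) in wanted_hashes
    )
```

For example, `select_apps("apks/", load_hashes("malware_names.txt"))` would return the local APK files whose hashes appear in the malware list. Both paths here are placeholders.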