ANDetect
Content
The source code
Executable tool
Experiment - 1: Performance of ANDetect
Experiment - 2: Robustness in encrypted AP detection
Experiment - 3: Novel ad libs and adware
The artifacts list
1. The source code.
2. Executable tool --- ANDetect.jar, resources, demo_apks.
3. Experiments:
(1) Dataset (corresponding to Section 4.1):
AD_Network_info.csv (162 AN platforms) --- The dataset of ad networks, including each SDK's name and a link to its developer documentation.
adware.csv ($AP_{adware}*$), appchina.csv and googleplay.csv ($AP_{truth}*$) --- The low-confidence labeled datasets used in the experiments.
(2) Experiment-1 (corresponding to Section 4.3):
adware_mix_final.csv (mix of $AP_{adware}^{100}$ and $AP_{adware}^{200}$) --- The high-confidence dataset of 100 encrypted APKs and 200 non-encrypted APKs in the experimental environment.
truth_mix_final.csv (mix of $AP_{truth}^{100}$ and $AP_{truth}^{200}$) --- The high-confidence dataset of 100 encrypted APKs and 200 non-encrypted APKs in the real-world environment.
exp1_adw_final.csv --- The evaluation results of ANDetect, LibD, LibRadar and LibScout derived from adware_mix_final.csv.
exp1_truth_final.csv --- The evaluation results of ANDetect, LibD, LibRadar and LibScout derived from truth_mix_final.csv.
compute_result.py --- Run this script to generate exp1_adw_final.csv and exp1_truth_final.csv.
(3) Experiment-2 (corresponding to Section 4.4):
exp2_table3_new.png --- The corrected results of encrypted application detection on resource-obfuscated applications.
robust_adware.csv, robust_truth.csv --- The 100 non-encrypted applications randomly selected from each of $AP_{adware}*$ and $AP_{truth}*$ that perform resource obfuscation.
(4) Experiment-3 (corresponding to Section 4.5):
exp3_table4.csv --- Malicious ad libraries analyzed by VT.
is_adware.csv --- Brand-new adware found in the real-world environment.
other_AN-label.csv --- The novel ad libs detected by ANDetect from $AP_{adware}$ and $AP_{truth}$ and checked manually. The statistical results are shown in Fig. 5 of the revised paper.
compute_label_rate.py --- This script computes the relevance between each advertising network and the "adware" label given by VT.
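As a rough illustration of the kind of computation compute_label_rate.py describes, the sketch below counts, for each ad network, the fraction of apps whose VT label contains "adware". It is not the repository's script: the column names `ad_network` and `vt_label`, the function name `adware_label_rate`, and the sample rows are all assumptions made up for this example, and the real CSV files may be laid out differently.

```python
import csv
import io
from collections import defaultdict


def adware_label_rate(rows):
    """Return {ad_network: fraction of its rows whose VT label contains 'adware'}."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for row in rows:
        network = row["ad_network"]      # hypothetical column name
        label = row["vt_label"].lower()  # hypothetical column name
        totals[network] += 1
        if "adware" in label:
            flagged[network] += 1
    return {network: flagged[network] / totals[network] for network in totals}


# Invented rows for illustration only; they are not taken from the artifact CSVs.
SAMPLE = """ad_network,vt_label
NetA,Adware.Generic
NetA,clean
NetB,clean
NetB,clean
"""

rates = adware_label_rate(csv.DictReader(io.StringIO(SAMPLE)))
for network, rate in sorted(rates.items()):
    print(f"{network}: {rate:.2f}")
```

To try the same idea on a real file, the `io.StringIO(SAMPLE)` argument could be replaced by an open file handle, after adjusting the two column names to match that file's header.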
Note: Some artifacts are not publicly available for one of the following reasons:
Follow-up research is still ongoing and will require the use of these data, for instance, the different versions of advertising libraries.
The data in the experiment have been updated in the revised paper, such as Table 4 and Table 7 in Section 4.5.