Are Our Clone Detectors Good Enough? An Empirical Study of Code Effects by Obfuscation

THE STRATEGIES LEVERAGED IN OUR WORK

In this work, we construct 69 strategies of 39 types, which include both basic and combined strategies. The strategies and their corresponding obfuscators are listed as follows.

THE CHANGE OF THE DETECTION RATE OF THE CLONE DETECTION TOOLS UNDER THE IMPACT OF STRATEGIES OF THE OBFUSCATORS

Using the strategies of each obfuscator, we evaluate their impact on the detection rate of the clone detectors selected in this work on the true clone pairs (Type 1, 2, 3, 4, 5). The impact is measured by the change in Precision, Recall and F1-score, where a positive number means a drop and a negative number means an increase.
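The sign convention above can be illustrated with a minimal sketch. It assumes the standard definitions of Precision, Recall and F1-score computed from true positive, false positive and false negative counts, and it assumes the change is taken as the original value minus the value after obfuscation, which matches "positive means a drop". The names `Counts`, `metrics` and `metric_change` are hypothetical and are not taken from the tooling used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Detection counts of one clone detector on one data set (hypothetical container)."""
    tp: int  # true clone pairs reported as clones
    fp: int  # non-clone pairs reported as clones
    fn: int  # true clone pairs that were missed


def metrics(c: Counts) -> dict:
    """Standard Precision, Recall and F1-score; 0.0 when a denominator is zero."""
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def metric_change(original: Counts, obfuscated: Counts) -> dict:
    """Assumed convention: original minus obfuscated, so a positive value is a drop."""
    before = metrics(original)
    after = metrics(obfuscated)
    return {name: before[name] - after[name] for name in before}


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the study.
    change = metric_change(Counts(tp=80, fp=20, fn=20), Counts(tp=40, fp=10, fn=60))
    for name, value in change.items():
        print(f"{name}: {value:+.4f}")
```

Under this convention, a detector whose Recall falls after obfuscation gets a positive Recall entry, and a detector that happens to improve gets a negative one.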

Impact of Strategies of Radon

We construct 20 strategies with the obfuscator Radon and test them on the six clone detectors selected in our work. Specifically, 20 obfuscated test data sets are generated, one per strategy, and run through the clone detection tools. We then calculate the detection results and their difference from those of the original data set before obfuscation. The statistical results are listed as follows.

Impact of Strategies of JBCO

We construct 20 strategies with the obfuscator JBCO and test them on the six clone detectors selected in our work. Specifically, 20 obfuscated test data sets are generated, one per strategy, and run through the clone detection tools. We then calculate the detection results and their difference from those of the original data set before obfuscation. The statistical results are listed as follows.

Impact of Strategies of Obfuscator

We construct 26 strategies with the obfuscator Obfuscator and test them on the six clone detectors selected in our work. Specifically, 26 obfuscated test data sets are generated, one per strategy, and run through the clone detection tools. We then calculate the detection results and their difference from those of the original data set before obfuscation. The statistical results are listed as follows.

Impact of Strategies of JODE, ProGuard, yGuard

As each of these three obfuscators originally provides only one strategy, we test that strategy for each of them on the six clone detectors in this work. Specifically, one obfuscated test data set is generated for the single strategy of each obfuscator and run through the clone detection tools. We then calculate the detection results and their difference from those of the original data set before obfuscation. The statistical results are listed as follows; from left to right: JODE, ProGuard, yGuard.

THE CHANGE OF THE DETECTION RATE OF THE CLONE DETECTION TOOLS ON THE FALSE CLONE PAIRS UNDER THE IMPACT OF STRATEGIES OF THE OBFUSCATORS

We measure the impact of the obfuscators not only on the detection rate of the true clone pairs but also on that of the false ones. The shaded parts of the tables in this section indicate an increase in the false positive rate after obfuscation, which deserves special attention. The experiment process and configuration are the same as for the true clone pairs.
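As a rough illustration of how a shaded cell could be determined, the sketch below assumes that the false positive rate is the share of false clone pairs that a detector reports as clones, and that a cell is shaded whenever this rate is higher after obfuscation than before. Both the definition and the function names `fp_rate` and `fp_rate_increased` are assumptions made for illustration, not the exact procedure used in this work.

```python
def fp_rate(reported_as_clone: int, total_false_pairs: int) -> float:
    """Share of false clone pairs reported as clones (assumed definition)."""
    if total_false_pairs <= 0:
        raise ValueError("total_false_pairs must be positive")
    if not 0 <= reported_as_clone <= total_false_pairs:
        raise ValueError("reported_as_clone must be between 0 and total_false_pairs")
    return reported_as_clone / total_false_pairs


def fp_rate_increased(original_rate: float, obfuscated_rate: float) -> bool:
    """True when obfuscation raised the false positive rate (a shaded cell, by assumption)."""
    return obfuscated_rate > original_rate


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the study.
    before = fp_rate(reported_as_clone=5, total_false_pairs=100)
    after = fp_rate(reported_as_clone=12, total_false_pairs=100)
    print(before, after, fp_rate_increased(before, after))
```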

Original FP Rate of the Clone Detectors

Impact of Strategies of Radon

Impact of Strategies of JBCO

Impact of Strategies of Obfuscator

Impact of Strategies of JODE, ProGuard, yGuard

THE RUNNABILITY OF THE PROGRAMS OBFUSCATED BY THE OBFUSCATORS SELECTED IN OUR WORK

We select 300 programs from Google Code Jam and test whether they still run after being obfuscated with the strategies that proved representative for each obfuscator during the evaluation. Specifically, we select 34 strategies in total and generate 34 groups of obfuscated programs. The relevant details are listed in the following table.

THE TYPES OF THE TRUE CLONE PAIRS & FALSE CLONE PAIRS AFTER DECOMPILATION

When generating the original data set, we need to make sure that both the true and the false clone pairs keep their original clone type after decompilation. We therefore randomly select 100 records covering both true and false clone pairs and analyze them manually. The analysis shows that 92 of the 100 pairs (92%) retain their original clone type; the remaining individual failures are due to compilation optimization. The related details are listed in the following table.

CONFIGURATIONS OF OBFUSCATION TOOLS