RQ3: How effective are combined multi-representation models?

In this research question, we build and evaluate two combined models that rely on different code representations:

  • A CloneDetector that, based on the different representations, predicts whether or not two code fragments are clones.

  • A CloneClassifier that, based on the different representations, predicts not only whether two code fragments are clones, but also their clone type.

Both models rely on Random Forest.

CloneDetector

The first zipped file contains the dataset used to train and test the model (using 10-fold cross-validation). The dataset is extracted from the manual validation performed in RQ1.
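As a rough illustration of the 10-fold cross-validation mentioned above, the sketch below splits sample indices into ten folds and uses each fold once as the test set. It is not the code used in the study: the function name `k_fold_indices`, the sample count, and the shuffling seed are assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration; not taken from the replication package.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one sample.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        # Training set: all samples from the other k - 1 folds.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Example with an assumed dataset of 25 labeled pairs.
splits = list(k_fold_indices(n_samples=25, k=10))
```

Each sample ends up in exactly one test fold, so every labeled pair is used for testing once and for training nine times.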

The second zipped file contains the logs of each training/testing execution, covering all possible subsets of representations.
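To give an idea of how many executions "all possible subsets" implies, the following sketch enumerates every non-empty subset of a list of representations. The representation names are placeholders, not the ones used in the study, and skipping the empty subset is an assumption (a model needs at least one representation as input).

```python
from itertools import combinations

# Placeholder names; the actual representations are defined in the study.
REPRESENTATIONS = ["rep_a", "rep_b", "rep_c", "rep_d"]


def non_empty_subsets(items):
    """Yield every non-empty subset of items, smallest subsets first."""
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            yield subset


subsets = list(non_empty_subsets(REPRESENTATIONS))
```

With n representations this yields 2^n - 1 subsets, so four placeholder representations give 15 training/testing executions.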

CloneClassifier

The first zipped file contains the dataset used to train and test the model (using 10-fold cross-validation). The dataset is extracted from the manual validation performed in RQ1.

The second zipped file contains the logs of each training/testing execution, covering all possible subsets of representations.