Our first experiment consisted of carefully inspecting 97 commits from commons-collections (15), commons-lang (55) and commons-math (27), and manually labelling whether or not we expect to obtain a non-empty commit-relevant specification for each commit. The commits consist of code refactorings (i.e., no semantic delta is expected) as well as bug-fixing and bug-inducing commits, for which a non-empty semantic delta is expected.
You can find the manually assigned labels (see column 'delta_expected') in the following files: subjects/d4j_ALL_GT_RQ1.csv and subjects/icse_ALL_GT_RQ1.csv.
Then, we ran DeltaSpec and measured its effectiveness in inferring commit-relevant specs.
Since many of the expected specifications cannot currently be expressed in the specification language supported by SpecFuzzer, we also manually added a label that indicates whether or not the delta spec is expressible in our supported language. See label 'is_explainable' in the csv files. A summary of all expressible subjects can be found in file subjects/ALL_expresable_subject.csv. We then measured the effectiveness of DeltaSpec in inferring expressible delta specs.
We analyzed the effectiveness of DeltaSpec for inferring the expected commit-relevant specifications. To do so, we executed DeltaSpec for each of the commits under analysis, and then inspected the output to determine whether it produced some delta specs (added or removed) in the cases where a delta was expected.
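To get a quick overview of the manual labels, the values of a column can be tallied directly from the csv files. The following is a minimal sketch and not part of the artifact: the helper name count_labels is hypothetical, and it assumes the files have a header row and contain no quoted fields with embedded commas. It makes no assumption about how the label values themselves are encoded.

```shell
# Hypothetical helper (not shipped with DeltaSpec): count how often each
# value occurs in one named column of a CSV file.
# Assumes a header row and no quoted fields that contain commas.
count_labels() {
  file=$1
  column=$2
  awk -F',' -v col="$column" '
    { sub(/\r$/, "") }                       # tolerate Windows line endings
    NR == 1 {                                # locate the column in the header
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      next
    }
    idx { counts[$idx]++ }                   # tally values once the column is known
    END { for (v in counts) print v, counts[v] }
  ' "$file" | sort
}
```

For instance, running count_labels subjects/d4j_ALL_GT_RQ1.csv delta_expected would print one line per distinct label value followed by its count, and the same call with is_explainable tallies the expressibility label. If the column name is not found in the header, nothing is printed.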
To run this experiment for the commons-collections project (the lang and math projects are handled similarly), you can execute the following commands:
$ export SUBJECTS_FILE=subjects/icse_ALL_subjects.csv
$ ./run-collections-rq1.sh
The script will do all the required setup, generate the test suites for the pre- and post-commit versions, and compute the commit-relevant specifications using DeltaSpec. At the end, the specs folder will contain one folder for each subject with the following files:
delta-added-inferred.txt (added commit-relevant specs)
delta-removed-inferred.txt (removed commit-relevant specs)
delta-preserved-inferred.txt (preserved specifications, outside the delta)
The full set of specifications that DeltaSpec inferred can be downloaded from this link.
DeltaSpec inferred some delta or preserved assertions for 81 of the 97 commits analysed; for the remaining ones, all produced assertions were invalidated and discarded by the test suites.
On average, 11%, 4% and 85% of the inferred assertions correspond to the sets of delta-added, delta-removed and preserved assertions, respectively. The figure shows the percentage of delta-added, delta-removed and preserved assertions among the total number of inferred assertions.
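The per-subject shares behind such percentages could be recomputed from the specs folder along the following lines. This is only a sketch: the helpers count_lines and delta_shares are hypothetical, and it assumes that each of the three output files holds one assertion per non-empty line, which may not match the actual file format.

```shell
# Hypothetical helpers (not shipped with DeltaSpec).

# Number of non-empty lines in a file, or 0 if the file does not exist.
# ASSUMPTION: one inferred assertion per non-empty line.
count_lines() {
  if [ -f "$1" ]; then
    grep -c . "$1" || true    # grep exits 1 on zero matches but still prints 0
  else
    echo 0
  fi
}

# For every subject folder under the given directory (default: specs), print
# the share of delta-added, delta-removed and preserved assertions.
delta_shares() {
  specs_dir=${1:-specs}
  for dir in "$specs_dir"/*/; do
    [ -d "$dir" ] || continue
    added=$(count_lines "$dir/delta-added-inferred.txt")
    removed=$(count_lines "$dir/delta-removed-inferred.txt")
    preserved=$(count_lines "$dir/delta-preserved-inferred.txt")
    total=$((added + removed + preserved))
    [ "$total" -gt 0 ] || continue           # skip subjects with no assertions
    awk -v s="$(basename "$dir")" -v a="$added" -v r="$removed" \
        -v p="$preserved" -v t="$total" \
      'BEGIN { printf "%s added=%.1f%% removed=%.1f%% preserved=%.1f%%\n",
               s, 100*a/t, 100*r/t, 100*p/t }'
  done
}
```

Calling delta_shares with no argument reads the specs folder in the current directory. Subjects for which all assertions were discarded are skipped rather than reported as 0%.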
DeltaSpec analysed 97 commits (112 classes), of which 48 and 53 commits (53 and 59 classes, resp.) were expected to have a non-empty and an empty delta, respectively. Additionally, since some of the commits involve changes that a commit-relevant specification could only capture with more expressiveness than the currently supported specification language offers, we also show the performance of DeltaSpec on the subset of expressible subjects, for which the semantic delta can, in principle, be captured with a specification in the current language. This subset comprises 65 commits and 78 classes.
DeltaSpec achieves an accuracy of 50% when inferring commit-relevant specifications, and 71% when considering only commits whose specifications can be expressed in DeltaSpec’s supported specification language.