Data
We report here all the data used, extracted, and generated in our study:
Bug-Fixes
Datasets & Predictions
Idioms
Manual Evaluation
BLEU Score Tests
Clustering - Silhouette
Bug-Fixes
Bug-Fixing Commits
Metadata of the bug-fixing commits, extracted during mining. The CSV file contains the following fields:
ID : Commit hash ID
Repo_URL : GitHub URL of the repository
Commit_URL : GitHub URL of the bug-fixing commit
Message : Commit message of the bug-fixing commit
Data
Download CSV file (900 MB)
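As a rough illustration of how the metadata file could be consumed, the sketch below streams the rows instead of loading the whole 900 MB file into memory. It assumes the CSV is comma-separated, UTF-8 encoded, and starts with a header row using the field names listed above; none of this is confirmed by the page. The function names and the in-memory sample are hypothetical.

```python
import csv
import io
from typing import Dict, Iterator, TextIO

# Field names as listed above; assumed to match the CSV header exactly.
FIELDS = ("ID", "Repo_URL", "Commit_URL", "Message")


def iter_commits(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one dict per bug-fixing commit, reading row by row."""
    reader = csv.DictReader(handle)
    missing = [f for f in FIELDS if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    for row in reader:
        yield {f: row[f] for f in FIELDS}


def count_commits_per_repo(handle: TextIO) -> Dict[str, int]:
    """Count bug-fixing commits per repository URL."""
    counts: Dict[str, int] = {}
    for commit in iter_commits(handle):
        counts[commit["Repo_URL"]] = counts.get(commit["Repo_URL"], 0) + 1
    return counts


# Tiny in-memory stand-in for the real file (hypothetical values).
sample = io.StringIO(
    "ID,Repo_URL,Commit_URL,Message\n"
    'a1,https://github.com/o/r,https://github.com/o/r/commit/a1,"fix NPE\nin parser"\n'
    "b2,https://github.com/o/r,https://github.com/o/r/commit/b2,fix typo\n"
)
print(count_commits_per_repo(sample))
```

For the real file, open it with `open(path, newline="", encoding="utf-8")` so that multi-line commit messages inside quoted fields are parsed correctly. Very long messages may require raising the limit with `csv.field_size_limit`.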
Code from Bug-Fixes
Raw source code extracted from the bug-fixing commits.
Each bug-fixing commit is represented by a folder named after the commit hash ID. Each folder contains two sub-folders:
P_DIR: Java source code files before the bug-fixing commit
F_DIR: Java source code files after the bug-fixing commit
Data
Download data (15 GB)
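The layout above can be traversed with a short script. The sketch below pairs each Java file in `P_DIR` with its counterpart in `F_DIR`. It assumes that a file keeps the same relative path in both sub-folders, which the page does not state; files present on only one side are skipped. The function name `pair_files` is hypothetical.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def _java_files(root: Path) -> Dict[Path, Path]:
    """Map each .java file under root to its path relative to root."""
    if not root.is_dir():
        return {}
    return {p.relative_to(root): p for p in root.rglob("*.java")}


def pair_files(commit_dir: Path) -> List[Tuple[Path, Path]]:
    """Return (before, after) file pairs for one commit folder.

    Assumption: a file has the same relative path in P_DIR and F_DIR.
    """
    before = _java_files(commit_dir / "P_DIR")
    after = _java_files(commit_dir / "F_DIR")
    shared = sorted(before.keys() & after.keys())
    return [(before[rel], after[rel]) for rel in shared]
```

Each commit folder in the extracted archive can be passed to `pair_files`, and the returned pairs fed to a diff or parsing tool of choice.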
Extracted Bug-Fix Pairs (BFP)
Method pairs extracted from the bug-fixing commits.
Each bug-fix is represented by a folder named after the corresponding commit hash ID. Each bug-fix folder contains a first level of folders representing the files, then a second level of folders representing the methods. Each method folder contains the following files:
before.java : Method's source code before the fix
after.java : Method's source code after the fix
operations.txt : AST operations performed on the method as extracted by GumTreeDiff
signature.txt : Fully qualified signatures of the method before/after the fix
Data
Download data (7 GB)
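The three-level layout described above (commit, file, method) could be loaded as follows. This is a minimal sketch: it assumes the four files are UTF-8 text and skips any method folder where one of them is missing. The `BugFixPair` class and `iter_bug_fix_pairs` function are hypothetical names, not part of the released data.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator, List

# File names as listed above.
REQUIRED = ("before.java", "after.java", "operations.txt", "signature.txt")


@dataclass
class BugFixPair:
    commit: str      # name of the commit folder (commit hash ID)
    file_dir: str    # name of the file-level folder
    method_dir: str  # name of the method-level folder
    before: str
    after: str
    operations: str
    signature: str


def _subdirs(path: Path) -> List[Path]:
    """Return the sub-folders of path in sorted order."""
    return sorted(p for p in path.iterdir() if p.is_dir())


def iter_bug_fix_pairs(root: Path) -> Iterator[BugFixPair]:
    """Walk root/<commit>/<file>/<method>/ and yield one record per method."""
    for commit in _subdirs(root):
        for file_dir in _subdirs(commit):
            for method in _subdirs(file_dir):
                if not all((method / name).is_file() for name in REQUIRED):
                    continue  # incomplete folder: skipped in this sketch
                texts = [
                    (method / name).read_text(encoding="utf-8", errors="replace")
                    for name in REQUIRED
                ]
                yield BugFixPair(commit.name, file_dir.name, method.name, *texts)
```

Because the function is a generator, the 7 GB dataset can be processed one method pair at a time without holding everything in memory.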
Datasets & Predictions
Idioms
Manual Evaluation
We share the code sample and the evaluation performed to assess the characteristics of the mutants generated by the models.
Code Sample
Download data - each text file contains the fixed code and, below, the mutant generated by the model.
Results
Spreadsheet - contains the evaluation performed by the judges.
BLEU Score Tests
We share the source code and the logs for the BLEU score tests.
Source Code
Download code - used to run the statistical tests comparing the models against the baseline.
Logs
Download logs - output logs of the tests.
Clustering - Silhouette
Download data - script to compute the silhouette values, together with the resulting distributions (boxplots).