Source Code
We report here all the source code used and developed in our study:
- RNN Encoder-Decoder Model
- Neural Code Translator
- Code Abstraction Tool
- External Tools
RNN Encoder-Decoder Model
The NMT model is based on the RNN Encoder-Decoder architecture and relies on the seq2seq implementation.
We provide additional scripts and configurations based on seq2seq. The code is organized as follows:
- seq2seq/: contains scripts for training and inference
  - train_test_small.sh: script that performs training and testing (for small-size methods)
  - train_test_medium.sh: script that performs training and testing (for medium-size methods)
  - inference.sh: script that performs inference and evaluates the performance of the model (invoked automatically by the two scripts above)
  - configs/: contains configurations for the architecture
- dataset/: contains datasets (training, validation, and test sets)
  - 50/: small-size methods dataset
  - 50-100/: medium-size methods dataset
Usage
Before using our code, make sure you have a working installation of seq2seq. Next, download our code and run the appropriate script in the seq2seq/ folder:
./train_test_<small/medium>.sh <dataset_path> <num_iterations> <model_path> <config_ID>
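Since the scripts assume a working seq2seq installation, a quick pre-flight check can save a failed run. The helper below is a hypothetical sketch (it is not part of the released scripts), and `python3` is an assumption: substitute whichever interpreter your seq2seq installation uses.

```shell
# Hypothetical pre-flight check (not part of the released scripts):
# verify that the seq2seq package is importable before launching training.
check_seq2seq() {
  if python3 -c "import seq2seq" >/dev/null 2>&1; then
    echo "seq2seq found"
  else
    echo "seq2seq not found: install it before running the scripts"
  fi
}

check_seq2seq
```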
Parameters
- <dataset_path>: path to the dataset containing the folders train, eval, and test (see the dataset folder)
- <num_iterations>: number of training iterations
- <model_path>: path where to save the model's checkpoints and predictions
- <config_ID>: numerical ID of the configuration to be used; the available IDs range from 1 to 10 and are described here
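As a sketch of what <dataset_path> must contain, the hypothetical helper below (not part of the released code) checks for the three required subfolders before a run:

```shell
# Hypothetical sanity check (not part of the released code): confirm that a
# dataset folder contains the train, eval, and test subfolders the scripts expect.
check_dataset() {
  for sub in train eval test; do
    if [ ! -d "$1/$sub" ]; then
      echo "missing $1/$sub"
      return 1
    fi
  done
  echo "dataset layout OK: $1"
}
```

For example, `check_dataset ../dataset/50/` before invoking the training script.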
Example
./train_test_small.sh ../dataset/50/ 50000 ../model/50/ 10
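Since the configuration ID is just a positional argument, sweeping all ten configurations is a short loop. The snippet below is a hypothetical dry run (it echoes the commands instead of executing them), and the per-configuration model subfolders are made up for illustration.

```shell
# Hypothetical dry run (commands are echoed, not executed): sweep all ten
# configuration IDs, saving each model under its own illustrative subfolder.
sweep_configs() {
  for config_id in 1 2 3 4 5 6 7 8 9 10; do
    echo "./train_test_small.sh ../dataset/50/ 50000 ../model/50-cfg${config_id}/ ${config_id}"
  done
}

sweep_configs
```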
Downloads
Download Source Code (10 MB)
Neural Code Translator
Code Abstraction Tool
External Tools
- AST Differencing Tool - GumTreeDiff
- Java Lexer - ANTLR
- Java Parser - JavaParser
- doc2vec