Source Code

We report here all the source code used and developed in our study:

  • RNN Encoder-Decoder Model

  • Neural Code Translator

  • Code Abstraction Tool

  • External Tools

RNN Encoder-Decoder Model

The NMT model is based on the RNN Encoder-Decoder architecture. We rely on the seq2seq implementation.

We provide additional scripts and configurations based on seq2seq. The code is organized as follows:

  • seq2seq/ : contains scripts for training and inference

    • train_test_small.sh : script that performs training and testing (for small-size methods)

    • train_test_medium.sh : script that performs training and testing (for medium-size methods)

    • inference.sh : script that performs inference and evaluates the performance of the model (it is invoked automatically by the two scripts above)

    • configs/ : contains configurations for the architecture

  • dataset/ : contains datasets (training, validation, and test sets)

    • 50/ : small-size methods dataset

    • 50-100/ : medium-size methods dataset


Usage

Before using our code, make sure you have a working installation of seq2seq. Next, download our code and run one of the training scripts in the folder seq2seq/:

./train_test_<small/medium>.sh <dataset_path> <num_iterations> <model_path> <config_ID>

Parameters

  • <dataset_path> : path to the dataset containing the folders: train, eval, test (see dataset folder)

  • <num_iterations> : number of training iterations

  • <model_path> : path where the model's checkpoints and predictions are saved

  • <config_ID> : numerical ID of the configuration to be used. The available IDs range from 1 to 10, and they are described here.
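
The constraints listed above can be checked before launching a long training run. The following is a minimal sketch and is not part of the released scripts: the function names check_dataset and check_config_id are hypothetical, and the checks assume only what the parameter list states (a dataset folder containing train, eval, and test, and a configuration ID between 1 and 10).

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight checks; these helpers are NOT shipped with the code.

# Succeeds only if <dataset_path> contains the folders train, eval, and test.
check_dataset() {
    local dataset_path="$1"
    local split
    for split in train eval test; do
        if [ ! -d "$dataset_path/$split" ]; then
            echo "missing folder: $dataset_path/$split" >&2
            return 1
        fi
    done
}

# Succeeds only if <config_ID> is an integer from 1 to 10.
check_config_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 10 ]
}
```

One possible use is to chain the checks in front of the training command, for example: check_dataset ../dataset/50/ && check_config_id 10 && ./train_test_small.sh ../dataset/50/ 50000 ../model/50/ 10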

Example

./train_test_small.sh ../dataset/50/ 50000 ../model/50/ 10
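
To try several configurations, the same command can be repeated with a different <config_ID>. The sketch below only prints the commands and does not run them. The helper name print_sweep and the per-configuration folder layout (config_1/, config_2/, ...) under the model path are assumptions made for illustration, not conventions of the released scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints (does not execute) one command per config ID.
# Arguments: <small|medium> <dataset_path> <num_iterations> <model_root>
print_sweep() {
    local size="$1" dataset_path="$2" num_iterations="$3" model_root="$4"
    local config_id
    for config_id in 1 2 3 4 5 6 7 8 9 10; do
        # Assumed layout: a separate model folder per configuration,
        # so checkpoints from different runs are kept apart.
        echo "./train_test_${size}.sh ${dataset_path} ${num_iterations} ${model_root}/config_${config_id}/ ${config_id}"
    done
}

print_sweep small ../dataset/50/ 50000 ../model/50
```

Reviewing the printed commands before running them makes it easy to confirm the dataset and model paths for each configuration.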


Downloads

Download Source Code (10 MB)

Neural Code Translator

NeuralCodeTranslator

Our general Encoder-Decoder model for code is available on GitHub!

Code Abstraction Tool

src2abs

The code abstraction tool src2abs is available on GitHub!

External Tools