- The Python Jupyter Notebook scripts used to automate the data collection process* can be found here. More specifically, the link includes the following scripts:
- Fetch GitHub projects.ipynb - fetch multiple consecutive versions of a selected application from its GitHub repository
- Perform Gradle/Maven Analysis.ipynb - compile the fetched versions of a selected application using Gradle or Maven and then analyze each version using the SonarQube tool
- Fetch SonarQube measures.ipynb - fetch the analysis results (TD-related measurements) of a selected application from the SonarQube API and store them in .csv format (a minimal sketch of this step is provided at the end of this section)
- The Python Jupyter Notebook script that was used during the data exploration and feature selection process can be found here. More specifically, the link includes the following script:
- Data Exploration and Feature Selection.ipynb - generate descriptive statistics and perform correlation, univariate, and multivariate analysis on the extended dataset (a minimal sketch of this step is provided at the end of this section)
- The boxplots of TD indicators generated during the data exploration process can be found here.
- The "backward elimination" intermediate results (comprising a descriptive table and the Python logs) can be found here.
- The dataset files that were used during the TD forecasting process can be found here. More specifically, the link includes the following .csv files:
- _benchmark_repository_measures.csv - the extended dataset that was used for the feature selection process (i.e., descriptive statistics, correlation, univariate, and multivariate analysis); it contains TD-related metrics extracted from SonarQube and CKJM Extended
- 15 .csv files (one per open-source application) that contain the TD metrics and measurements used as input for the TD forecasting models
- 2 anonymised .csv files that contain TD metrics and measurements of the 2 anonymised industrial applications (Project A and Project B), used within the context of the case study
- The Python Jupyter Notebook scripts that were used during the TD forecasting process can be found here. More specifically, the link includes the following scripts:
- 15 .ipynb scripts - the scripts that were used to train, test, benchmark, and execute the TD forecasting models for each application (a sketch of the Direct forecasting setup with Random Forest is provided at the end of this section)
- Indicative visualizations of the forecasting results generated during the model execution process can be found here. More specifically, the link includes the following figures:
- 15 figures illustrating TD Principal forecasting results for 20 versions ahead using Random Forest and the Direct approach, for each application under investigation
- 2 anonymised figures illustrating TD Principal forecasting results for 10 versions ahead for the 2 anonymised industrial applications (Project A and Project B), used within the context of the case study
- The figures of the various TD Principal trend cases identified for the 2 anonymised industrial applications (Project A and Project B) within the context of the case study can be found here. More specifically, the link includes the following figures:
- 4 figures illustrating abrupt TD Principal trends of Project A
- 3 figures illustrating abrupt TD Principal trends of Project B
* The analysis of selected applications using CKJM Extended could not be automated due to tool limitations and therefore was performed manually
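As a complement to the script descriptions above, the following is a minimal sketch of how TD-related measures for an analysed version could be retrieved from the SonarQube Web API and appended to a .csv file. The server URL, project key, and metric keys are illustrative assumptions and do not necessarily match the exact values used in Fetch SonarQube measures.ipynb.

```python
# Sketch: fetch TD-related measures for one analysed version from the
# SonarQube Web API and append them to a .csv file.
# The server URL, project key and metric keys are assumptions for illustration.
import csv
import requests

SONARQUBE_URL = "http://localhost:9000"          # assumed local SonarQube server
PROJECT_KEY = "example-application"              # hypothetical project key
METRIC_KEYS = ["sqale_index", "code_smells", "bugs", "ncloc", "complexity"]

def fetch_measures(project_key):
    """Return a dict {metric_key: value} for the given SonarQube project."""
    response = requests.get(
        f"{SONARQUBE_URL}/api/measures/component",
        params={"component": project_key, "metricKeys": ",".join(METRIC_KEYS)},
    )
    response.raise_for_status()
    measures = response.json()["component"]["measures"]
    return {m["metric"]: m["value"] for m in measures}

if __name__ == "__main__":
    row = fetch_measures(PROJECT_KEY)
    with open(f"{PROJECT_KEY}_measures.csv", "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=METRIC_KEYS)
        if f.tell() == 0:          # write the header only for a new file
            writer.writeheader()
        writer.writerow(row)
```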
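The data exploration step (descriptive statistics, correlation analysis, and boxplots of TD indicators) can be sketched along the following lines, assuming the extended dataset is available as _benchmark_repository_measures.csv with one TD indicator per column. The concrete column names and the choice of Spearman correlation are assumptions for illustration, not necessarily the configuration of the original notebook.

```python
# Sketch: descriptive statistics, correlation analysis and boxplots of
# TD indicators, assuming _benchmark_repository_measures.csv as input.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("_benchmark_repository_measures.csv")

# Descriptive statistics for every numeric TD indicator
print(df.describe())

# Rank-based (Spearman) correlation between indicators
corr = df.select_dtypes("number").corr(method="spearman")
print(corr)

# Boxplots of selected TD indicators; the column names are assumptions
df.boxplot(column=["sqale_index", "code_smells", "complexity"])
plt.tight_layout()
plt.savefig("td_indicator_boxplots.png")
```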
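The backward elimination whose intermediate results are linked above can be approximated by the standard p-value-driven procedure sketched below; the target column, the set of candidate predictors, and the significance threshold are assumptions rather than the exact configuration used to produce the linked table and logs.

```python
# Sketch: p-value-based backward elimination over candidate TD metrics.
# Target column and threshold are assumptions for illustration.
import pandas as pd
import statsmodels.api as sm

def backward_elimination(X, y, threshold=0.05):
    """Iteratively drop the feature with the highest p-value above the threshold."""
    features = list(X.columns)
    while features:
        model = sm.OLS(y, sm.add_constant(X[features])).fit()
        pvalues = model.pvalues.drop("const")
        worst = pvalues.idxmax()
        if pvalues[worst] <= threshold:
            break                       # all remaining features are significant
        features.remove(worst)          # eliminate the least significant feature
    return features

df = pd.read_csv("_benchmark_repository_measures.csv")
y = df["sqale_index"]                               # assumed target: TD Principal
X = df.drop(columns=["sqale_index"]).select_dtypes("number")
print(backward_elimination(X, y))
```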
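Finally, a minimal sketch of the Direct multi-step forecasting strategy with Random Forest referenced in the forecasting scripts and figures above: one regressor is trained per horizon, each predicting TD Principal h versions ahead from a window of lagged values. The file name, column name, number of lags, and hyperparameters are assumptions; the original notebooks may differ.

```python
# Sketch: Direct multi-step forecasting of TD Principal with Random Forest.
# One model is trained per horizon (1..20 versions ahead) on lag features.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

N_LAGS = 4          # number of past versions used as predictors (assumed)
HORIZONS = 20       # forecast up to 20 versions ahead

# Assumed input: one .csv per application with a TD Principal column
series = pd.read_csv("application_measures.csv")["sqale_index"].to_numpy()

def make_supervised(series, n_lags, horizon):
    """Build (X, y) pairs: X holds n_lags past values, y the value horizon steps ahead."""
    X, y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])
        y.append(series[t + horizon - 1])
    return np.array(X), np.array(y)

# Direct strategy: a separate Random Forest per forecasting horizon
models = {}
for h in range(1, HORIZONS + 1):
    X, y = make_supervised(series, N_LAGS, h)
    models[h] = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Forecast the next 20 versions from the most recent window of observations
last_window = series[-N_LAGS:].reshape(1, -1)
forecast = [models[h].predict(last_window)[0] for h in range(1, HORIZONS + 1)]
print(forecast)
```

Note that this setup requires at least N_LAGS + HORIZONS observed versions per application; shorter series would need fewer lags or a smaller maximum horizon.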