Summary:
Determining the most appropriate learning technique(s) is vital for accurate and effective software fault prediction (SFP). Learning techniques previously used for SFP have shown varying performance across software projects, and none of them has consistently performed best on all projects. This problem can be addressed by an approach that partitions the fault dataset into module subsets, trains learning techniques for each subset, and integrates the outcomes of all the learning techniques. This paper presents an approach that dynamically selects learning techniques to predict the number of software faults. For a given test module, the presented approach first uses a distance function to locate the neighbor module subset, that is, the subset containing modules similar to the test module. It then chooses the best learning technique in the region of that module subset to make the prediction for the test module. The learning technique is selected based on its past performance in the region of the module subset. We have evaluated the proposed approach using fault datasets collected from the PROMISE data repository and the Eclipse bug data repository.
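The selection step summarized above can be illustrated with a minimal sketch. It is not the script provided on this page. It assumes Euclidean distance as the distance function, the k nearest validation modules as the neighbor module subset, and mean absolute error as the measure of past performance. The function names, the toy learners, and the data are hypothetical and serve only as illustration.

```python
import math

def euclidean(a, b):
    # Assumed distance function between two module feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_and_predict(test_module, validation, learners, k=3):
    """Pick the learner with the lowest local error and use it to predict.

    validation: list of (features, actual_fault_count) pairs
    learners:   dict mapping a name to an already trained predict function
    """
    # Neighbor module subset: the k validation modules closest to the test module.
    neighbors = sorted(validation, key=lambda m: euclidean(m[0], test_module))[:k]

    def local_error(predict):
        # Mean absolute error of one learner on the neighbor subset.
        return sum(abs(predict(f) - y) for f, y in neighbors) / len(neighbors)

    best = min(learners, key=lambda name: local_error(learners[name]))
    return best, learners[best](test_module)

# Hypothetical modules: (lines of code, complexity) -> number of faults.
validation = [
    ((10, 1), 1), ((12, 2), 1), ((15, 1), 1),
    ((100, 5), 10), ((120, 6), 12), ((110, 4), 11),
]

# Toy stand-ins for trained learning techniques.
learners = {
    "size_based": lambda f: f[0] / 10,
    "constant": lambda f: 1.0,
}

small_choice = select_and_predict((11, 1), validation, learners)
large_choice = select_and_predict((105, 5), validation, learners)
print(small_choice, large_choice)
```

In this toy setting a different learner is chosen for the small module than for the large one, which is the behavior the dynamic selection is meant to achieve. The actual experiments may use a different distance function, subset construction, and performance measure; see the instruction file for the settings used.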
This web page provides detailed steps to run the experiments. It also provides details about the tools and techniques used in the experiments and their parameter values, together with the software fault datasets used and the script files required to run the experiments.
1. Instruction file
2. Software fault datasets used [PROMISE Data Repository, http://openscience.us/repo/]
3. Script files