In automated testing, especially test generation in the absence of specifications, a large amount of manual effort is spent on test-result inspection. Test selection reduces this effort by selecting a small subset of tests that are likely to reveal faults. A promising test-selection approach is to dynamically mine operational models as potential test oracles and then select tests that violate them. Existing work adopting this approach mines operational models based on dynamic invariant detection. In this paper, we propose to mine common operational models, which hold in most, but not necessarily all, observed traces, from a (potentially large) set of unverified tests. Specifically, our approach collects branch coverage and data value bounds at runtime and, after running all the tests, mines implication relationships between branches and constraints on data values as potential operational models. Our approach then selects tests that violate the mined common operational models for result inspection. We have evaluated our approach on a set of programs, comparing it with previous code-coverage-based, clustering-based, dynamic-invariant-based, and random selection approaches. The experimental results show that our approach more effectively reduces the number of tests for result inspection while still revealing most of the faults.
The subjects used in our evaluation are as shown below:
The characteristics of the subject programs are shown in the table below. Among these subjects, there are many trivial faults that fail on more than 5% of the tests. Such faults are less likely to be seen in practical settings. To better evaluate the potential effectiveness of test-selection approaches in practice, we conduct additional experiments on only the nontrivial faults.
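As a small illustration of the criterion above, the following Python sketch partitions faults by their failure rate; the function name and the input representation (a mapping from fault id to the number of failing tests) are illustrative, not part of the paper's artifacts.

```python
def split_faults(failures_per_fault, total_tests, threshold=0.05):
    """Partition faults into trivial ones (failure rate above the
    threshold, 5% by default) and nontrivial ones."""
    trivial, nontrivial = [], []
    for fault, fails in failures_per_fault.items():
        if fails / total_tests > threshold:
            trivial.append(fault)
        else:
            nontrivial.append(fault)
    return trivial, nontrivial
```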
All of these subjects and the test suites are downloaded from the Subject Infrastructure Repository.
We collect the execution traces using the Cooperative Bug Isolation (CBI) tools. Specifically, we use two instrumentation schemes, "branches" and "bounds", to collect branch coverage and data value bounds, respectively.
We compare our approach with the following existing test selection approaches:
We implement our approach, random selection, the code-coverage-based approach, and the clustering-based approach in MATLAB. For the (dynamic-invariant-based) operational difference approach, we adopt the results presented by Harder et al. (they presented results on the Siemens suite and the Space program). We do not re-implement that approach or run it on the grep program, because in our preliminary experiments it took a long time to repeatedly check and update dynamic invariants.
Source Code: Our Approach, Random Selection, Code-coverage-based Approach, Clustering-based Approach.
Example Inputs (execution traces of the 1st version of print_tokens): trace_branches.mat, trace_bounds.mat, run_result.
(a) Results for the Siemens suite
(b) Results for the Space program
(c) Results for the grep program
Figure 1: Results of all the faults
(a) Results for the Siemens suite
(b) Results for the Space program
(c) Results for the grep program
Figure 2: Results of the nontrivial faults
Our approach mines two kinds of rules: control rules (implication relationships between branches) and data rules (constraints on data values). In the figures, "Control Rules" indicates the effectiveness of our approach when considering only control rules, and "Data Rules" indicates its effectiveness when considering only data rules.
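To make the two rule kinds concrete, the sketch below mines control rules (branch implications that hold in at least a support fraction of the traces covering the antecedent branch) and data rules (upper bounds on observed values that hold in at least a support fraction of traces), and then selects the tests that violate any mined rule. This is a minimal Python sketch under assumed data representations (each trace is a set of covered branches plus a map of observed variable values) and an illustrative support threshold; the actual implementation is in MATLAB and may differ in details.

```python
import math

def mine_control_rules(traces, min_support=0.9):
    """Mine branch implications b1 => b2 that hold in at least
    min_support of the traces covering b1 (common, not universal)."""
    branches = set().union(*(t["branches"] for t in traces))
    rules = []
    for b1 in branches:
        covering = [t for t in traces if b1 in t["branches"]]
        if not covering:
            continue
        for b2 in branches - {b1}:
            hold = sum(1 for t in covering if b2 in t["branches"])
            if hold / len(covering) >= min_support:
                rules.append((b1, b2))
    return rules

def mine_data_rules(traces, min_support=0.9):
    """Mine upper bounds var <= u that hold in at least min_support
    of the traces in which var was observed."""
    rules = {}
    variables = set().union(*(t["values"].keys() for t in traces))
    for v in variables:
        vals = sorted(t["values"][v] for t in traces if v in t["values"])
        k = math.ceil(min_support * len(vals))  # keep the smallest k values
        rules[v] = vals[k - 1]                  # bound = k-th smallest value
    return rules

def select_tests(traces, control_rules, data_rules):
    """Select (indices of) tests that violate any mined rule,
    as candidates for result inspection."""
    selected = []
    for i, t in enumerate(traces):
        ctrl_violated = any(b1 in t["branches"] and b2 not in t["branches"]
                            for b1, b2 in control_rules)
        data_violated = any(t["values"].get(v, u) > u
                            for v, u in data_rules.items())
        if ctrl_violated or data_violated:
            selected.append(i)
    return selected
```

For example, given three traces where two cover branches {a, b} with small values of x and one covers {a, c} with a large value of x, a support threshold of 0.6 mines the implication a => b and an upper bound on x, and the third test is selected because it violates both.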
All of the experiments were carried out on a 1.7GHz Intel Pentium Dual-Core PC with 1.5GB physical memory, running Fedora Core 6.
Thank you for your interest in our work. Please contact Wujie Zheng for any additional information.