Summary
Test coverage is an important metric of software quality, since it indicates how thoroughly an application has been tested. In industry, test coverage is often measured as statement coverage. A fundamental problem of software testing is how to achieve higher statement coverage faster. It is a difficult problem, since it requires testers to find input data that steers execution, as early as possible, toward sections of application code that contain more statements.
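To make this concrete, consider a minimal, hypothetical Java method (our illustration, not code from the paper or its benchmarks): most of its statements sit behind a single branch, so a test generator covers them quickly only if it finds inputs that satisfy the branch condition.

    // Illustrative sketch only: most of this method's statements are
    // behind the total > 1000 branch, so inputs that satisfy it yield
    // far higher statement coverage per run.
    public class CoverageExample {
        static String classify(int total) {
            if (total > 1000) {                   // statement-dense branch
                int discount = total / 10;
                int tax = (total - discount) / 20;
                int net = total - discount + tax;
                return "large:" + net;
            }
            return "small:" + total;              // single statement
        }

        public static void main(String[] args) {
            System.out.println(classify(50));     // covers 2 of 6 statements
            System.out.println(classify(5000));   // covers 5 of 6 statements
        }
    }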
We created CarFast, a novel, fully automatic approach for aChieving higher stAtement coveRage FASTer, which we implemented and evaluated on twelve Java applications whose sizes range from 300 LOC to one million LOC. We compared CarFast with several popular test-case generation techniques, including pure random testing, adaptive random testing, and Directed Automated Random Testing (DART). Our results indicate, with strong statistical significance, that when execution time is measured as the number of runs of the application on different input test data, CarFast outperforms the evaluated competing approaches on most subject applications.
Publication
Sangmin Park, B. M. Mainul Hossain, Ishtiaque Hussain, Christoph Csallner, Mark Grechanik, Kunal Taneja, Chen Fu, and Qing Xie. CarFast: Achieving Higher Statement Coverage Faster. In Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering (FSE 2012).
Experiment
We performed large-scale experiments on Amazon EC2 instances with the following configuration (m1.large): 7.5 GB RAM, 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each), and 35 GB of instance storage. For each of the 30 runs of each experiment with each application under test (AUT), we ran the experiment with a 24-hour time limit (chosen experimentally) to establish what coverage can be achieved for that AUT. In total, the execution time is 1,440 × 24 = 34,560 machine-hours. At a cost of $0.48 per instance per hour (as of March 2012), the estimated cost of this experiment is USD 16,500; the actual cost was USD 30,000.
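A quick back-of-the-envelope check of these figures (a sketch; the breakdown of the 1,440 runs into 30 runs × 12 AUTs × 4 techniques is our assumption, since the text states only the total):

    // Sanity check of the cost figures above; all values come from the text
    // except the run breakdown, which is an assumption.
    public class Ec2CostCheck {
        public static void main(String[] args) {
            int totalRuns = 30 * 12 * 4;        // 1,440 experiment runs (assumed breakdown)
            int hoursPerRun = 24;               // per-run time limit
            double dollarsPerHour = 0.48;       // m1.large price, March 2012

            int machineHours = totalRuns * hoursPerRun;        // 34,560 machine-hours
            double estimate = machineHours * dollarsPerHour;   // ~$16,589, i.e. roughly USD 16,500

            System.out.printf("%,d machine-hours, ~$%,.0f%n", machineHours, estimate);
        }
    }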
We used 12 Java programs whose sizes range from 300 LOC to over 1 MLOC. The characteristics of these benchmarks are summarized in the following table.
The following table summarizes the results of the experiments. The results show that CarFast outperforms the evaluated competing approaches on most subject applications; see the paper for details.
Here is the list of programs we used for our experiments.
1. CarFast (Mandatory. Benchmark programs are included.)
2. Query Evaluator (Mandatory)
3. Dsc/Dumper mode (Optional. The binary is included in CarFast.)
4. Static Coverage Estimator (Optional. The result files are included in CarFast.)
5. RUGRAT (Optional. The generated Java programs are included in CarFast.)
Acknowledgments
This material is based upon work supported by the National Science Foundation under Grants No. 0916139, 1017633, 1217928, 1017305, and 1117369, as well as Accenture. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
People
CarFast is a collaboration among Georgia Tech, the University of Texas at Arlington, the University of Illinois at Chicago, North Carolina State University, and Accenture Technology Labs.
Sangmin Park
E-mail: sangmin.park [at] gmail [dot] com
Affiliation: Georgia Tech

Ishtiaque Hussain
E-mail: ishtiaque.hussain [at] mavs [dot] uta [dot] edu [discontinued]
Affiliation: University of Texas at Arlington

Christoph Csallner
E-mail: csallner [at] uta [dot] edu
Affiliation: University of Texas at Arlington

B. M. Mainul Hossain
E-mail: bhossa2 [at] uic [dot] edu
Affiliation: University of Illinois at Chicago

Kunal Taneja
E-mail: ktaneja [at] ncsu [dot] edu
Affiliation: North Carolina State University

Mark Grechanik
E-mail: drmark [at] uic [dot] edu
Affiliation: Accenture Technology Labs and University of Illinois at Chicago

Chen Fu
E-mail: chen.fu [at] accenture [dot] com
Affiliation: Accenture Technology Labs

Qing Xie
E-mail: qing.xie [at] accenture [dot] com
Affiliation: Accenture Technology Labs