ICEBAR is a technique and tool that builds upon ARepair, a test-suite based repair tool for Alloy. ICEBAR receives a faulty Alloy specification with a property-based oracle, and uses Alloy’s counterexamples to feed ARepair, and produce a repair candidate. ICEBAR includes different mechanisms, with different degrees of reliability, to generate counterexamples from predicates and assertions.
Overview of the technique
The technique starts from a given faulty specification (with property-based oracles included) and a test suite, and runs ARepair to attempt to produce a patch. If a patch is found, we know that is passes the test suite; we take this candidate and contrast it against the property-based oracle in the specification. If, again, the specification meets this oracle, we consider the patch a proper fix. If, on the other hand, some property fails in the candidate patch, we use Alloy to produce instances (assertion counterexamples, violations to predicates), which are in turn translated into new unit tests, to complement the original test suite and start over the ARepair process. Eventually, a fix is found, that passes the corresponding test suite and property-based oracle, and is returned to the user.
Experimental evaluation
The technique's evaluation is centered around three main research questions:
Does ICEBAR improve the repair effectiveness of ARepair?
Are the tests produced by ICEBAR relevant to the repair process?
How does ICEBAR compare to other Alloy repair tools?
To answer these research questions we consider two different benchmarks of faulty Alloy specifications: the Alloy4Fun project, and the benchmark of faulty specifications originally used to evaluate ARepair. The Alloy4Fun benchmark consists of 6 different template models, with a total of 1936 human-written fault variants, based on specific modeling assignments resolved by different students. The ARepair benchmark, on the other hand, is composed of 38 different faulty specifications drawn from various domains. For each of these models, we consider a corresponding correct version as oracle.
Detailed information of the evaluation is shown here.
In order to replicate the experiments, detailed instructions can be found here.