FSE 2015 replication package

Welcome

Welcome to the homepage of the replication package for the experiments reported in our paper:

P. Braione, G. Denaro and M. Pezzè. Symbolic Execution of Programs with Heap Inputs. In Proceedings of the 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2015), pp 602-613.

The package provides an environment that allows to replicate the experiments with a (hopefully) small setup effort. It contains binary and source code distributions of the experimental subjects, and most of the tools used to produce the experimental data. The supported platforms are Mac OS X, Linux and Windows.

Contacts

For any issue or suggestion please contact Pietro Braione.

Downloading

You can download the package here. It is a quite large (278 Mbytes) .zip file.

Setting up

The replication package does not contain all that you need: For matter of convenience or for licensing reasons we could not include everything, so you will need to do some upfront effort and manually fix the missing dependencies. You will need to install a working copy of Sicstus Prolog (you can obtain an evaluation copy at https://sicstus.sics.se/, we used an old 3.12.11 release, but the current version should yield the same results). Note that JBSE can use Z3 as a constraint solver instead of Sicstus, but all the experiments in the paper have been produced with Sicstus, and we ensure no faithful replication of results if you use a different solver. Consider however that using Z3 yields faster execution times and a simpler setup (see below for details). You will also need to install a working copy of the PALE and MONA tools (you will find PALE at http://www.brics.dk/PALE/ and MONA at http://www.brics.dk/mona/). We used PALE 1.0-9 and MONA 1.4-15 (note that it is not the version of MONA that is included with PALE, but the latest version at the MONA homepage). The PALE and MONA binaries must be installed in the same directory.

The package itself is in the form of an Eclipse workspace. You will need a working setup of Eclipse (we used the Eclipse Luna distribution, but any recent Eclipse distribution will suit). As you can see this distribution just contains this instructions file and a folder named FSE2015. Run Eclipse, then select File > Switch workspace... and point to the FSE2015 folder, wherever you want to put it. After bootstrap you will see an Eclipse workspace with five projects:

dataStructures: the project containing the subject programs for the P1-P4 experiment series;
tsafe: same for the P5 experiment series;
closure-37b: same for the P6 experiment series;
closure-72b: same for the P7 experiment series;
FSE2015: this project contains all the code necessary to start the experiments and gather the results.

More setup

Once done all the above you are not over with the setup. The FSE2015 project references the external tools so you will need to set some variables and project dependencies to make everything work.

JBSE interacts with Sicstus via the PrologBeans library that is included with every Sicstus distribution. You must set the build path of the FSE2015 project so it points to your local setup of PrologBeans. In the Eclipse Package Explorer right-click the FSE2015 project and then select Build Path > Configure Build Path..., then choose the Libraries tab. You will notice a (almost surely) broken reference to prologbeans.jar: Remove it and add the right one (click the Add External JARs... button). On Mac OSX and Linux systems the default setup directory of prologbeans.jar is /usr/local/lib/sicstus-<VERSION>/bin. Refer to the comprehensive Sicstus manual for more informations on how to locate prologbeans.jar on your system. If you choose to use Z3 instead, you do not need to configure for Sicstus: Just remove the broken reference to prologbeans.jar and ignore the resulting compilation errors.
The Alloy Analyzer needs a reference to a native SAT solver library (the default SAT4J is too slow given the size of the experiments). The replication package includes some of them, and the FSE2015 project is configured to point to the amd64-osx version, so you will have to fix this dependency in the case your operating system is different. As done for PrologBeans, open the Libraries tab of the Configure Build Path settings of the FSE2015 Eclipse project. Then, select alloy-dev.jar and open its drop-down list (click the small triangular arrow on the left of the jar icon). Then, click the Native Library Location row and the Edit... button. The native libraries are in the FSE2015/lib folder, and are named after the platform (there are binaries for Mac, Windows, Linux and FreeBSD).
There are some Java variables in the FSE2015 project that should reflect your setup. Open the FSE2015 project, then the source folder named "runner", and then edit the file runner/Settings.java. You will have to fix, at minimum, the SICSTUS_PATH (or Z3_PATH) and PALE_MONA_PATH variables. SICSTUS_PATH and Z3_PATH must point to the binary executable file of respectively Sicstus and Z3, while PALE_MONA_PATH must point to the folder (note: just the folder name!) containing the PALE and MONA binaries.

Running the experiments

To run the experiments the FSE2015 project offers a convenient runner class FSE2015.java class. It can be invoked from the command line or from Eclipse. The most convenient way to launch the experiments is from Eclipse, by using the (only) launch configuration available in the Eclipse workspace: Under Eclipse click the main menu Run > Run Configurations..., then select Java Application > FSE2015 on the left, click the Arguments tab on the right and under the Program arguments text box insert your favorite command line args as reported below (e.g., -f experiments/P7.txt to run the P7 series of experiments).

Note that symbolic execution is resource-hungry, both in terms of CPU cycles and in terms of memory consumption. My JVM (Hotspot) setting is -Xms512m -Xmx1024m, but your mileage may vary. If you want to change these setting in the Eclipse launch configuration click the main menu Run > Run Configurations..., select Java Application > FSE2015 on the left, click the Arguments tab on the right and under the VM arguments text box change the amount of memory specified for the -Xms and/or -Xmx options.

If you really want to launch the experiments from the command line, the syntax is:

java -Xms512m -Xmx1024m -Djava.library.path=<pathToSATSolver> -cp <classPath> runner.FSE2015 [-f <fileName>] [-t <nThreads>] [<nExperiment>]

where:

- <pathToSATSolver> must be the path to the native SAT solver library for your platform, as discussed in the "More setup" section above;
- <classPath> must contain all the jars in the lib/ directory of the FSE2015 project, the PrologBeans jar file (if you use Sicstus), and the bin/ directories of all the projects;
- <fileName> is the name of some experiments file; you will find them under the experiments/ folder of the FSE2015 project; by default the experiments/examples.txt experiments file will be launched;
- <nThreads> is the number of threads in the thread pool; to maximize throughput the FSE2015 experiment runner can use a thread pool and launch many experiments in parallel; by default it launches one experiment a time;
- <nExperiment> is a line number in the experiments file; if specified, the runner will execute only the experiment at that line number, if not specified all the lines in the experiments file will be processed.

The experiments file format

In the experiments folder of the FSE2015 project you will find some pre-canned experiments files. The files P1.txt to P7.txt can be used to reproduce the results of the experiments series P1 to P7 as reported in the paper. Beware! The times for reproducing a whole series of experiments are very long. For this reason we sorted the experiments in the files as to maximize throughput (from the fastest to the slowest), and we also offer an example.txt file with a small selection of experiments that are sufficiently fast to run. If you really want to run all the experiments be ready to wait about three-four days for completion.

Should you ever mangle with the experiments files, e.g. to finely control which experiments you want to launch, you should be aware of the experiments file format. Every row can be either a comment, when the row is started by a double dash (--) or an experiment in the form:

<subjectName>:<entryPoint>:<technique>

where:

- <subjectName> can be one of: doubly_linked_list, tree_map, avl_tree, node_caching_linked_list, tsafe, closure37b, closure72b for, respectively, what in the paper is called P1, P2, P3, P4, P5, P6 and P7;
- <entryPoint> is the name of the method that must be executed symbolically. For tsafe this is always TS_R_1, for closure37b is always traverse, for closure72b is always process. For the data structures subjects they have form test_<num>, where <num> is a suitable number - they are too many to be listed here, please refer to the settings files P1.txt to P4.txt for a complete list.
- <technique> is the name of a technique for representing preconditions and structural invariants of heap data structures. It can be one of: plainInit (no technique, indicated Lazy in the paper), hex, preconditioned (indicated PreP in the paper), consRepOk (indicated ConsP in the paper), alloy, pale.

By using the provided experiments files as templates you can build your own files with the experiments you prefer. For convenience you can also comment out whole regions of the experiments file by putting a /* mark at the beginning of a row, and a */ mark at the end of a row (only beginning and end - no sophisticated parsing, sorry).

Interpreting the output

Every experiment will symbolically execute a method in one of the subject programs, gather the results, and produce them in a single output text file. All the text files are located in the out folder of the FSE2015 project, and they are named after their corresponding line in the experiments file, <subjectName>.<entryPoint>.<technique>.txt. The typical content of one of these files will be something like:

This is the Java Bytecode Symbolic Executor's Run Tool (JBSE v.0.7).

Connecting to Sicstus at /usr/local/bin/.

Starting symbolic execution of method doubly_linked_list/TestDoublyLinkedList:()V:test_X at Mon Jun 08 14:50:57 CEST 2015.

Symbolic execution finished at Mon Jun 08 14:50:58 CEST 2015.

Analyzed states: 5640, Analyzed traces: 25, Safe: 19 (11 concretizable), Unsafe: 4 (2 concretizable), Out of scope: 1 (0 concretizable), Violating assumptions: 1.

Elapsed time: 887 msec, Average speed: 635 states/sec, Elapsed concretization time: 713 msec, (80,38% of total), Elapsed time in decision procedure: 42 msec (4,74% of total).

Let us concentrate on the numbers, and explain how they are related to the numbers reported in the paper's figures.

- The traces are either valid or invalid. Valid traces are marked as safe (no assertion violated) or unsafe (some assertion violated). Invalid traces are marked either out of scope (not enough computing resources to analyze them) or violating assumptions (an assumption violation detection can be injected at any point of a symbolic execution by means of instrumentation code).
- Traces are either concretizable or not. Concretizable traces are all the traces whose initial symbolic state does not contradict any of the preconditions and data structure invariants. Otherwise the trace is said to be not concretizable.
- The E' measure is the number of valid, not concretizable traces. In the previous example it is E' = (19 - 11) + (4 - 2) = 10. This measure is the one reported on the third column of Table II and in Figure 4.
- The E'' measure is the ratio between the number of valid concretizable traces of the technique, and the number of valid concretizable traces of the no technique (plainInit, Lazy) case. Its distribution is reported in Figure 5. In the above example, if the no technique case had the following result:

...

Analyzed states: 5640, Analyzed traces: 17, Safe: 15 (9 concretizable), Unsafe: 2 (1 concretizable), Out of scope: 0 (0 concretizable), Violating assumptions: 0.

...

- it would be E'' = (11 + 2) / (9 + 1) = 1.3.
- The E''' measure is obtained from the elapsed time minus the time of the concretization check, divided by the average time of the HEX technique. In the above example, if the HEX technique would take on average 100 msec to complete, it would be E''' = (887 - 713)/100 = 1,74. The distribution of the obtained E''' values is reported in Figure 6.
- Finally, the columns "#Traces", "#Alarms" and "#False alarms" of Table II are obtained, respectively, as the total number of valid traces (19 + 4 in the first example), the number of unsafe traces (4 in the first example) and the number of unsafe traces that are not concretizable (4 - 2 in the first example).

What is replicable and what isn't? Some advices

If we optimistically assume that no bugs have been introduced while making this replication package as user-friendly as possible, then only one of all the previously introduced measures will not be exactly replicable: E''', the execution time.

A small caveat. For maximum replicability we packed the projects with the precompiled binaries for the subject programs. If you want to recompile the subject programs take into account that there is a noticeable difference in the binaries produced by different Java compilers. If you recompile anything with a compiler different from the one we used (the Eclipse Java Compiler 4.4.1) the results may differ. For convenience we included a copy of the Eclipse Java Compiler 4.4.1 under the tools/ folder of the FSE2015 project.