Submission for ISSTA

Submission for ISSTA 2020 (Data and Code)

1. Figure 8 Readability

Figure 8 in the paper may be too small to read, so a larger version of it is presented here.

2. Performance Comparison with Other Works

Comparison with BinGO-E:

The test data consists of all vulnerable functions in OpenSSL versions 1.0.1a through 1.0.1u. On this data, BinXray performs better than BinGO-E.

Comparison with FIBER:

The test data consists of the Android kernel vulnerabilities selected by FIBER. In this experiment, BinXray performs much better than FIBER.

3. IoT Firmware Experiments

We conducted the experiments on different IoT firmware. In total, BinXray correctly identifies 49 vulnerabilities that have been patched and 48 vulnerabilities that still remain in the firmware. The average accuracy is 81.5%.

We have listed the detailed results of the experiments.

In the tables,

Vulnerable means that the firmware contains the vulnerability, and BinXray manages to identify it.

Patched means that the firmware has fixed the vulnerability, and BinXray manages to find the function and predict it as patched.

FP means that the firmware has fixed the vulnerability, but BinXray reports the fixed vulnerability as vulnerable.

FN means that the firmware contains the vulnerability, but BinXray reports the vulnerability as patched.
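The four labels above can be turned into an accuracy figure as sketched below. This is only an illustration: the page does not state exactly how the reported average accuracy is computed, so the sketch assumes that accuracy for one firmware image is the share of Vulnerable and Patched outcomes among all four outcomes. The function name `accuracy` and the sample outcomes are made up for the example and are not data from the experiments.

```python
from collections import Counter

# Outcome labels as defined for the result tables.
CORRECT = {"Vulnerable", "Patched"}  # BinXray's prediction matches the firmware
WRONG = {"FP", "FN"}                 # BinXray's prediction does not match


def accuracy(outcomes):
    """Return the fraction of outcomes labelled Vulnerable or Patched.

    Assumption: accuracy = (Vulnerable + Patched) / all outcomes.
    """
    counts = Counter(outcomes)
    unknown = set(counts) - CORRECT - WRONG
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no outcomes given")
    correct = sum(counts[label] for label in CORRECT)
    return correct / total


# Made-up outcomes for one hypothetical firmware image.
example = ["Vulnerable", "Patched", "Patched", "FP",
           "FN", "Vulnerable", "Patched", "Vulnerable"]
print(f"accuracy = {accuracy(example):.3f}")
```

If the average is taken per firmware image, the same function could be applied to each image's outcomes and the results averaged.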

Data & Experiments:

The data for the experiments can be downloaded from: https://drive.google.com/open?id=1fvOilzv7MRivgGHqOaj6lwy6F4x2Sjp9

The code for the experiments can be downloaded from: https://drive.google.com/open?id=1G4XZ-8sEVBRUryyA2PcRcxy2KGvSVSiS

Note: IDA Pro must be installed to run the framework.

Here are the detailed steps to run the code and reproduce the experiments.

0. Set up the config files:

Config files:

1. _config.csv: The file has four parts: the CVE ID, the last vulnerable version, the first patched version, and the involved functions in order.

2. _func.csv: All functions involved in _config.csv.

3. _version.csv: All binary versions to be analyzed.

4. gt.csv: Optional. You can mark functions in the target binary with V and P as ground truth.

Configure these files first.
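As a rough illustration of the _config.csv layout, the sketch below reads such a file into records. It rests on assumptions that the page does not confirm: the file is comma-separated, has no header row, and lists the involved functions in the fourth and any following columns. The names `CveEntry` and `parse_config`, and the sample rows, are hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class CveEntry:
    cve_id: str
    last_vulnerable: str   # last vulnerable version
    first_patched: str     # first patched version
    functions: List[str]   # involved functions, in order


def parse_config(handle):
    """Parse rows of: CVE id, last vulnerable, first patched, functions..."""
    entries = []
    for row in csv.reader(handle):
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # skip blank lines
        if len(cells) < 4:
            raise ValueError(f"expected at least 4 fields, got {cells!r}")
        functions = [cell for cell in cells[3:] if cell]
        entries.append(CveEntry(cells[0], cells[1], cells[2], functions))
    return entries


# Made-up rows that only mimic the assumed layout.
SAMPLE = (
    "CVE-0000-0001,1.0.1a,1.0.1b,func_a,func_b\n"
    "CVE-0000-0002,1.0.1c,1.0.1d,func_c\n"
)
entries = parse_config(io.StringIO(SAMPLE))
for entry in entries:
    print(entry.cve_id, entry.last_vulnerable, entry.first_patched, entry.functions)
```

Reading the file this way before running the scripts is one way to catch rows with missing fields early.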

If you have IDA Pro and different binaries, you can extract the features and test on any binaries you want. Please follow steps 1 to 3.1.

If you don't have IDA Pro, we have provided some processed data. You can jump to step 4.

1. Copy bclass.py into path_of_IDA/Python.

2. Edit extract.py: change the first parameter on line 13 to the path of _func.csv, and change the path on line 128 to the path where you want to store the pkl files. If the binary does not include debug information, you also need to find the addresses of the functions in the binary, store them in a txt file using a format like function name : function address, and change the path on line 137.

3. Edit run.py: change the paths on lines 14 and 25 to the path of the binary folder, change the path on line 28 to the path of extract.py, and change the parameters on line 31 to the path of the IDA exe file and the path where the log file is generated.

3.1 Then run run.py to extract the function information from the binaries. The information will be dumped as pickle files.

4. Edit compare.py: change the path on line 26 to the path of the pkl folder, change the paths on lines 922 and 1287 to the path of _config.csv, and change the path on line 994 to the path of _version.csv. If you have gt.csv, set its path in the function calculate() and call this function in '__main__'. Finally, change line 1282 to the path of the json file that is generated to store the final results, and change the second parameter on line 1283 to the number of versions in _version.csv.

4.1 Then run compare.py. The results will be stored in the json file, and the statistical results are shown in the terminal.
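For binaries without debug information, step 2 asks for a txt file in a format like function name : function address. The sketch below shows one way such a file could be checked before running the extraction. It assumes one entry per line and hexadecimal addresses (with or without a 0x prefix); neither detail is stated on this page. The helper `parse_address_lines` and the sample entries are hypothetical and are not part of the released scripts.

```python
def parse_address_lines(lines):
    """Map function names to integer addresses.

    Assumptions: one 'name : address' entry per line, address in hexadecimal.
    """
    addresses = {}
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        # Split on the last colon so that names containing ':' stay intact.
        name, sep, addr = line.rpartition(":")
        name, addr = name.strip(), addr.strip()
        if not sep or not name or not addr:
            raise ValueError(f"line {number}: expected 'name : address', got {raw!r}")
        addresses[name] = int(addr, 16)
    return addresses


# Made-up entries that only mimic the assumed format.
sample = ["func_a : 0x401000", "func_b : 4010f0", ""]
table = parse_address_lines(sample)
for name, addr in table.items():
    print(f"{name} -> {addr:#x}")
```

To check a real file, the lines can be read with `open(path).read().splitlines()` and passed to the same function.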
