LEOPARD -- An attack surface identification tool
Introduction
LEOPARD is a generic, lightweight and extensible framework to identify attack surfaces at the function level through program metrics.
It does not require any user inputs or any prior knowledge about known vulnerabilities, and does not need to compile the target application and thus supports partial code.
In particular, LEOPARD works in two steps by combining two sets of systematically derived metrics : complexity metrics and vulnerability metrics.
- First, it uses complexity metrics to group the functions into a set of bins.
- Second, it leverages vulnerability metrics to rank the functions in each bin, and identifies the top ones as potentially vulnerable.
Experimental results on 11 real-life projects have demonstrated that LEOPARD can cover 74.0% of vulnerabilities by identifying 20% of functions as vulnerable. Based on the identified vulnerable functions in current stable release of MJS, XED, radare2, FFmepg and PHP, we discovered 22 new bugs, among which 8 are zero-day vulnerabilities.
Ground Truth Sample -- Libav 11.8
The list of CVEs reported by July 2018 to version 11.8 of Libav is available here.
For all 14 CVEs, we succeeded to collect patches for 6 of them : CVE-2017-16803, CVE-2017-9051, CVE-2016-9821, CVE-2016-9820, CVE-2016-9819, CVE-2016-8675.
You can download the patches from here.
CVEs Discovered with the Help of LEOPARD
PHP 5.6.30 (6 CVEs) : CVE-2017-9224, CVE-2017-9225, CVE-2017-9226, CVE-2017-9227, CVE-2017-9228, CVE-2017-9229
FFmpeg 3.1.3 (1 CVE) : CVE-2018-4394
radare2 2.7.0 (1 CVE) : CVE-2018-14017
Cumulative Plot for the Distribution of Vulnerable and Overall Functions
In the following figures, x-axis is the complexity score, y axis is the cumulative portion.
From the plots we can see, the cumulative line of overall functions increases sharply in the beginning where bins are with smaller complexity functions. But the increase of vulnerable functions happens with bins with relatively larger complexity scores. Sparsity of bins with larger complexity-values benefits the selection of most vulnerable functions. However, there is a large set of functions falling into the bin with low complexity scores. To distinguish these functions, our vulnerability metrics help to identify the potential vulnerable functions in the bins with smaller complexity-values by selecting the functions with high vulnerability scores. Hence the complexity metrics and vulnerability metrics play complementary roles to find the potential vulnerable functions.
For vulnerable functions, they are spread around different bins with a relatively high vulnerability values. This shows that binning and ranking strategy can indeed help to identify the vulnerable functions with different complexity.
BIND
Binutils
FFmpeg
FreeType
Libav
LibTIFF
libxslt
Linux
OpenSSL
SQLite
Wireshark
p-values for each vulnerability metrics with statistics from 11 benchmark
For each vulnerability metrics, we calculate its average value separately of overall functions and vulnerable functions, and show the ratio in the following table. The average value for vulnerable functions are always larger, which indicates they can lift the ranking of vulnerable functions. We also present the average value of all 11 projects, which ranges from 1.5 to 7.3.
For each vulnerability metrics, as there are 11 projects, we get two lists of 11 values with average values. We use p-value to show one sequence is statistically larger than the other one. p-value is a standard measure used to quantify the idea of statistical significance of evidence, specifically with p<0.05. Obviously, the p-value for each vulnerability metrics is less than 0.05.
Initial Candidate Vulnerability Metrics Examples
We build our vulnerability metrics by extracting the common semantic abstraction from concrete vulnerability cases. We started from a comprehensive list with the intention of the completeness, and examples in initial candidate list are given below. In the end, we chose a subset of them based on the accuracy of the experimental results and reported in our paper.
Also, we find vulnerability metrics for too specific vulnerabilities types work no so well in our task. Once obtaining vulnerability type information about the ground truth, we can also test how B&R works on such task to identify specific vulnerabilities.
Metrics inspired by off-by-one vulnerabilities:
#of constants involved in loop predicates
# of constants involved in control predicates
Metrics inspired by array related buffer overread/overwrite:
- # of array variables
- # of array variables involved in loop predicates
- of array variables involved in control predicates
- # of array Variable
others sensitive operations:
- # of shift operation
- # of bit Operation
- # variables involved in shift operations
- # variables involved in bit operations
Complementary Materials for RQ1: Rationality of Binning before Ranking
BIND
Binutils
FFmpeg
FreeType
Libav
LibTIFF
libxslt
Linux
OpenSSL
SQLite
Wireshark
Complementary Materials for RQ2: Effectiveness of Binning-and-Ranking
Average recall when identifying different percentage of functions as vulnerable
- Best performance are highlighted with bold font and records missing in the manuscript are highlighted with red color.
LEOPARD vs Random Forest
BIND
Binutils
FFmpeg
FreeType
Libav
LibTIFF
libxslt
Linux
OpenSSL
SQLite
Wireshark
LEOPARD vs Gradient Boosting
BIND
Binutils
FFmpeg
FreeType
Libav
LibTIFF
libxslt
Linux
OpenSSL
SQLite
Wireshark
LEOPARD vs Naive Bayes
BIND
Binutils
FFmpeg
FreeType
Libav
LibTIFF
libxslt
Linux
OpenSSL
SQLite
Wireshark
LEOPARD vs SVC
BIND
Binutils
FFmpeg
FreeType
Libav
LibTIFF
libxslt
Linux
OpenSSL
SQLite
Wireshark
Complementary Materials for RQ4: Scalability of LEOPARD
Complementary Materials for RQ5: Application of LEOPARD
Some queries used to calculate complexity and vulnerability metrics
Cyclomatic metrics
To get CFG edge information:
queryNodeIndex("functionId:(id1 id2 idN) AND isCFGNode:True").as("x").outE("FLOWS_TO").id.as("y").select{it.functionId}{it}.groupBy{it[0]}{it[1]}{it.size()}.cap
To get CFG Node information:
queryNodeIndex("functionId:(id1 id2 idN) AND isCFGNode:True").functionId.groupCount().cap
Number of loops
To get loop statements:
queryNodeIndex("functionId:(id1 id2 idN) AND type:(ForStatement WhileStatement DoStatement)").groupBy{it.functionId}{it.id}.cap
Number of nested loops/Max nesting loops
To get loop statements:
queryNodeIndex("functionId:(id1 id2 idN) AND type:(ForStatement WhileStatement DoStatement)").groupBy{it.functionId}{it.id}.cap
To get loop statements' AST children nodes:
idListToNodes([id1, id2, idN]).as("x").astNodes().as("y").select{it.id}{it.id}.groupBy{it[0]}{it[1]}.cap
Parameter variables
queryNodeIndex("functionId:(id1 id2 idN) AND type:Parameter").as("x").children().as("y").select{it.id}{it}
Number of pointer arithmetic & variables involved in pointer arithmetic
To get number of struct pointer member access operations(like a->b):
queryNodeIndex("functionId:(id1 id2 idN) AND type:PtrMemberAccess").functionId.groupCount().cap
To get variables in struct pointer member access operations:
queryNodeIndex("functionId:(id1 id2 idN) AND type:PtrMemberAccess").out("USE").groupBy{it.functionId}{it.code}.cap
To get address operations(like &a) and their variables involved:
queryNodeIndex("functionId:(id1 id2 idN) AND type:UnaryOp").as("op").children().filter{it.type=="UnaryOperator" && it.code=="&"}.back("op").astNodes().filter{it.type=="Identifier"}.code.as("sys").select
To get number of pointer dereference operations(like *a):
queryNodeIndex("functionId:(id1 id2 idN) AND type:UnaryOperator").filter{it.code=="*"}.functionId.groupCount().cap
To get variables in pointer dereference operations:
queryNodeIndex("functionId:(id1 id2 idN) AND type:UnaryOp").as("here").children().filter{it.type=="UnaryOperator" && it.code=="*"}.back("here").out("USE").groupBy{it.functionId}{it.code}.cap
To get relation expressions and their variables involved(to filter ones include pointer variables):
queryNodeIndex("functionId:(id1 id2 idN) AND type:(RelationalExpression EqualityExpression)").as("x").children().filter{it.type=="Identifier"}.code.as("y").select
To get additive expressions and their variables involved(to filter ones include pointer variables):
queryNodeIndex("functionId:(id1 id2 idN) AND type:AdditiveExpression").as("x").children().filter{it.type=="Identifier"}.code.as("y").select
To get increment/Decrement operations(to filter ones include pointer variables):
queryNodeIndex("functionId:(id1 id2 idN) AND type:(UnaryExpression IncDecOp)").groupBy{it.functionId}{it.code}.cap
Number of nested control structures/Max control structures
To get control structures statements:
OR(queryNodeIndex("functionId:(id1 id2 idN) AND type:Condition").in("IS_AST_PARENT"),queryNodeIndex("functionId:%s AND type:ForStatement")).dedup().groupBy{it.functionId}{it.id}.cap
To get control structures statements' AST children nodes:
idListToNodes([id1, id2, idN]).as("x").astNodes().as("y").select{it.id}{it.id}.groupBy{it[0]}{it[1]}.cap
Max control-dependent control structures
To get control structures nodes' ids:
getNodesWithType("Condition").id
To get CDG information to know whether the control structures are control dependent:
queryNodeIndex("functionId:(id1 id2 idN) AND isCFGNode:True").as("x").outE("CONTROLS").as("y").select{it}{it}
Max data-dependent control structures:
To get control structures nodes' ids:
getNodesWithType("Condition").id
To get DDG information to know whether the control structures are Involved with the same var:
queryNodeIndex("functionId:(id1 id2 idN) AND isCFGNode:True").as("x").outE("REACHES").as("y").select{it}{it}
To get var declare statements information to identify those pointer vars that share the same memory:
queryNodeIndex("functionId:(id1 id2 idN) AND type:IdentifierDeclStatement").as("x").uses().code.as("y").select{it}{it} & queryNodeIndex("functionId:(id1 id2 idN) AND type:IdentifierDeclStatement").as("x").defines().code.as("y").select
If statements without else
queryNodeIndex("functionId:(id1 id2 idN) AND type:Condition").parents().filter{it.type=="IfStatement"}.as("x").ithChildren('0').as("y").select{it}{it}
Variables involved in control predicates
To get control structures nodes' ids:
getNodesWithType("Condition").id
To get DDG information to know which var are used by the control structures:
queryNodeIndex("functionId:(id1 id2 idN) AND isCFGNode:True").as("x").outE("REACHES").as("y").select{it}{it}