The basic idea of SRC extraction is to statically iterate the loop to compare the symbolic states. The key method is to find a revisit state within a lasso. However, there could be an infinite number of lassos during the symbolic execution. To address this challenge, we adopted two strategies: path summarization and state pruning.
Path Summarization: Given a loop path, if it is an inductive path, we make a summarization for this path. Otherwise, if it is non-inductive, we perform a standard symbolic execution that does not involve path summary. It updates the current trace condition to reflect the transition from executing the current path k times to the next path, and then checks the satisfiability of the updated trace condition.
State Pruning: To prevent state explosion during the symbolic execution, caused by the cyclic execution between multiple path, we limit the number of iteration for each cycle of execution by only considering one cycle of execution at a time. Specifically, if symbolic execution encountered a state which did not visited before, it create a new symboic state for this path. Otherwise, for the visited path, we do note make new symbolic states.
SRC Inference: If a lasso is found during the execution, we infers a sufficient condition for the non-termination of the identified lasso. We calculate the SRC by ensuring that the values of the variables within the lasso, i.e., X=X'. Then, we eliminate the existentially quantified variable to get SRC.
As shown in the following program, we mainly focus on the loops in a program, even the loop is with a complext context. For each loop, we analysis its loop paths and make summary for the inductive path.
Program:
CFG:
Paths:
Path Summaries:
During the symbolic execution, the lassos can be detected as follows. Note that, our state pruning strategy limits the number of iterations for each cycle of execution by only considering one cycle of execution at a time. By this approach, the infinite lassos like τ1 → τ2 → τ1 → τ2 → ... are forbidden. The feasible lassos are listed as follows.
Lassos:
For each lassos, we can infer out a sufficient precondition by the path conditions, path transitions and state revisit. Then, we use a smt solver to check the satisfiabilites. After quantifier elimination (QE), we can eventually get the SRC, i.e., i= {49, 50}.
SRC Inference:
To enhance our SRC based approach and also to make it more scalable in non-linear loop, we provide several optimization strategies, including stuck path detection, non-inductive path summarization, and over flow handling.
Stuck paths are paths that cannot transit to any other paths, and they may not cause state revisits in a finite execution. If the loop enters a stuck path, it does not terminate. To detect stuck paths, we check whether the path condition θτ can always be satisfied for any number of iterations using a universal quantifier k, as shown below:
A non-inductive Example for stuck path
The following program includes a non-linear infinite loop, which is from the OSS_Bench. We can find that the path (a, b, c, e, g, a), which marked in red, cannot trigger any variable changes. Therefore, it is a stuck path and the execution cannot escape from it. To identify the reachability of this path, we adopt the post condition of the path as shown in the following figure. The boolean variable r is the recorder of the post conditon. If r is true, the stuck path has been reached and a non-termination is found.
Program
CFG
Instrumentation
An inductive Example for stuck path
In this example, the variable x is increased by 1, , while the path condition x > 0 can terminate if the value of x surpasses the integer's maximum limit and causes overflow. Nonetheless, most of the existing verification tools report "non-termination" because they do not consider overflow. Therefore, in static analysis, EndWatch also provide the mode which do not consider overflow, and take it as a stuck path by checking the corresponding path cannot transit to any other paths.
Bit shifting is a non-linear operation, but in certain situations, it can result in a variable that is unchanged or has an upper/lower bound. These conditions can be used to calculate the path summary. For example, if the initial variable 𝑥 is 0, no matter how many times it is shifted left or right, it will always be 0. If a positive value is shifted right iteratively, it will eventually become 0. If a negative value is shifted right iteratively, it will eventually become -1. For instance, in the left loop, if the initial value of x is positive or zero, the summary on x will be x′ = 0, and if its initial value is negative, the summary on 𝑥 will be x′ = -1. Therefore, by assigning x as x >= 0, we can find the loop is non-termination, and the corresponding SRC is x >= 0.
Function calls can be difficult to summarize, but some APIs have specific return values that can be used as the summary. For example, the read function is often used in loops to read the contents of a file. Although its return values are not linear, it is known that the read function will return 0 if it has reached the end of the file (EOF), especially when the function is invoked in a loop. In the left loop, the summary on ret, count, and buf could be ret'=0, count'-= 0∗𝑘, and buf' += 0∗𝑘, as read will eventually return 0 after it reaches EOF. Finally, we can get the SRC, i.e., {ret=0}.
Overflow happens when a value exceeds the maximum or minimum value that can be represented by its data type, and is then wrapped around to the other end of the range. Existing techniques usually assume that overflow does not occur, which can lead to incorrect conclusions in real-world program execution. Therefore, in endwatch, overflow is optional. Overflow handling is incorporated into the calculation of path summary. Consider the left loop, the summary without considering overflow is n′ = n − 16𝑘. However, when considering overflow, the summary becomes n′ = (n−16 𝑘) mod 2^16, which takes into account the possibility of x overflowing and wrapping around.