To automatically determine whether the search space for a bug's corresponding patch is large (low-probability) or small (high-probability), we designed an algorithm. The pseudo code is as follows.
Algorithm - Evaluate Patch Probability
Input: b: buggy code, g: ground-truth
Output: 1: patch with low probability, 0: patch with high probability
1: procedure EvaluatePatchProbability(b, g)
2: repairType ← GetRepairType(b, g) // Determine the type of repair (add new lines or modify existing lines)
3: if repairType is 'add new lines' then
4: return 1 // Return 1 if repair type is 'add new lines'
5: modifiedLinesCount ← CountModifiedLines(b, g) // Count the number of modified lines
6: if modifiedLinesCount > 1 then
7: return 1 // Return 1 if more than one line is modified
8: s ← CalculateProbabilityScore(b_line, g_line) // Compute the correctly repair probability from the buggy line to the patch line
9: if s >= threshold then
10: return 0 // Return 0 if probability score meets or exceeds threshold
11: return 1 // Return 1 if probability score is below threshold
This algorithm receives buggy code (b) and its corresponding ground-truth (g) as initial input. At first, it determines whether the repair is adding new code lines or modifying existing code lines(Line 2). The patch will be considered to have low probability if repair is conducted by adding new lines(Line 4), and modifying more than one buggy line also has low probability(Line 7). Then, calculating the probability score between the buggy line and the patch line(Line 8). If the score is larger than or equals to the threshold, the patch is thought to have high probability.
Here are the steps to calculate probability score:
1. Clearly identify the changes between the two code lines, including both operators and identifiers (with particular attention to class names, variable names, and function names).
2. Calculate probability score for each changed part separately. For operators, if x > y is a patch for the buggy line x < y, the patching space might include operators such as <, ==, !=, >=, and <=. In this simplified example, the probability score might be 1/5. As for identifiers, we can use the similarity score (cosine similarity from the embeddings) between the two different identifiers to represent the probability score. The larger similarity score is, the smaller search space will be, thus leading to a larger probability score.
3. Calculate the overall probability score, we use certain metrics to measure the differences between parts among the two code lines, e.g., edit distance d, and the overall probability score is represented as the sum of score (distance weight multiples probability score) for each change.