We present in detail 74 error patterns summarized from DeepFix, TRACER, and StackOverflow. We then provide a further section showing that our constructed dataset is more diverse than DrRepair's.
Through an in-depth analysis of the data from DeepFix, TRACER, and StackOverflow, carried out by four experts over six months, we obtained 74 errors in total. We further categorized them into syntax errors and semantic errors according to the compilation process. Syntax errors are further divided into “Structure” errors and “Statement” errors, based on the scope of influence of the error. Semantic errors, which concern violations of program semantics, are further divided into “Variable declaration”, “Type mismatch”, and “Identifier misuse”. The detailed descriptions of the 74 compilation errors are listed below, together with the corresponding GCC error message for each error pattern.
Syntax error denotes errors that are mostly caused by grammatical issues. We classify them into “Structure” and “Statement” according to the scope of influence of the error.
Structure error (struct) denotes missing punctuator(s) (e.g., “{”, “}”, “;”) in a statement or a block.
Statement error (stmt) is caused by mistaken tokens in a labeled statement, expression statement, selection statement, or iteration statement.
Semantic error denotes errors that violate the semantics of a program.
Variable declaration (decl) denotes the use of a variable before it is declared.
Type mismatch (tm) denotes a mismatch in the type or the number of the formal parameters of a function.
Identifier misuse (im) denotes performing a wrong operation on an identifier.
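The taxonomy above can be sketched as a small lookup table. The C snippets below are hypothetical illustrations chosen according to our reading of the category definitions; they are not taken from the 74 patterns, and the names `TAXONOMY`, `kind_of`, and `gcc_diagnostics` are assumptions introduced only for this sketch. The optional helper asks a locally installed GCC for its diagnostics and returns nothing when GCC is not available.

```python
import shutil
import subprocess

# Category abbreviation -> (kind, hypothetical erroneous C snippet).
# The snippets are illustrative only, not drawn from the 74 patterns.
TAXONOMY = {
    # Missing ';' after the declaration.
    "struct": ("syntax", "int main() { int a = 1 return a; }"),
    # 'else' with no preceding 'if' in a selection statement.
    "stmt": ("syntax", "int main() { int a = 1; else { a = 2; } return a; }"),
    # 'x' is used without ever being declared.
    "decl": ("semantic", "int main() { x = 1; return x; }"),
    # 'f' expects two arguments but is called with one.
    "tm": ("semantic",
           "int f(int a, int b) { return a + b; } int main() { return f(1); }"),
    # An int variable is called as if it were a function.
    "im": ("semantic", "int main() { int a = 1; return a(); }"),
}


def kind_of(category):
    """Return 'syntax' or 'semantic' for a category abbreviation."""
    return TAXONOMY[category][0]


def gcc_diagnostics(source):
    """Return GCC's diagnostics for a C source string, or None if gcc is absent."""
    if shutil.which("gcc") is None:
        return None
    result = subprocess.run(
        ["gcc", "-fsyntax-only", "-x", "c", "-"],  # read C source from stdin
        input=source,
        capture_output=True,
        text=True,
    )
    return result.stderr
```

Iterating over `TAXONOMY.items()` and passing each snippet to `gcc_diagnostics` prints the compiler message that each illustrative error triggers on the local GCC version.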
To analyze the diversity of the generated datasets provided by DrRepair and TransRepair, we conduct a statistical analysis over the five categories (i.e., struct, stmt, decl, tm, and im) and obtain a box plot for each dataset, as follows:
( a ) DrRepair
( b ) TransRepair
In Figure (a), DrRepair has nine outliers, the highest with nearly 500,000 samples, whereas in Figure (b) TransRepair has only five outliers, the highest with 270,000 samples. The outliers in DrRepair account for about 76.37% of its dataset (1,870,750 samples in total), whereas those in TransRepair account for only about 41.53% (1,821,275 samples in total). Furthermore, in the dataset used by DrRepair, the numbers of samples for decl and tm are far lower than those for the other categories. In contrast, the numbers of samples across categories in the dataset provided by TransRepair are more balanced. Hence, we conclude that our constructed dataset is more diverse than DrRepair's.
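The outlier counts and their share of the dataset can be computed along the following lines. This is a minimal sketch under two assumptions that the text does not state: each box-plot point is the number of samples of one error pattern, and outliers are identified with the conventional 1.5 × IQR whisker rule. The function name `outlier_share` and the input counts are hypothetical.

```python
import statistics


def outlier_share(counts, whisker=1.5):
    """Return (outliers, share of all samples held by the outliers).

    Assumes `counts` holds per-pattern sample counts and uses the
    conventional box-plot rule: points beyond whisker * IQR from the
    quartiles are outliers.
    """
    q1, _, q3 = statistics.quantiles(counts, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    outliers = [c for c in counts if c < low or c > high]
    return outliers, sum(outliers) / sum(counts)


# Hypothetical per-pattern counts; not the DrRepair or TransRepair data.
counts = [10, 12, 11, 13, 12, 11, 10, 100]
outliers, share = outlier_share(counts)
print(outliers, f"{share:.2%}")
```

A larger share means that a few error patterns dominate the dataset, which is the sense in which the percentages above are read as a measure of imbalance.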