We grouped the 204 programs in our dataset into five categories according to the control-flow structures they contain. Of these, 37 are sequential (i.e., no branches or loops), 77 are loop-free but branched (i.e., they contain if-else or switch statements), 36 have single-path loops (i.e., the loop body is sequential), 34 have multi-path loops (i.e., the loop body is branched), and 20 have nested loops. The performance of each model on all tasks, broken down by program control-flow structure, is listed below.
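To make the five categories concrete, the following is a minimal sketch of how such a categorization could be automated. It is an illustration only, not the procedure used to label the dataset: it operates on Python source via the standard `ast` module as a stand-in for the benchmark programs, and the function name `classify`, the category labels, and the precedence among categories (nested loops first, then multi-path, then single-path) are all assumptions.

```python
import ast

LOOPS = (ast.For, ast.While)


def _descendants(node):
    """Yield every AST node strictly below `node`."""
    for child in ast.iter_child_nodes(node):
        yield child
        yield from _descendants(child)


def classify(source: str) -> str:
    """Assign a program to one of five control-flow categories (hypothetical labels)."""
    tree = ast.parse(source)
    loops = [n for n in ast.walk(tree) if isinstance(n, LOOPS)]

    if not loops:
        # Loop-free: distinguish branched from purely sequential programs.
        has_branch = any(isinstance(n, ast.If) for n in ast.walk(tree))
        return "branched" if has_branch else "sequential"

    # Assumed precedence: a loop inside a loop wins over a branch inside a loop.
    if any(isinstance(d, LOOPS) for loop in loops for d in _descendants(loop)):
        return "nested-loop"
    if any(isinstance(d, ast.If) for loop in loops for d in _descendants(loop)):
        return "multi-path-loop"
    return "single-path-loop"


if __name__ == "__main__":
    print(classify("while i < n:\n    if i % 2:\n        s += i\n    i += 1\n"))
```

Under these assumptions, a program whose only branch lies outside its loop still counts as single-path, since the definition above refers to the loop body alone.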
Generally, sequential programs are the easiest to handle, since their semantics are usually straightforward. Branched programs prove unexpectedly difficult in the Selection and Infilling tasks, which possibly indicates that models cannot effectively synthesize the semantics of multiple nested if-else structures. Programs with loops are harder to handle than loop-free programs in the Generation tasks. The most complicated programs are those with nested loops, an intuitive finding, since nested loops significantly increase the complexity of the corresponding loop invariants, resulting in v