The detailed ratings of the specifications generated by SpecGen for all test cases in SpecGenBench are presented as follows. Test cases that SpecGen cannot handle are assigned the lowest rating of 1.00. Ratings are color-coded by interval: green (4.5~5.0), light green (3.5~4.5), yellow (2.5~3.5), orange (1.5~2.5), and red (1.0~1.5).
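The color coding can be read as a simple threshold lookup. The following is a minimal sketch, not part of SpecGen itself: the function name `rating_color` is hypothetical, and because the stated intervals share their endpoints, we assume here that a boundary value (e.g., 4.5) falls into the higher interval.

```python
def rating_color(rating: float) -> str:
    """Map a rating in [1.0, 5.0] to its color label.

    Assumption (not stated in the text): a rating that lies exactly on a
    shared boundary, such as 3.5, is assigned to the higher interval.
    """
    if not 1.0 <= rating <= 5.0:
        raise ValueError(f"rating must be in [1.0, 5.0], got {rating}")
    if rating >= 4.5:
        return "green"
    if rating >= 3.5:
        return "light green"
    if rating >= 2.5:
        return "yellow"
    if rating >= 1.5:
        return "orange"
    return "red"
```

Under this reading, a test case that SpecGen cannot handle (rating 1.00) is always shown in red.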
In most cases, the specifications generated by SpecGen effectively capture the semantics of the input program. The detailed results of SpecGen, Houdini, and Daikon for all test cases can be accessed here.