This section details the prompt formats used to invoke the subject LLMs. As described in the paper, the experiments involve four tasks related to program specifications. The prompts for the four tasks share a common overall structure consisting of system configuration, few-shot examples, task description, and model answer; the details of each component, however, vary across tasks.
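To make the shared structure concrete, the following sketch assembles a prompt from the four components. The function and component names are illustrative assumptions, not the exact wording used in the experiments.

```python
# Illustrative sketch of the shared four-part prompt structure.
# Component names and contents are assumptions for illustration only.
def build_prompt(system_config, few_shot_examples, task_description, answer_stub):
    """Concatenate the four shared components into one prompt string."""
    parts = [system_config]
    parts.extend(few_shot_examples)   # zero or more worked examples
    parts.append(task_description)    # the task instance to solve
    parts.append(answer_stub)         # marks where the model's answer begins
    return "\n\n".join(parts)

prompt = build_prompt(
    system_config="You are an expert in formal program specifications.",
    few_shot_examples=["Example program, question, and answer ..."],
    task_description="Task: decide whether the candidate specification is correct.",
    answer_stub="Answer:",
)
```

Each task below instantiates the task-description component differently while keeping this overall layout.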
The Judgment task requires LLMs to decide whether a given candidate specification is correct with respect to the target program. Accordingly, the prompt contains both the target program and the candidate specification, and the LLMs are required to give a direct answer on the correctness of the candidate.
Fig. 1. Prompt format for Judgment task.
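The task-description part of a Judgment prompt can be sketched as follows; the template wording and the sample program are hypothetical, not taken from the actual prompts.

```python
# Hypothetical Judgment task description: one program, one candidate
# specification, and a request for a direct yes/no verdict.
JUDGMENT_TEMPLATE = (
    "Program:\n{program}\n\n"
    "Candidate specification:\n{candidate}\n\n"
    "Is the candidate specification correct for the program? Answer yes or no."
)

def judgment_prompt(program, candidate):
    return JUDGMENT_TEMPLATE.format(program=program, candidate=candidate)

example = judgment_prompt(
    program="int add(int a, int b) { return a + b; }",
    candidate="ensures \\result == a + b;",
)
```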
The Selection task requires LLMs to choose, from four given candidate specifications, the one most appropriate for the target program. To this end, the target program and the four candidate specifications are provided in the prompt. The LLMs are required to give a direct answer stating the index of the chosen candidate.
Fig. 2. Prompt format for Selection task.
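A Selection task description can be sketched in the same style; the indexing scheme and wording below are assumptions for illustration.

```python
# Hypothetical Selection task description: one program, four indexed
# candidate specifications, and a request for the index of the best one.
def selection_prompt(program, candidates):
    """Build a Selection prompt; `candidates` must hold exactly four entries."""
    assert len(candidates) == 4
    listing = "\n".join(f"({i}) {c}" for i, c in enumerate(candidates, start=1))
    return (
        f"Program:\n{program}\n\n"
        f"Candidate specifications:\n{listing}\n\n"
        "Answer with the index (1-4) of the most appropriate candidate."
    )

example = selection_prompt(
    "int abs(int x) { return x < 0 ? -x : x; }",
    ["ensures \\result >= 0;", "ensures \\result == x;",
     "ensures \\result < 0;", "ensures \\result == 0;"],
)
```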
The Infilling task requires LLMs to complete an incomplete specification within a target program. To this end, the target program containing the specification to be filled in is provided in the prompt. The LLMs are required to output the completed specification together with the original program.
Fig. 3. Prompt format for Infilling task.
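An Infilling task description can be sketched as below. The `<SPEC>` placeholder token is an assumption; the actual prompts may mark the hole differently.

```python
# Hypothetical Infilling task description: the program contains a
# placeholder where the missing specification should go.
MASK = "<SPEC>"  # assumed placeholder token for the missing specification

def infilling_prompt(program_with_mask):
    """Build an Infilling prompt for a program containing the MASK token."""
    assert MASK in program_with_mask
    return (
        f"Program with a missing specification (marked {MASK}):\n"
        f"{program_with_mask}\n\n"
        "Output the program with the missing specification filled in."
    )

example = infilling_prompt(
    "/*@ requires <SPEC>; */\nint div(int a, int b) { return a / b; }"
)
```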
The Generation task requires LLMs to generate full specifications for the target program from scratch. To this end, only the target program itself, without any specifications, is provided in the prompt. The LLMs are required to output the program instrumented with the generated specifications.
Fig. 4. Prompt format for Generation task.
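Finally, a Generation task description reduces to the bare program plus an instruction; again, the wording is an illustrative assumption.

```python
# Hypothetical Generation task description: only the unannotated program
# is given; the model must emit the program with specifications inserted.
def generation_prompt(program):
    return (
        f"Program (without specifications):\n{program}\n\n"
        "Output the program with appropriate specifications inserted."
    )

example = generation_prompt("int max(int a, int b) { return a > b ? a : b; }")
```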