Table 9: Comparison of different prompt designs for the brand recognition task. The parts with red, green, and purple backgrounds represent the task background, answer instruction, and few-shot examples, respectively. The text with a blue background is the task-specific logo description. Answer accuracy improves when we introduce a specific persona for the LLM agent together with the answer instruction. The most significant improvement comes from adding a single-shot example, which helps the LLM produce concise and accurate responses. A single-shot example already works well for GPT, so introducing more examples is unnecessary and would only consume additional tokens.
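To make the prompt structure concrete, the sketch below assembles the three components described above (task background with a persona, answer instruction, and a single-shot example) into a chat-style message list. The wording of each component, the example logo description, and the message layout are illustrative assumptions, not the paper's verbatim prompt; the actual model call is omitted since it depends on the API client used.

```python
# Hypothetical sketch of the brand-recognition prompt from Table 9.
# All strings below are assumed wording, not the paper's exact prompt.

TASK_BACKGROUND = (
    "You are a knowledgeable assistant that identifies which brand a "
    "website belongs to based on a description of its logo."
)

ANSWER_INSTRUCTION = (
    "Answer with the brand name only. If the brand cannot be determined, "
    "answer 'unknown'."
)

# Single-shot example: one logo description paired with the expected concise answer.
ONE_SHOT_EXAMPLE = [
    {"role": "user",
     "content": "Logo description: a white lowercase 'f' on a blue rounded square."},
    {"role": "assistant", "content": "Facebook"},
]


def build_brand_prompt(logo_description: str) -> list[dict]:
    """Compose the messages: task background + answer instruction (persona),
    the single-shot example, then the task-specific logo description."""
    return (
        [{"role": "system", "content": f"{TASK_BACKGROUND} {ANSWER_INSTRUCTION}"}]
        + ONE_SHOT_EXAMPLE
        + [{"role": "user", "content": f"Logo description: {logo_description}"}]
    )


if __name__ == "__main__":
    # Print the assembled prompt for inspection; the description is a placeholder.
    for message in build_brand_prompt("a stylized white bird on a light blue background."):
        print(f"[{message['role']}] {message['content']}")
```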
Table 10: Comparison of different prompt designs for the CRP prediction task. The parts with red and purple backgrounds represent the task background and few-shot examples, respectively. The text with a blue background is the description of the target webpage. When the persona and the definition of credential-taking status are specified, the LLM gradually learns how to determine whether a webpage is credential-taking; however, the classification result remains incorrect until few-shot examples are provided. Additionally, chain-of-thought few-shot examples guide the LLM to reason step by step and output its reasoning chain.
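As a companion to Table 10, the sketch below assembles a CRP-prediction prompt from a persona plus a definition of credential-taking status, followed by chain-of-thought few-shot examples in which each demonstration reply spells out its reasoning before the final label. The wording, the example webpages, and the message format are assumptions made for illustration, not the paper's exact prompt.

```python
# Hypothetical sketch of the CRP (credential-requiring page) prompt from Table 10.
# All strings below are assumed wording, not the paper's exact prompt.

PERSONA_AND_DEFINITION = (
    "You are an expert web analyst. A webpage is credential-taking if it asks "
    "the visitor to enter sensitive information such as a username, password, "
    "or payment details; otherwise it is non-credential-taking."
)

# Chain-of-thought few-shot examples: each assistant reply gives the reasoning
# chain first and the final label last.
COT_EXAMPLES = [
    {"role": "user",
     "content": "Webpage description: a page with fields labelled 'Email' and "
                "'Password' and a 'Sign in' button."},
    {"role": "assistant",
     "content": "The page contains an email field, a password field, and a "
                "sign-in button, so it requests login credentials. "
                "Answer: credential-taking."},
    {"role": "user",
     "content": "Webpage description: a news article with headlines and a "
                "comment section, and no input fields."},
    {"role": "assistant",
     "content": "The page only displays articles and has no fields requesting "
                "sensitive information. Answer: non-credential-taking."},
]


def build_crp_prompt(webpage_description: str) -> list[dict]:
    """Compose the messages: persona and credential-taking definition,
    chain-of-thought examples, then the webpage to classify."""
    return (
        [{"role": "system", "content": PERSONA_AND_DEFINITION}]
        + COT_EXAMPLES
        + [{"role": "user", "content": f"Webpage description: {webpage_description}"}]
    )
```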