On Extracting Specialized Code Abilities from Large Language Models: A Feasibility Study 

In this paper, we experimentally investigated the feasibility of extracting specialized code abilities from LLMs into common medium-sized models. To this end, we designed an imitation attack framework comprising four stages: query generation, response check, imitation training, and downstream malicious applications. Our evaluation showed that the resulting imitation models can match or even outperform the target LLM APIs, and can also provide useful information that facilitates the generation of adversarial examples against LLMs. We summarized our findings and insights to help researchers better understand the threats posed by imitation attacks.
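The first two stages of the framework (query generation and response check) can be sketched as a data-collection loop. This is a minimal illustrative sketch only; all names here (`query_target_api`, `check_response`, `build_imitation_corpus`) are hypothetical placeholders, not the paper's actual implementation, and the filtering rule is a toy stand-in for the real response check.

```python
def query_target_api(query: str) -> str:
    """Stand-in for a call to the target LLM API (assumed interface)."""
    return f"response to: {query}"

def check_response(query: str, response: str) -> bool:
    """Toy response check: keep only non-empty, on-topic outputs."""
    return bool(response) and query in response

def build_imitation_corpus(queries):
    """Stages 1-2: send generated queries, filter responses,
    and collect (query, response) pairs for imitation training."""
    corpus = []
    for q in queries:
        r = query_target_api(q)
        if check_response(q, r):
            corpus.append((q, r))
    return corpus

corpus = build_imitation_corpus(["sort a list in Python",
                                 "reverse a string"])
# Stage 3 (imitation training) would fine-tune a medium-sized model
# on `corpus`; stage 4 applies the imitation model downstream.
```

The collected pairs serve as supervised fine-tuning data for the medium-sized imitation model.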

Scripts: https://1drv.ms/f/s!Ak-2isnRCTgnglX0GgJEtdZXIA4a?e=PXgbV6