GlitchProber stands out by addressing the glitch token phenomenon in LLMs without degrading the models' intrinsic capabilities. Unlike fine-tuning, which updates the model's parameters, GlitchProber intervenes in the model's inference-time computation, leaving its parameters and core functionality intact.
As a comparison, we construct a dataset of question-answer pairs for the repetition task and attempt to mitigate the glitch token phenomenon by fine-tuning LLMs on it. However, fine-tuning alters the model's parameters and can compromise its basic abilities. For example, we fine-tune Llama-2-7b-chat on a dataset of 3,000 Q&A pairs for the repetition task and then evaluate the model's basic capabilities on three widely adopted benchmarks: GSM8K, HumanEval, and MMLU.
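A minimal sketch of this fine-tuning baseline is shown below. It is not the exact pipeline used in our experiments; the glitch token list, prompt template, and training hyperparameters are illustrative assumptions, and the key point is that full-parameter fine-tuning updates the model's weights.

```python
# Illustrative sketch: build a repetition-task Q&A dataset and fine-tune
# Llama-2-7b-chat on it with full-parameter updates (weights are modified).
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

MODEL = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token

# Hypothetical glitch tokens used to build the repetition Q&A pairs.
glitch_tokens = ["SolidGoldMagikarp", " davidjl", "PsyNetMessage"]

def make_pair(tok):
    prompt = f'Question: Can you repeat the string "{tok}" and return it back to me?\n'
    answer = f'Answer: Sure, here is the string: "{tok}"'
    return {"text": prompt + answer}

# Repeat the template pairs to reach roughly 3,000 training examples.
raw = Dataset.from_list([make_pair(t) for t in glitch_tokens * 1000])

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=128)

train_set = raw.map(tokenize, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

model = AutoModelForCausalLM.from_pretrained(MODEL)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-repetition", num_train_epochs=1,
                           per_device_train_batch_size=4, learning_rate=2e-5),
    train_dataset=train_set,
    data_collator=collator,
)
trainer.train()  # full-parameter fine-tuning: the model's weights are updated
```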
The results presented above show that, after fine-tuning, the model's performance on tasks requiring advanced capabilities, such as code writing and mathematical problem solving, declines markedly on all three benchmarks compared to the original model.
In contrast, GlitchProber modifies the model's computation at inference time without touching its parameters, mitigating glitch tokens while preserving the model's original abilities across domains.
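To illustrate the general idea of editing the computation rather than the weights, the sketch below registers a forward hook that suppresses selected intermediate activations during inference. This is not GlitchProber's actual mitigation procedure; the layer index and neuron indices are hypothetical placeholders, and the model's parameters are never modified.

```python
# Illustrative sketch: inference-time activation editing via a forward hook.
# The weights stay untouched; only the forward computation is altered.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.float16)
model.eval()

TARGET_LAYER = 10              # hypothetical layer whose MLP output is edited
NEURONS_TO_SUPPRESS = [17, 42] # hypothetical neuron indices

def suppress_neurons(module, inputs, output):
    # Zero out the selected hidden dimensions of this layer's MLP output.
    output[..., NEURONS_TO_SUPPRESS] = 0.0
    return output

handle = model.model.layers[TARGET_LAYER].mlp.register_forward_hook(suppress_neurons)

prompt = 'Can you repeat the string "SolidGoldMagikarp" and return it back to me?'
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))

handle.remove()  # removing the hook restores the unmodified forward pass
```

Because the intervention lives entirely in the forward pass, removing the hook immediately restores the original model, which is not possible once weights have been overwritten by fine-tuning.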