Here, we introduce a taxonomy of the unexpected behaviors that glitch tokens cause across seven widely used LLMs.
Spelling mistakes occur when the LLM produces a response that is largely accurate but contains minor spelling errors. In essence, the model captures the intended meaning but slips in the surface form of certain words. For example, when given an input like “cloneembedreportprint”, Text-davinci-003 outputs “clonenetesla”. This showcases the model's missteps in accurately reproducing word forms, even when the overall context is understood.
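As a rough illustration of how this symptom could be flagged automatically in a Repeat-style task, the Python sketch below compares the model's reply with the requested token using a normalized similarity score; the threshold and the helper name are assumptions made for illustration, not criteria from the study.

```python
from difflib import SequenceMatcher

def looks_like_spelling_mistake(token: str, reply: str,
                                min_similarity: float = 0.5) -> bool:
    """Flag a Repeat-task reply that resembles the requested token but is
    not an exact reproduction, i.e. a candidate spelling mistake.
    The similarity threshold is an illustrative assumption."""
    reply = reply.strip().strip('"\'')
    if reply == token:
        return False                      # faithful repetition: no symptom
    similarity = SequenceMatcher(None, token, reply).ratio()
    return similarity >= min_similarity   # close to the token, but garbled

# A near-miss copy is flagged; an exact copy is not.
print(looks_like_spelling_mistake("cloneembedreportprint",
                                  "cloneembedreportprnt"))   # True
print(looks_like_spelling_mistake("cloneembedreportprint",
                                  "cloneembedreportprint"))  # False
```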
Incapability arises when the LLM indicates its inability to complete a given task. Owing to the alignment of LLMs, incapability issues predominantly arise in more advanced models such as GPT-4. Essentially, the model recognizes its limitations and explicitly communicates them instead of attempting to produce a possibly incorrect output. For instance, when prompted with the negatively connoted word “retard”, GPT-4 responds with “Sorry, but I can not assist with that.” This exemplifies the model's awareness of tasks it is not designed for and its preference to decline rather than produce potentially misleading information.
Hallucinatory completion occurs when the LLM generates an output unrelated, or incorrectly related, to the input string, effectively “hallucinating” a completion that deviates from the input's context. For example, when Llama2-7b-chat is asked to spell ‘atform’, it incorrectly responds with ‘F-A-R-M-T-B’, a clear departure from the expected behavior. Notably, since the ‘Length’ task should produce only a numerical response, an incorrect length is also classified as a hallucinatory completion. This highlights the importance of employing diverse proxy tasks to identify glitch tokens and demonstrates how the model can produce outputs that are inconsistent with the provided context.
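To make the Length-task rule concrete, the sketch below checks whether the model's reply contains a number equal to the token's character count and otherwise flags the reply as a hallucinatory completion; the parsing detail of taking the first integer in the reply is an assumption made for illustration.

```python
import re

def length_task_is_hallucinatory(token: str, reply: str) -> bool:
    """Length proxy task: the reply should state a number equal to the
    token's character count; anything else is treated as a hallucinatory
    completion. Extracting the first integer is an assumption of this sketch."""
    match = re.search(r"-?\d+", reply)
    if match is None:
        return True                       # no numerical answer at all
    return int(match.group()) != len(token)

# " NUITKA" has 7 characters (including the leading space).
print(length_task_is_hallucinatory(" NUITKA", "The string has 7 characters."))  # False
print(length_task_is_hallucinatory(" NUITKA", "6"))                             # True
```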
Question repetition is observed when the LLM, instead of processing the given token string, responds by reiterating the query or asking for clarification, demonstrating its inability to discern or act upon the provided token. For example, when given the string “ NUITKA”, GPT-4 responds with “You didn’t provide a string to repeat. Could you please provide it?”. This indicates that the model may seek further input rather than making sense of, or using, the initial token string.
Random characters appear when the LLM faces inputs containing glitch tokens that consist exclusively of non-letter characters. Upon processing these tokens, the LLM generates outputs composed of unrelated, arbitrary characters. For instance, when provided with the token string “"?”, Text-davinci-003 responds with the random-character string “&*^%$#@!” instead of the given string, signifying the model's difficulty in correctly interpreting such tokens.
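The remaining categories can likewise be approximated with simple response heuristics. The sketch below, whose phrase lists and non-letter-ratio threshold are illustrative assumptions rather than the study's criteria, maps a reply to incapability, question repetition, or random characters.

```python
def classify_remaining_symptoms(reply: str) -> str:
    """Heuristically map a reply to incapability, question repetition, or
    random characters. Phrase lists and the non-letter ratio threshold are
    illustrative assumptions, not criteria from the original study."""
    text = reply.strip().lower()

    # Incapability: the model explicitly declines the task.
    if any(p in text for p in ("cannot assist", "can not assist", "i'm unable to")):
        return "incapability"

    # Question repetition: the model echoes the query or asks for the input again.
    if any(p in text for p in ("please provide", "didn't provide", "could you provide")):
        return "question repetition"

    # Random characters: the reply is dominated by non-alphanumeric symbols.
    if text and sum(not c.isalnum() and not c.isspace() for c in text) / len(text) > 0.5:
        return "random characters"

    return "unclassified"

print(classify_remaining_symptoms(
    "You didn't provide a string to repeat. Could you please provide it?"))
# -> question repetition
print(classify_remaining_symptoms("&*^%$#@!"))  # -> random characters
```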