As illustrated in the paper, we rely on attention scores to locate sensitive tokens in an input prompt, and then apply several semantically-equivalent transformations (derived and extended from CCTest’s codebase) to iteratively mutate those sensitive tokens until AEs are generated.
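The token-selection step can be pictured with a minimal sketch. It assumes one attention score per token is already available as a plain list of floats; the function name `top_k_sensitive_tokens`, the parameter `k`, and the scores below are hypothetical and are not taken from our codebase.

```python
from typing import List, Tuple


def top_k_sensitive_tokens(
    tokens: List[str], scores: List[float], k: int = 2
) -> List[Tuple[int, str, float]]:
    """Return the k highest-scoring tokens as (index, token, score) tuples."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    # Rank token positions by attention score, highest first.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    return [(i, tokens[i], scores[i]) for i in ranked[:k]]


# Made-up tokens and scores, for illustration only.
tokens = ["def", "add", "(", "a", ",", "b", ")", ":"]
scores = [0.05, 0.30, 0.02, 0.25, 0.01, 0.22, 0.02, 0.13]
sensitive = top_k_sensitive_tokens(tokens, scores, k=3)
```

The selected positions are then the candidates that the mutation passes would rewrite first.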
We use tree-sitter to parse the code; see its documentation at https://tree-sitter.github.io/tree-sitter/
We reuse CCTest to apply the semantically-equivalent transformations to Python code. Readers can first visit its project page for tutorials: https://sites.google.com/view/cctest-info/main-page
Besides the default mutations CCTest offers, we add the following two new mutation passes in our experiments.
This pass first collects all the candidate variables and then replaces them with equivalent expressions.
We give a simplified code example on the right side.
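For readers without the figure, the idea can be sketched as follows. This is a hypothetical illustration, not our implementation: it uses Python's standard `ast` module instead of tree-sitter, and it assumes the chosen variable is numeric so that `n` and `n + 0` evaluate to the same value. The class name `ReplaceWithEqualExpr` and the helper `mutate` are made up for this example.

```python
import ast


class ReplaceWithEqualExpr(ast.NodeTransformer):
    """Rewrite every read of `target` as `target + 0` (assumes a numeric variable)."""

    def __init__(self, target: str) -> None:
        self.target = target

    def visit_Name(self, node: ast.Name) -> ast.AST:
        # Only rewrite reads; assignment targets must stay plain names.
        if node.id == self.target and isinstance(node.ctx, ast.Load):
            return ast.BinOp(
                left=ast.Name(id=node.id, ctx=ast.Load()),
                op=ast.Add(),
                right=ast.Constant(value=0),
            )
        return node


def mutate(source: str, target: str) -> str:
    """Parse the source, apply the replacement, and return the mutated source."""
    tree = ReplaceWithEqualExpr(target).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


original = "def scale(n):\n    total = n * 2\n    return total\n"
mutated = mutate(original, "n")
```

Running both versions on the same input should give the same result, which is the property a semantically-equivalent mutation has to preserve.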
This pass first analyses the usage of while/for statements, and then converts one form into the other where applicable.
Note that not all while statements can be replaced by for statements.
We give a simplified code example on the right side.
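For readers without the figure, a hand-written pair shows the kind of rewrite this pass targets. The functions below are hypothetical examples written for this page, not output from our tool. The third function is a while loop whose iteration count depends on values computed inside the loop, so it has no straightforward counterpart as a counted for loop.

```python
def sum_with_for(n: int) -> int:
    """Counted loop written with for/range."""
    total = 0
    for i in range(n):
        total += i
    return total


def sum_with_while(n: int) -> int:
    """The same loop rewritten as a while with an explicit counter."""
    total = 0
    i = 0
    while i < n:
        total += i
        i += 1
    return total


def collatz_steps(n: int) -> int:
    """A while loop with a data-dependent exit condition (assumes n >= 1)."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps
```

The first two functions return the same value for any non-negative `n`, so swapping one for the other leaves the program's behaviour unchanged.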