Our shared task is part of a broader international research effort towards the detection of free-form natural language text generated with Large Language Models (LLMs).
This page lists relevant competitions and related research that have already addressed aspects of this complex problem.
Competitions
This shared task will take place as part of IberLEF 2023, the 5th Workshop on Iberian Languages Evaluation Forum, at the SEPLN 2023 Conference, to be held in Jaén, Spain, on September 26th, 2023.
This competition was part of the shared task hosted within the Third Workshop on Scholarly Document Processing (SDP 2022), held in association with the 29th International Conference on Computational Linguistics (COLING 2022). This blog post reflected on the complexity of the problem.
For this competition, contestants had to attribute synthetic text written by fine-tuned language models back to the base LLM, with the aim of establishing new methods that provide strong evidence of model provenance. The resulting paper showed that contestants used both manual and statistical solutions, with the manual solutions proving superior. A statistical baseline for this kind of attribution is sketched below.
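For a sense of what a statistical solution to this attribution task might look like, here is a minimal sketch: a character n-gram TF-IDF classifier over texts labelled with their source model. The texts, label names, and feature choices are illustrative assumptions, not the competition's actual setup.

```python
# Illustrative baseline for model attribution: predict which base LLM
# produced a text from character n-gram TF-IDF features.
# The texts and labels below are placeholders, not competition data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "The committee convened to discuss the quarterly results.",
    "honestly the movie was kinda mid but the soundtrack was great",
]
labels = ["model_a", "model_b"]  # hypothetical base-LLM labels

clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)

# Predicted base model for a previously unseen text.
print(clf.predict(["The board met to review the annual figures."]))
```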
Related research
This paper shows that several AI-text detectors are not reliable in practical scenarios. The authors empirically show that paraphrasing attacks, where a light paraphraser is applied on top of an LLM, can break a whole range of detectors, including ones using watermarking schemes.
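The attack is simple to reproduce in spirit. The sketch below passes a piece of LLM output through an off-the-shelf paraphraser and compares a detector's score before and after; both Hugging Face checkpoints and the "Fake" label name are assumptions chosen for illustration, not the models used in the paper.

```python
# Sketch of a paraphrasing attack: rewrite LLM output with a light
# paraphraser and observe how a detector's score changes.
from transformers import pipeline

# Assumed checkpoints; any detector/paraphraser pair illustrates the idea.
detector = pipeline("text-classification",
                    model="openai-community/roberta-base-openai-detector")
paraphraser = pipeline("text2text-generation",
                       model="Vamsi/T5_Paraphrase_Paws")

def machine_score(text: str) -> float:
    """Probability the detector assigns to `text` being machine-generated.
    The label name "Fake" is specific to this assumed checkpoint."""
    out = detector(text)[0]
    return out["score"] if out["label"] == "Fake" else 1.0 - out["score"]

llm_text = ("The results demonstrate that the proposed approach achieves "
            "state-of-the-art performance across all evaluated benchmarks.")
attacked = paraphraser("paraphrase: " + llm_text,
                       max_length=64, do_sample=True)[0]["generated_text"]

print("score before paraphrasing:", machine_score(llm_text))
print("score after paraphrasing: ", machine_score(attacked))
```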
This paper proposes a methodology for developing and evaluating ChatGPT detectors for French text, with a focus on investigating their robustness on out-of-domain data and against common attack schemes. Results show that the detectors can effectively detect ChatGPT-generated text, with a degree of robustness against basic attack techniques in in-domain settings. However, vulnerabilities are evident in out-of-domain contexts, highlighting the challenge of detecting adversarial text. The study emphasizes caution when applying in-domain testing results to a wider variety of content.
This study reveals that, with the aid of carefully crafted prompts, LLMs can effectively evade detection systems. The authors propose a novel Substitution-based In-Context example Optimization method (SICO) to automatically generate such prompts. On three real-world tasks where LLMs can be misused, SICO successfully enables ChatGPT to evade six existing detectors, causing a significant 0.54 AUC drop on average. Surprisingly, in most cases these detectors perform even worse than random classifiers. These results reveal the vulnerability of existing detectors. Finally, the strong performance of SICO suggests that it can serve as a reliable evaluation protocol for any new detector in this field.
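To make the reported AUC drop concrete: detectors are scored by the area under the ROC curve, where 1.0 means perfect separation of machine-generated from human-written text and 0.5 is a random classifier. The toy numbers below are invented solely to show how an evasion attack can push a detector below random.

```python
# Toy illustration of AUC-based detector evaluation (scores are made up).
from sklearn.metrics import roc_auc_score

y_true = [1, 1, 1, 0, 0, 0]  # 1 = machine-generated, 0 = human-written
scores_before = [0.9, 0.8, 0.7, 0.2, 0.3, 0.1]  # detector on raw LLM text
scores_after = [0.3, 0.1, 0.4, 0.2, 0.6, 0.5]   # detector on evaded text

print(roc_auc_score(y_true, scores_before))  # 1.0: perfect separation
print(roc_auc_score(y_true, scores_after))   # ~0.22: worse than random
```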
This paper is the first survey of its kind and addresses several important research challenges.