LLMs have demonstrated state-of-the-art performance on various code benchmarks. This study investigates whether existing detectors can identify content produced by Code Llama, WizardCoder, and ChatGPT.
This section provides download links to the source code, the LLM-generated content, and the responses of 13 detectors for AIGC.
All source code for the crawler targeting commercial detectors and for model serving of the open-source detectors can be found here:
https://pb123us.s3.us-east-2.amazonaws.com/NLCCD-LLM/crawler_nlccd_detectors_add.tar
All source code for fine-tuning and robustness experiments can be found here:
https://pb123us.s3.us-east-2.amazonaws.com/NLCCD-LLM/nlccd_finetune.tar
All NL-CCD data can be found here:
https://pb123us.s3.us-east-2.amazonaws.com/NLCCD-LLM/NL-CCD.tar
Following the study design in Section 3, we collect the data generated by Code Llama-34B from three common scenarios in software development. The data collection process is the same as that for the content generated by ChatGPT. However, applying the data filtering process to the content generated by Code Llama results in a higher proportion of data being excluded compared to the content generated by ChatGPT.
The main reason, based on our investigation of the removed samples, is that Code Llama often fails to generate code content when given a code generation prompt. Furthermore, Code Llama is more likely than ChatGPT to refuse to answer, providing responses such as 'I apologize for the confusion, but I am unable to ....'. Similar responses are available on our website.
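The refusal-based filtering described above could be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the marker phrases, function names, and sample strings are assumptions chosen to mirror the example refusal quoted in the text.

```python
# Hypothetical sketch of filtering out refusal responses from
# LLM-generated samples. The marker list below is an assumption
# based on the example refusal quoted in the text, not the
# study's actual filter.

REFUSAL_MARKERS = [
    "i apologize for the confusion",
    "i am unable to",
    "i cannot provide",
]

def is_refusal(response: str) -> bool:
    """Return True if the response looks like a refusal rather than code."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def filter_samples(samples: list[str]) -> list[str]:
    """Keep only samples that are not flagged as refusals."""
    return [s for s in samples if not is_refusal(s)]

samples = [
    "def add(a, b):\n    return a + b",
    "I apologize for the confusion, but I am unable to ....",
]
kept = filter_samples(samples)
```

In this sketch, the second sample is dropped because it matches a refusal marker, reflecting why Code Llama's output yields a higher exclusion rate under the same filtering process.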