ChatGPT Data Analysis
The purpose of our study was to expand the limited research on ChatGPT's capacity to improve code quality. ChatGPT, launched in late 2022, is a relatively new language-processing chatbot designed to generate natural dialogue. It has gained widespread popularity across many settings and among many kinds of users. As its use extends into writing and refactoring code, we aimed to assess how effectively it accomplishes different programming tasks in response to various types of prompts.
Large language models (LLMs) have become an increasingly popular tool in all aspects of code development. The capabilities of LLMs in code production and refactoring have by now been substantially researched. However, many of these assessments were carried out in controlled research environments, leaving little research on how developers actually use LLMs in real-world conditions. In an effort to close this gap, we undertook an empirical study based on interactions within DevGPT, a dataset of real developer projects and corresponding ChatGPT Share Links. Using this dataset, our study aimed to further the preliminary research on how developers can use ChatGPT to refactor code effectively and efficiently. To do so, we focused on three questions: the extent to which ChatGPT was helpful, the types of prompts that yield a successful answer in the fewest interactions, and the programming languages in which ChatGPT is most effective. We found that, overall, ChatGPT provided developers with code that they could use directly in their projects more often than code that needed modifications, suggesting that ChatGPT has promising potential as a refactoring tool for developers. Our prompt taxonomy suggested that prompts requesting code diagnosis, explanations, or code generation often preceded longer conversations than prompts that defined ChatGPT's role or prompts that included direct code snippets. Finally, ChatGPT demonstrated varying levels of proficiency across programming languages, with CSS receiving the most effective support. These insights provide guidance for developers aiming to optimize their use of ChatGPT for code refactoring. We envision a time when developers, equipped with refined prompt strategies and improved language support, seamlessly adopt LLM refactoring into project development.
Our Team Presenting at the Stevens Symposium for Undergraduate Research: