Students: AJ Jones, Vérité Mugabo
Research mentor: Mohammed Latif Siddiq
A code generation model produces code from a prompt consisting of a code comment, existing code, or a combination of both. AI-assisted code generation tools, such as GitHub Copilot, are increasingly being adopted by developers. Although these models can generate functional source code, they also output vulnerable code. They learn from patterns present in their training datasets, so if those datasets contain flaws, poor coding practices, or vulnerabilities, the same issues can be replicated in the code the models generate. Our project aims to benchmark models on real-world vulnerabilities to assess their ability to fix them. This investigation matters because, as AI code generation becomes more prevalent in automating routine development tasks, the generated code must be not only functional but also secure and free of vulnerabilities. While AI can create functional code, we aim to determine whether large language models (LLMs) can produce secure code and repair vulnerable code in real-world software engineering challenges.
To work toward this goal over the summer, we collected data on real-world vulnerabilities by iterating through an open-source database of Common Vulnerabilities and Exposures (CVEs) on GitHub, spanning 2017 to 2024. We used the GitHub REST API, followed by manual and automated filtering to remove unnecessary data from the collected datasets. We then located the merged pull requests in which each vulnerability was fixed and used them to craft prompts. We also focused on learning prompt engineering, the OpenAI API, and open-source models hosted on HuggingFace; this involved crafting prompts based on the merged pull requests and building an environment for testing those prompts with various LLMs, as sketched below. By helping fix vulnerabilities in real-world software, our project aims to prevent those vulnerabilities from being exploited with malicious intent by attackers.
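As a concrete illustration of this pipeline, the sketch below fetches a merged fix pull request and its diff through the GitHub REST API and assembles a repair prompt from them. It is a minimal sketch of the general approach under stated assumptions, not our exact tooling: the repository name, pull request number, CVE identifier, and helper functions are placeholders introduced for illustration.

```python
import os

import requests

GITHUB_API = "https://api.github.com"
HEADERS = {
    "Accept": "application/vnd.github+json",
    # A personal access token avoids the low unauthenticated rate limit.
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
}


def fetch_pull_request(owner: str, repo: str, pr_number: int) -> dict:
    """Fetch metadata for a pull request that reportedly fixes a CVE."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}/pulls/{pr_number}"
    resp = requests.get(url, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()


def fetch_pull_request_diff(owner: str, repo: str, pr_number: int) -> str:
    """Fetch the unified diff of the fix using GitHub's diff media type."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}/pulls/{pr_number}"
    diff_headers = {**HEADERS, "Accept": "application/vnd.github.diff"}
    resp = requests.get(url, headers=diff_headers, timeout=30)
    resp.raise_for_status()
    return resp.text


def extract_vulnerable_code(fix_diff: str) -> str:
    """Recover the pre-fix (vulnerable) lines removed by the maintainers' patch."""
    removed = [
        line[1:]
        for line in fix_diff.splitlines()
        if line.startswith("-") and not line.startswith("---")
    ]
    return "\n".join(removed)


def build_repair_prompt(cve_id: str, summary: str, vulnerable_code: str) -> str:
    """Assemble a plain-text prompt asking a model to repair the vulnerable code."""
    return (
        f"The following code is affected by {cve_id}: {summary}\n\n"
        f"Vulnerable code:\n{vulnerable_code}\n\n"
        "Rewrite the code so that the vulnerability is fixed while "
        "preserving the original functionality."
    )


if __name__ == "__main__":
    # Placeholder repository, pull request number, and CVE identifier.
    pr = fetch_pull_request("example-org", "example-repo", 42)
    if pr.get("merged_at"):  # keep only pull requests that were actually merged
        diff = fetch_pull_request_diff("example-org", "example-repo", 42)
        prompt = build_repair_prompt(
            "CVE-2023-0000", pr["title"], extract_vulnerable_code(diff)
        )
        print(prompt)
```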
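Once a prompt has been crafted, it can be sent both to the OpenAI API and to open-source models hosted on HuggingFace. The sketch below shows one way to do this with the official `openai` client and the `transformers` text-generation pipeline; the model names are illustrative choices and not necessarily the ones used in the project.

```python
from openai import OpenAI
from transformers import pipeline


def query_openai(prompt: str, model: str = "gpt-4o-mini") -> str:
    """Send the repair prompt to an OpenAI chat model and return its reply."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


def query_open_source(
    prompt: str, model: str = "codellama/CodeLlama-7b-Instruct-hf"
) -> str:
    """Send the same prompt to an open-source code model from HuggingFace."""
    generator = pipeline("text-generation", model=model, device_map="auto")
    output = generator(prompt, max_new_tokens=512, do_sample=False)
    # The pipeline returns the prompt plus the model's continuation.
    return output[0]["generated_text"]
```

Running the same prompt through both closed and open-source models makes it possible to compare how reliably each one repairs the vulnerable code.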
Mohammed Latif Siddiq is a Graduate Assistant at the University of Notre Dame, Indiana, USA, working under Dr Joanna Cecilia da Silva Santos, Assistant Professor at the University of Notre Dame, in the broad area of Software Security. His research focuses on Software Engineering, Software Security, Code Generation, and Applied Machine Learning. Visit his Google Scholar profile to see his publications.
Dr Joanna Cecilia da Silva Santos' research focuses on understanding weaknesses and vulnerabilities through empirical studies in order to devise novel automated techniques for developing secure software systems, from inception to deployment. Her work draws on program analysis, software verification, and machine learning algorithms to solve software engineering and security problems such as software vulnerability detection, reasoning about and formal modeling of architectural properties, and software architecture reverse engineering.