RQ2: Can LLMs understand static code behaviors?

CFG Result

Reasonable VS. Unreasonable

Issue Distribution

Call Graph Result

Reasonable VS. Unreasonable

Issue Distribution

One Interesting Call Graph Case about Polymorphism (Wrong Type Inference and Fabrication) from GPT3.5

Data Dependency

Taint Analysis

Pointer Analysis

Jaccard Index of Each Pointer

CFG Result

As in RQ1, the experts evaluate the generated CFGs and tolerate minor issues. The LLM shows some ability to understand control flow graphs.

Reasonable VS. Unreasonable

Issue Distribution

Call Graph Result

As in RQ1, the experts evaluate the generated call graphs (CGs) and tolerate minor issues.

Our observations indicate that the LLM tends to fabricate call relationships, so the generated call graphs are generally of low quality. Because call graphs are central to analyzing the security of a whole program, we read this weakness as consistent with the finding that code generated by ChatGPT often harbors vulnerabilities [1]. A likely reason is that no static analysis is applied during generation to check the reliability of the code.

[1] Khoury, Raphaël, et al. "How Secure is Code Generated by ChatGPT?" arXiv preprint arXiv:2304.09655 (2023).

Reasonable VS. Unreasonable

Issue Distribution

"edges": [
  { "source": "AnimalInheritanceTest.main(String[])", "target": "Dog.<init>(String)" },
  { "source": "Dog.<init>(String)", "target": "Animal.<init>(String)" },
  { "source": "AnimalInheritanceTest.main(String[])", "target": "Animal.sound()" },
  { "source": "AnimalInheritanceTest.main(String[])", "target": "Dog.getName()" },
  { "source": "Dog.getName()", "target": "Animal.getName()" },
  { "source": "AnimalInheritanceTest.main(String[])", "target": "Dog.sound()" }
]

One Interesting Call Graph Case about Polymorphism (Wrong Type Inference and Fabrication) from GPT3.5

We create an object of the Dog class but declare it with its superclass type: Animal d1 = new Dog("1");. We then call d1.sound(). GPT3.5 resolves this call to the sound() function in the Animal class, although dynamic dispatch selects Dog.sound() at run time because the object is a Dog. GPT3.5 therefore fails to handle polymorphism in this case due to incorrect type inference.

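Additionally, GPT3.5 fabricates two function calls related to getName(): an edge from main to Dog.getName() and an edge from Dog.getName() to Animal.getName(). Neither call appears in the code.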

The code is provided below. The call graph edges from GPT3.5's output are listed above.

··· Code

// Animal.java
public class Animal {
    private String name;

    public Animal() {}

    public Animal(String name) {
        this.name = name;
    }

    public String getName() {
        return this.name;
    }

    public void sound() {
        System.out.println("animal");
    }
}

// Dog.java
public class Dog extends Animal {
    // Shadows Animal.name; Dog keeps its own copy of the field.
    private String name;

    public Dog() {}

    public Dog(String name) {
        this.name = name;
    }

    // Overrides Animal.getName() and does not call it.
    public String getName() {
        return this.name;
    }

    // Overrides Animal.sound().
    public void sound() {
        System.out.println("Dog");
    }
}

// AnimalInheritanceTest.java
public class AnimalInheritanceTest {
    public static void main(String[] args) {
        Animal d1 = new Dog("1"); // declared type Animal, run-time type Dog
        d1.sound();
        Dog d2 = new Dog("1");
        d2.sound();
    }
}

···
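The dispatch behavior behind this case can be observed directly. The following is a minimal sketch and not part of the original benchmark: the class name DispatchDemo and the helper resolve() are hypothetical, and sound() returns a string instead of printing so that the selected method can be inspected.

```java
public class DispatchDemo {
    public static class Animal {
        public String sound() { return "animal"; }
    }

    public static class Dog extends Animal {
        @Override
        public String sound() { return "Dog"; }
    }

    // Calls sound() through a reference of declared type Animal.
    // The method that runs is chosen by the run-time type of the object.
    public static String resolve(Animal a) {
        return a.sound();
    }

    public static void main(String[] args) {
        Animal d1 = new Dog(); // declared type Animal, run-time type Dog
        Dog d2 = new Dog();
        System.out.println(resolve(d1)); // prints "Dog"
        System.out.println(resolve(d2)); // prints "Dog"
    }
}
```

Both calls reach Dog.sound(), which is why an edge from main to Animal.sound() does not reflect the behavior of this program.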

Data Dependency

"A data dependency in computer science is a situation in which a program statement refers to the data of a preceding statement. In compiler theory, the technique used to discover data dependencies among statements is called dependence analysis." (from Wikipedia). We provide all experimental data at the link above.
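To make the definition concrete, here is a minimal sketch that finds flow (read-after-write) dependencies in straight-line code. It is an illustration only and not the procedure used in the experiments: the class DataDependencyDemo, the Stmt type, and the "i->j" output format are all assumptions, and each statement is reduced to the variable it defines and the variables it uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DataDependencyDemo {
    // One statement of straight-line code: it defines one variable and uses some others.
    public static final class Stmt {
        final String def;
        final List<String> uses;

        public Stmt(String def, List<String> uses) {
            this.def = def;
            this.uses = uses;
        }
    }

    // Returns flow dependencies as "i->j": statement j reads a variable
    // whose most recent write is in statement i.
    public static List<String> flowDependencies(List<Stmt> program) {
        Map<String, Integer> lastDef = new HashMap<>();
        List<String> deps = new ArrayList<>();
        for (int j = 0; j < program.size(); j++) {
            Stmt s = program.get(j);
            for (String v : s.uses) {
                Integer i = lastDef.get(v);
                if (i != null) {
                    deps.add(i + "->" + j);
                }
            }
            lastDef.put(s.def, j); // this statement is now the latest definition
        }
        return deps;
    }

    public static void main(String[] args) {
        List<Stmt> program = List.of(
            new Stmt("a", List.of()),          // S0: a = 1
            new Stmt("b", List.of("a")),       // S1: b = a + 2
            new Stmt("a", List.of()),          // S2: a = 5
            new Stmt("c", List.of("a", "b"))); // S3: c = a + b
        System.out.println(flowDependencies(program)); // prints [0->1, 2->3, 1->3]
    }
}
```

S3 depends on S2 rather than S0 for the variable a, because S2 overwrites the earlier definition.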

Taint Analysis

Taint analysis is a software security technique that identifies the parts of a program that may be affected by unsafe input. It is often used to detect security vulnerabilities such as injection attacks or cross-site scripting (XSS). The analysis traces the flow of data from untrusted sources (such as user input) to critical parts of the program (such as database queries or file-system operations). We provide all experimental data at the link above.
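The source-to-sink tracing described above can be sketched for straight-line code as follows. This is a simplified illustration and not the analysis used in the experiments: the names TaintDemo, Assign, and reachesSink are hypothetical, sanitizers are not modeled, and a variable is treated as tainted whenever any value it is computed from is tainted.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TaintDemo {
    // An assignment "target = f(sources...)" in straight-line code.
    public static final class Assign {
        final String target;
        final List<String> sources;

        public Assign(String target, List<String> sources) {
            this.target = target;
            this.sources = sources;
        }
    }

    // Returns true if, after running the program, the sink variable
    // holds a value derived from one of the tainted inputs.
    public static boolean reachesSink(List<Assign> program,
                                      Set<String> taintedInputs,
                                      String sinkVar) {
        Set<String> tainted = new HashSet<>(taintedInputs);
        for (Assign a : program) {
            boolean anyTainted = false;
            for (String s : a.sources) {
                if (tainted.contains(s)) {
                    anyTainted = true;
                }
            }
            if (anyTainted) {
                tainted.add(a.target);
            } else {
                tainted.remove(a.target); // overwritten with clean data
            }
        }
        return tainted.contains(sinkVar);
    }

    public static void main(String[] args) {
        List<Assign> program = List.of(
            new Assign("query", List.of("prefix", "userInput")), // query = prefix + userInput
            new Assign("label", List.of("prefix")));             // label = prefix
        Set<String> sources = Set.of("userInput");
        System.out.println(reachesSink(program, sources, "query")); // prints true
        System.out.println(reachesSink(program, sources, "label")); // prints false
    }
}
```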

Pointer Analysis

Pointer analysis is a program analysis technique that determines which objects the pointer variables in a program may point to. It is a key component of compiler optimization and program understanding, especially for programs that manipulate complex data structures through pointers. Its main purpose is to improve the performance and security of the program: with accurate knowledge of pointer behavior, code can be optimized better, and errors and vulnerabilities can be reduced. We provide all experimental data at the link above.
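The outline of this page lists a "Jaccard Index of Each Pointer". One plausible reading, stated here as an assumption rather than as the exact metric used, is that the points-to set predicted for each pointer is compared with a reference points-to set via |A ∩ B| / |A ∪ B|. A minimal sketch, with the hypothetical class name PointsToJaccard and made-up object labels o1, o2, o3:

```java
import java.util.HashSet;
import java.util.Set;

public class PointsToJaccard {
    // Jaccard index of two points-to sets: size of intersection over size of union.
    // Two empty sets are treated as identical (index 1.0); this convention is an assumption.
    public static double jaccard(Set<String> predicted, Set<String> expected) {
        if (predicted.isEmpty() && expected.isEmpty()) {
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(predicted);
        intersection.retainAll(expected);
        Set<String> union = new HashSet<>(predicted);
        union.addAll(expected);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        Set<String> expected = Set.of("o1", "o2");  // reference points-to set of pointer p
        Set<String> predicted = Set.of("o1", "o3"); // points-to set reported for p
        // One shared object out of three distinct objects, so the index is 1/3.
        System.out.println(jaccard(predicted, expected));
    }
}
```

An index of 1.0 means the two sets agree exactly, and 0.0 means they share no object.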