In designing our prompts, we employed various approaches to optimize the LLM's performance on the given tasks. We also had to consider the trade-off between input length and output length due to the model's limitations in sequence length handling. Our testing revealed that in certain scenarios, the inclusion of few-shot examples could diminish the model's performance; we hypothesize that the few-shot examples inadvertently constrained the model's creative capabilities. Nevertheless, when we prompt with less information, the outputs can occasionally be unpredictable. The process of designing prompts for domain-specific tasks benefits significantly from prior experience. The prompts we used are shown below.
You are a C Abstract Syntax Tree (AST) parser. I will give you a C code file. You give me its AST in JSON format. Each AST node has only three attributes: children, type, and value.
The input file is
```
[INPUT_CODE]
```
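For illustration, a plausible response in the requested three-attribute schema for the single C statement `int x = 1;` might look like the following; the node type names here are hypothetical, not tied to any particular parser, and the shape is sketched in Python so it can be checked mechanically:

```python
import json

# Hypothetical AST for the C statement `int x = 1;` in the three-attribute
# schema the prompt requests (children, type, value).
ast_json = {
    "type": "Decl",
    "value": "x",
    "children": [
        {"type": "TypeSpecifier", "value": "int", "children": []},
        {"type": "Constant", "value": "1", "children": []},
    ],
}

def validate(node):
    """Check that every node carries exactly the three required attributes."""
    assert set(node) == {"children", "type", "value"}
    for child in node["children"]:
        validate(child)

validate(ast_json)
print(json.dumps(ast_json, indent=2))
```

Restricting every node to the same three keys keeps the output easy to parse and to validate programmatically.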
You are an AI trained to detect similar code expressions. Given a Smart Contract code and a specific target code expression, your task is to find and list the most similar expressions within the provided Smart Contract code. I will show you the answer format; then please analyze the following input code file and search for expressions that closely resemble the provided target code piece.
```
{"Answer":"Yes" or "No", "similar_expressions": [
{
"function_name": the matched function name,
"line_number": line_number,
"expression": the similar code
}
],
"Reason": your reason
}
Input Smart Contract Code:
```Solidity
[INPUT_CODE]
```
Input Specific Target Code Expression:
```Target Expression
[INPUT_EXPRESSION]
```
Please identify the similar expressions, their corresponding function name and their corresponding line numbers in the code file. You also need to replace the function calls "add", "sub", "div", "mul", "divCeil" in the found similar expressions with "+", "-", "/" and "*". Put your results in JSON format at the beginning.
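The SafeMath-to-operator rewriting step requested above can be sketched as a small post-processing routine. This is a simplified illustration that handles only flat and singly-parenthesized receivers, and it assumes `divCeil` also maps to `/`:

```python
import re

# Mapping from SafeMath-style call names to arithmetic operators.
# Treating divCeil as plain "/" is an assumption for this sketch.
OPS = {"add": "+", "sub": "-", "mul": "*", "div": "/", "divCeil": "/"}

def desugar_safemath(expr: str) -> str:
    """Rewrite receiver.op(arg) calls into (receiver op arg), repeatedly."""
    # Receiver may be an identifier or a single parenthesized expression.
    pattern = re.compile(r"((?:\w+|\([^()]*\)))\.(add|sub|mul|divCeil|div)\((\w+)\)")
    while pattern.search(expr):
        expr = pattern.sub(
            lambda m: f"({m.group(1)} {OPS[m.group(2)]} {m.group(3)})", expr)
    return expr

print(desugar_safemath("balance.add(amount)"))
print(desugar_safemath("total.sub(fee).mul(rate)"))
```

Chained calls are unwound one layer per loop iteration, so `total.sub(fee).mul(rate)` becomes `((total - fee) * rate)`.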
You are a call graph analyzer for [LANG]. I will give you a [LANG] program and you tell me its call graph. The output format is a JSON file including nodes and edges.
The input file is
```
[INPUT_CODE]
```
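As an illustration of the nodes-and-edges JSON the prompt asks for, a minimal call-graph extractor for the Python case can be sketched with the standard `ast` module; only direct calls to plain names are recorded, and the sample program is invented for the demonstration:

```python
import ast, json

source = """
def helper():
    pass

def main():
    helper()
    print("done")
"""

tree = ast.parse(source)
nodes, edges = set(), set()
for fn in ast.walk(tree):
    if isinstance(fn, ast.FunctionDef):
        nodes.add(fn.name)
        for call in ast.walk(fn):
            # Record caller -> callee edges for direct name calls only.
            if isinstance(call, ast.Call) and isinstance(call.func, ast.Name):
                edges.add((fn.name, call.func.id))

graph = {
    "nodes": sorted(nodes | {callee for _, callee in edges}),
    "edges": [{"from": caller, "to": callee} for caller, callee in sorted(edges)],
}
print(json.dumps(graph, indent=2))
```

The same `{"nodes": [...], "edges": [...]}` shape is what the prompt expects the model to produce directly for the input program.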
You are a control flow graph analyzer for [LANG]. I will give you a [LANG] program and you tell me its control flow graph. The output format is a JSON file including nodes and edges.
The input file is
```
[INPUT_CODE]
```
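A plausible nodes-and-edges encoding of the control flow graph for a small branch, together with a sanity check that every node is reachable from the entry, might look like this; the node labels are illustrative:

```python
import json

# Hypothetical CFG for: if (x > 0) y = 1; else y = 2; return y;
cfg = {
    "nodes": ["entry", "x > 0", "y = 1", "y = 2", "return y", "exit"],
    "edges": [
        {"from": "entry", "to": "x > 0"},
        {"from": "x > 0", "to": "y = 1", "label": "true"},
        {"from": "x > 0", "to": "y = 2", "label": "false"},
        {"from": "y = 1", "to": "return y"},
        {"from": "y = 2", "to": "return y"},
        {"from": "return y", "to": "exit"},
    ],
}

# Sanity check: every node should be reachable from the entry node.
succ = {}
for e in cfg["edges"]:
    succ.setdefault(e["from"], []).append(e["to"])
seen, stack = set(), ["entry"]
while stack:
    n = stack.pop()
    if n not in seen:
        seen.add(n)
        stack.extend(succ.get(n, []))
assert seen == set(cfg["nodes"])
print(json.dumps(cfg))
```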
You are a helpful code program analysis tool for Smart Contract. You analyze the Solidity contract code and classify if two variables or contract states have a data dependency relationship. The labels you use are 'yes', 'no' and 'unknown'. 'yes' means they are data dependent. 'no' means they are not data dependent. Otherwise, they are labelled 'unknown'. You first give the label and then explain the reason.
The code is
```
[INPUT_CODE]
```
You first give the label and then explain the reason. Please answer the following question: is the variable [VAR_NAME] in the function [FUNCTION_NAME] data dependent on the variable [VAR_NAME] in the function [FUNCTION_NAME]?
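The dependency question above can be illustrated on a toy model: a flat map from each defined variable to the variables it reads (standing in for real Solidity assignments, with invented names), where data dependence is computed as reachability through definitions:

```python
# Toy stand-in for contract assignments: defined variable -> variables read.
assignments = {
    "fee":    {"amount", "rate"},    # fee = amount * rate
    "payout": {"balance", "fee"},    # payout = balance - fee
    "log":    {"payout"},            # log = payout
}

def depends_on(var, target, defs):
    """True if `var` transitively reads `target` through the definitions."""
    seen, stack = set(), [var]
    while stack:
        v = stack.pop()
        if v == target:
            return True
        if v not in seen:
            seen.add(v)
            stack.extend(defs.get(v, ()))
    return False

print(depends_on("log", "amount", assignments))   # via payout -> fee
print(depends_on("fee", "balance", assignments))
```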
You are a helpful code program analysis tool for Smart Contract. You analyze the Solidity contract code and classify if the variable or contract state is controlled by the user. The labels you use are 'yes', 'no' and 'unknown'. 'yes' means it is controlled by the user. 'no' means it is not controlled by the user. Otherwise, it is labelled 'unknown'. You first give the label and then explain the reason.
The code is
```
[INPUT_CODE]
```
You first give the label and then explain the reason. Please answer the following question: is the variable [VAR_NAME] in the function [FUNCTION_NAME] controlled by the user?
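The user-control question can similarly be sketched as a toy taint propagation: function inputs are treated as user-controlled sources and taint flows through simple assignments, mirroring the yes/no/unknown labels of the prompt. The variable names are illustrative, not from any real contract:

```python
# User-supplied inputs act as taint sources.
params = {"msg_value", "recipient"}
# (defined variable, variables read) pairs standing in for contract code.
assignments = [
    ("amount", {"msg_value"}),        # amount = msg.value
    ("owner", {"contract_owner"}),    # owner = contractOwner
    ("final", {"amount", "bonus"}),   # final = amount + bonus
]

def label(var):
    """'yes' if user-controlled, 'no' if known and clean, else 'unknown'."""
    tainted = set(params)
    changed = True
    while changed:                    # propagate taint to a fixpoint
        changed = False
        for lhs, rhs in assignments:
            if lhs not in tainted and rhs & tainted:
                tainted.add(lhs)
                changed = True
    known = params | {lhs for lhs, _ in assignments}
    known |= set().union(*(rhs for _, rhs in assignments))
    if var in tainted:
        return "yes"
    return "no" if var in known else "unknown"

print(label("final"))    # reaches msg_value through amount
print(label("owner"))
print(label("mystery"))
```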
You are a pointer analysis tool for C programs. I will provide a C file to you and you perform pointer analysis on it. You analyze which variables the pointers point to in the provided code. The code is
```
[INPUT_CODE]
```
Please provide your answer in JSON format that includes the list of the variable names each pointer points to:
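As a sketch of what such an analysis computes, a minimal flow-insensitive points-to fixpoint over three C-style statements (`p = &a; q = &b; p = q;`) can be written as follows; the statements and the JSON shape are illustrative:

```python
import json

# C-style pointer statements as (lhs, rhs) pairs.
statements = [
    ("p", "&a"),   # p = &a
    ("q", "&b"),   # q = &b
    ("p", "q"),    # p = q  (p may now also point wherever q points)
]

points_to = {}
changed = True
while changed:                         # iterate to a fixpoint
    changed = False
    for lhs, rhs in statements:
        # &x contributes {x}; a plain pointer contributes its current set.
        targets = {rhs[1:]} if rhs.startswith("&") else points_to.get(rhs, set())
        before = points_to.setdefault(lhs, set())
        if not targets <= before:
            before |= targets
            changed = True

result = {ptr: sorted(tgts) for ptr, tgts in points_to.items()}
print(json.dumps(result))
```

The output maps each pointer to the set of variables it may point to, which is the JSON structure the prompt requests.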
Please analyze the following two provided code files in C or Java. Identify if they are semantically equal. 'Semantically equal' means the two codes have the same meaning, i.e., they produce the same output given the same input. Here are three semantically equal examples:
The first example pair is
``` Code 1
double f(double M, double x) {
x = (M + x) / 2;
return x;
}
```
``` Mutant Code 1
double f(double M, double x) {
x = (M + x++) / 2;
return x;
}
```
Yes. The two codes are semantically equal because `M + x++` first evaluates `M + x` and then performs `x++`. Therefore, `(M + x) / 2` is the same as `(M + x++) / 2`.
Please identify if the two following codes are semantically equal. Please only answer `yes` or `no`. `yes` means they are semantically equal. `no` means they are not.
Input :
```Code
[INPUT_CODE]
```
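The semantic-equality question can be approximated by differential testing: running both candidates on sampled inputs and comparing outputs. Agreement on samples only suggests, never proves, equality; the functions below are illustrative stand-ins for the code pair:

```python
import random

def f1(m, x):
    return (m + x) / 2

def f2(m, x):
    # Reordered operands; floating-point addition is commutative,
    # so this is semantically equal to f1.
    return (x + m) / 2

def likely_equal(f, g, trials=1000):
    """Compare f and g on random inputs; True if they always agree."""
    rng = random.Random(0)           # fixed seed for reproducibility
    return all(
        f(a, b) == g(a, b)
        for a, b in ((rng.uniform(-1e6, 1e6), rng.uniform(-1e6, 1e6))
                     for _ in range(trials)))

print(likely_equal(f1, f2))
```

A single disagreeing sample is enough to answer `no`, while uniform agreement only raises confidence in `yes`.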
Please analyze the following provided test code in Java. Identify the reason why it is a flaky test. 'Flaky test' means a test that sometimes passes and sometimes fails. There are 13 reasons for flaky tests.
Here are the definitions of the 13 flaky test reasons:
Reason 1, async wait. We classify a commit into the Async Wait category when the test execution makes an asynchronous call and does not properly wait for the result of the call to become available before using it.
Reason 2, test order dependency. We classify a commit into this category when the test outcome depends on the order in which the tests are run.
Reason 3, time. Relying on the system time introduces non-deterministic failures, e.g., a test may fail when the midnight changes in the UTC time zone. Some tests also fail due to the precision by which time is reported as it can vary from one platform to another.
Reason 4, IO. I/O operations (in addition to those for networks) may also cause flakiness.
Reason 5, concurrency. We classify a commit in this category when the test non-determinism is due to different threads interacting in a non-desirable manner (but not due to asynchronous calls from the Async Wait category), e.g., due to data races, atomicity violations, or deadlocks.
Reason 6, network. Tests whose execution depends on the network can be flaky because the network is a resource that is hard to control. In such cases, a test failure does not necessarily mean that the CUT itself is buggy, but rather that the developer did not account for network uncertainties.
Reason 7, resource leak. A resource leak occurs whenever the application does not properly manage (acquire or release) one or more of its resources, e.g., memory allocations or database connections, leading to intermittent test failures.
Reason 8, randomness. The use of random numbers can also make some tests flaky. In the cases that we analyzed, tests are flaky because they use a random number generator without accounting for all the possible values that may be generated.
Reason 9, unordered collections. In general, when iterating over unordered collections (e.g., sets), the code should not assume that the elements are returned in a particular order. If it does assume, the test outcome can become non-deterministic as different executions may have a different order.
Reason 10, test case timeout. Flaky tests experiencing non-deterministic timeouts related to a single test belong to this category. It is comparable to the Test Suite Timeout (reported later), with the difference that the size of a single test grew over time without adjusting the max runtime value.
Reason 11, too restrictive range. In this category, some of the valid output values are outside the assertion range considered at test design time, so the test fails when they show up. In other words, such test cases have a range of predefined values for which the test is allowed to pass; if this range is defined too restrictively, tests may start failing in a not deterministic way.
Reason 12, floating point operations. Dealing with floating point operations is known to lead to tricky non-deterministic cases, especially in the high-performance computing community [6]. Even simple operations like calculating the average of an array require thorough coding to avoid overflows, underflows, problems with non-associative addition, etc. Such problems can also be the root cause of flaky tests.
Reason 13, platform dependency. In many ways, it is possible for an execution to differ because of platform dependencies, for example because the size of an object (which is accessible via sys.getsizeof) differs between 32-bit and 64-bit systems.
Please identify the reason why the following code is flaky. Your answers are from `async wait`, `test order dependency`, `time`, `IO`, `concurrency`, `network`, `resource leak`, `randomness`, `unordered collections`, `test case timeout`, `too restrictive range`, `floating point operations` and `platform dependency`.
Input :
```Code
[INPUT_CODE]
```
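As a concrete illustration of Reason 12, floating-point addition is not associative, so a test that asserts exact equality on a sum can pass or fail depending on accumulation order:

```python
import math

# IEEE-754 double addition is not associative: the two groupings below
# produce different bit patterns for the same mathematical sum.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
print(left_to_right == right_to_left)            # exact comparison is order-sensitive

# A robust test compares with a tolerance instead of exact equality.
print(math.isclose(left_to_right, right_to_left))
```

A test asserting exact equality here fails or passes depending on how the sum happens to be accumulated, which is exactly the non-determinism the category describes.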
We take the AST analyser as an example. In customizing the "AST analyser" GPT model, we used the "Task-specific Prompt" strategy, which means we designed additional prompts step by step. We first focused it on creating a specialized tool for analyzing abstract syntax trees (ASTs) of programming languages such as C, Java, Python, and Solidity. Then, we asked it to handle ambiguous queries by providing the best possible responses based on the available information. In addition, to obtain accurate answers, a prompt was added to limit the output to JSON format, making the data structured and easy to parse. Each AST node was to include the attributes type, value, and children, enabling the model to generate detailed and structured ASTs for code snippets or files. This strategy reflects a focus on technical specificity while maintaining ease of use and interaction for users.