All our artifacts (datasets, models, and OOD detectors) are released under specific licenses and archived on GitHub. Concretely, the raw source code files in Java250-S and Python800-S are taken from CodeNet and distributed under the Apache License 2.0. All the datasets with distribution shifts that we created from these two collections are released under a CC0 license, as are the Python75 dataset, the models, and the OOD detectors.
All experiments involving CNN (Sequence) and MLP (Bag) were conducted on a 3.00 GHz Intel Xeon Gold 5217 CPU with two RTX 8000 GPUs; the implementation is based on the TensorFlow 2.5.1 framework. All experiments based on pre-trained language models were performed on a high-performance computing cluster with six Dell C4140 GPU nodes, each equipped with four NVIDIA V100 SXM2 32GB GPUs; these experiments are implemented with the PyTorch 1.6.0 framework.
The experiments for the vulnerability detection extension (described below) were conducted on a Linux server equipped with Intel Xeon Silver 4114 CPUs (20 cores, 40 threads), an NVIDIA Tesla V100 GPU (16 GB VRAM), and 187 GiB of RAM, running Ubuntu 20.04.4 LTS. The software stack includes Python 3.11.7, PyTorch 2.3.0, and CUDA 12.2.
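For reproducibility, a minimal sketch (not part of the released artifacts) for checking that a local environment matches the PyTorch/CUDA versions listed above could look as follows:

```python
# Minimal environment sanity check; illustrative only, not part of the released artifacts.
import sys

import torch

print(f"Python : {sys.version.split()[0]}")   # expected: 3.11.7
print(f"PyTorch: {torch.__version__}")        # expected: 2.3.0
print(f"CUDA   : {torch.version.cuda}")       # expected: 12.2
print(f"GPU    : {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'none'}")
```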
The online resource includes our crawled raw data from AtCoder, the original Java250 and Python800 datasets downloaded from CodeNet, and the corresponding data descriptions. We also release the datasets (source code files and tokens) with distribution shifts for Python75, Java250, and Python800. The token representations are generated with the tokenizer tool provided by CodeNet.
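For illustration, the sketch below shows how a tokenized source file could be turned into the two input representations used by the classifiers: a fixed-length token-ID sequence for CNN (Sequence) and a token-count vector for MLP (Bag). The helper names, vocabulary handling, and toy data are assumptions for illustration, not the exact format produced by the CodeNet tokenizer.

```python
# Illustrative sketch: building sequence and bag-of-tokens inputs from a token list.
# Vocabulary handling and helper names are assumptions, not the CodeNet tool's format.
from collections import Counter

import numpy as np


def to_sequence(tokens, vocab, max_len=512):
    """Map tokens to IDs and pad/truncate to a fixed length (CNN (Sequence) style)."""
    ids = [vocab.get(t, 0) for t in tokens][:max_len]  # 0 = out-of-vocabulary / padding
    return np.array(ids + [0] * (max_len - len(ids)), dtype=np.int32)


def to_bag(tokens, vocab):
    """Count token occurrences over the vocabulary (MLP (Bag) style)."""
    counts = Counter(vocab.get(t, 0) for t in tokens)
    vec = np.zeros(len(vocab) + 1, dtype=np.float32)
    for idx, c in counts.items():
        vec[idx] = c
    return vec


# Example usage with a toy vocabulary.
vocab = {"def": 1, "return": 2, "(": 3, ")": 4, ":": 5}
tokens = ["def", "f", "(", ")", ":", "return", "x"]
print(to_sequence(tokens, vocab, max_len=10))
print(to_bag(tokens, vocab))
```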
CNN (Sequence) and MLP (Bag) are trained using the CodeNet implementation with the same parameters (e.g., random seed, dropout rate). The training set and the ID test set are used in the training procedure for training and validation, respectively, and the number of epochs is set to 100. All the trained CNNs are released on GitHub. The pre-trained models are fine-tuned with their original implementations, keeping the same parameters.
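As a hedged sketch of this training setup (the actual architecture and hyper-parameters follow the CodeNet implementation and are not reproduced here), a Keras training loop with 100 epochs and the ID test set as validation data could look as follows; `build_cnn`, the layer sizes, and the data variables are placeholders:

```python
# Illustrative Keras training sketch; the architecture and data loading are placeholders,
# not the released CodeNet-based implementation.
import tensorflow as tf


def build_cnn(vocab_size, num_classes):
    """A small token-sequence CNN; the released models follow the CodeNet implementation."""
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, 128),
        tf.keras.layers.Conv1D(128, 5, activation="relu"),
        tf.keras.layers.GlobalMaxPooling1D(),
        tf.keras.layers.Dropout(0.2),  # dropout rate is a placeholder value
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])


tf.random.set_seed(0)  # fixed random seed (placeholder value)
model = build_cnn(vocab_size=10000, num_classes=250)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# x_train / y_train: training split; x_id_test / y_id_test: ID test split used for validation.
# model.fit(x_train, y_train,
#           validation_data=(x_id_test, y_id_test),
#           epochs=100, batch_size=128)
```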
We modify the original implementations of the ODIN and Mahalanobis detectors to fit the TensorFlow framework. The OE detectors share the same DNN architectures as the classification models described above, and we adopt the loss function from the original OE implementation. The implementations of MSP, ODIN, Mahalanobis, and all the OE detectors are released on GitHub.
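For reference, MSP scores an input by the maximum softmax probability of the classifier, and inputs whose score falls below a threshold are flagged as OOD. A minimal TensorFlow sketch of this scoring rule (not our released implementation) is:

```python
# Minimal MSP (maximum softmax probability) scoring sketch; illustrative only.
import numpy as np
import tensorflow as tf


def msp_scores(model, x, batch_size=128):
    """Return the maximum softmax probability per input; lower scores suggest OOD."""
    probs = model.predict(x, batch_size=batch_size)  # assumes a softmax output layer
    return np.max(probs, axis=1)


def detect_ood(scores, threshold):
    """Flag inputs whose MSP score is below the chosen threshold as OOD."""
    return scores < threshold
```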
In addition, we extend CodeS+ with PrimeVul, a high-quality vulnerability detection dataset designed to provide a more realistic evaluation benchmark for Code Language Models (Code LMs), thereby adding vulnerability detection tasks to CodeS+.
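As an illustrative sketch of this task formulation (the checkpoint name, the label encoding, and the example input are assumptions, not the exact PrimeVul/CodeS+ setup), a binary vulnerability classifier can be obtained by attaching a sequence-classification head to a pre-trained Code LM with Hugging Face Transformers:

```python
# Illustrative sketch of binary vulnerability classification with a pre-trained Code LM.
# The checkpoint and the example function are placeholders; the classification head is
# randomly initialized here and would be fine-tuned on the vulnerability dataset in practice.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "microsoft/codebert-base"  # placeholder Code LM checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
model.eval()


def classify(func_source: str) -> int:
    """Return 1 if the function is predicted vulnerable, else 0 (label encoding assumed)."""
    inputs = tokenizer(func_source, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(torch.argmax(logits, dim=-1).item())


print(classify("int copy(char *dst, char *src) { strcpy(dst, src); return 0; }"))
```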