A demo of the feature database can be downloaded here. It contains only 100,000 features; the remaining features will be released after the paper is accepted.
The data schema of a single feature is described below. Each feature has the following fields:
_id: the index of the feature, used in MongoDB
cc: the value of Cyclomatic Complexity
count: the number of projects with this feature
func_hash: the MD5 Hash of the feature
hv: the value of Halstead Volume
languages: the programming language in which the function is written
line_of_code: the number of lines of code in the function
location: an array in which each element is a key-value pair. The key is the name of a project that contains this function, and the value is an array of the commit hashes of the feature in that project.
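
To make the nesting of the location field concrete, here is a minimal Python sketch of what one feature might look like once loaded as a dictionary. Every value in it (the identifier, the metric values, the project names, and the commit hashes) is a made-up placeholder, and the helper commits_by_project is a hypothetical name introduced only for this illustration; it is not part of the released database or its tooling.

```python
# Hypothetical feature document. All values are placeholders for illustration
# and are NOT taken from the released database.
sample_feature = {
    "_id": "feature-0001",      # placeholder; MongoDB typically stores an ObjectId here
    "cc": 3,                    # Cyclomatic Complexity
    "count": 2,                 # number of projects with this feature
    "func_hash": "0" * 32,      # placeholder with the length of an MD5 hex digest
    "hv": 120.5,                # Halstead Volume
    "languages": "C",
    "line_of_code": 14,
    "location": [
        {"example/project-a": ["commit-hash-1", "commit-hash-2"]},
        {"example/project-b": ["commit-hash-3"]},
    ],
}


def commits_by_project(feature):
    """Flatten the `location` array into a {project name: [commit hashes]} dict."""
    result = {}
    for entry in feature["location"]:
        # Each entry is a key-value pair: project name -> list of commit hashes.
        for project, commits in entry.items():
            result.setdefault(project, []).extend(commits)
    return result


projects = commits_by_project(sample_feature)
for name, commits in projects.items():
    print(name, commits)
```

In this sample, count was chosen to equal the number of entries in location. Whether that relationship holds for every record in the released data is an assumption that should be checked against the actual database.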