Given the sensitivity of the illicit listings, we provide here only the ground-truth dataset used for the IDLLSpread algorithm. The dataset is divided into a training set and a test set of 4,971 and 2,144 entries, respectively (7,115 in total). All data has been labeled and checked manually. The data has not gone through filtering, i.e., the exclusion of legal listings that have been approved by governments or related commissions to sell certain drugs that have been decriminalized or legalized in some states. [Download]
Name: Business name stated in the listing;
Label: Annotated label (0: benign, 1: malicious);
Address: Street address stated in the listing;
City: City stated in the listing;
State: State stated in the listing;
Zipcode: Zip code stated in the listing;
URL: Business storefront URL stated in the listing;
Telephone: Business storefront telephone number stated in the listing;
Category: Category stated in the listing;
Description: Description in the listing;
Payment: Payment method stated in the listing;
Broker: Broker where the listing is found.
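The fields above can be read and sanity-checked with a short script. The following is only a minimal sketch: it assumes the download is a CSV file whose header row uses exactly the field names listed above, which is not confirmed here, and the helper names `load_listings` and `label_counts` are hypothetical.

```python
import csv
from collections import Counter
from typing import Dict, Iterable, List, TextIO

# Field names as listed above; the on-disk header spelling is an assumption.
FIELDS = ["Name", "Label", "Address", "City", "State", "Zipcode", "URL",
          "Telephone", "Category", "Description", "Payment", "Broker"]


def load_listings(handle: TextIO) -> List[Dict[str, str]]:
    """Read listings from an open CSV file and check that every field is present."""
    reader = csv.DictReader(handle)
    missing = [f for f in FIELDS if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def label_counts(rows: Iterable[Dict[str, str]]) -> Counter:
    """Count listings per label (0: benign, 1: malicious)."""
    return Counter(int(row["Label"]) for row in rows)
```

To use it, open the downloaded file with `open(path, newline="", encoding="utf-8")`, pass the handle to `load_listings`, and compare `len(rows)` against the split sizes stated above (4,971 for training, 2,144 for test). If the file uses a different delimiter or header spelling, adjust `FIELDS` and the `csv.DictReader` arguments accordingly.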
The promotional terms are published here [Download].