WEBSPAM-UK2007 Dataset

This dataset is made available for public research by Carlos Castillo at http://chato.cl/webspam/datasets/uk2007/. It covers the UK domain and was collected by laboratory volunteers at the University of Milan. The sites in the collection were labeled by human judges as spam or non-spam.

We use a portion of the dataset, around 4000 Web sites: 2000 spam sites and 2000 non-spam sites. Each site is described by 5 content-based features, which are described in our paper "Using Machine Learning Algorithms to Detect Content-based Arabic Web Spam".
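
The text above does not say how the 2000 spam and 2000 non-spam sites were chosen. As an illustration only, a balanced subset of this kind could be drawn at random as in the sketch below. The function name `balanced_sample`, the `(site, label)` pair format, and the label strings `"spam"` and `"nonspam"` are assumptions made for this example; they are not the dataset's file layout or the procedure used in the paper.

```python
import random


def balanced_sample(labeled_sites, per_class=2000, seed=0):
    """Return per_class spam and per_class non-spam sites, shuffled.

    labeled_sites: iterable of (site, label) pairs, where label is
    "spam" or "nonspam" (a hypothetical in-memory format).
    """
    labeled_sites = list(labeled_sites)  # allow generators as input
    spam = [site for site, label in labeled_sites if label == "spam"]
    nonspam = [site for site, label in labeled_sites if label == "nonspam"]
    if len(spam) < per_class or len(nonspam) < per_class:
        raise ValueError("not enough sites in one of the classes")

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    chosen = [(site, "spam") for site in rng.sample(spam, per_class)]
    chosen += [(site, "nonspam") for site in rng.sample(nonspam, per_class)]
    rng.shuffle(chosen)
    return chosen
```

Calling `balanced_sample(pairs)` with the default `per_class=2000` would yield 4000 labeled sites, half from each class. Fixing the seed keeps the subset identical across runs.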