Web Spam 2011 Datasets
The lack of a benchmark collection of Arabic Web pages remains one of the main obstacles to research on Arabic Web spam filtering.
The following three datasets of Web spam pages were used in:
Wahsheh H., Abu Doush I., Al-Kabi M., Alsmadi I. and Al-Shawakfa E. (2012), Using Machine Learning Algorithms to Detect Content-based Arabic Web Spam, International Journal of Information Assurance and Security (JIAS), 7 (1): 14-24.
Extended Arabic Web Spam 2011 Dataset
Please cite our paper if you use the Web Spam 2011 Datasets in your publication.