Online Services
iArabicWeb16
iArabicWeb16 is a freely available Web-based tool that makes the ArabicWeb16 dataset more accessible to the research community through both a Web interface and a programming API.
- Web Search interface: allows registered users to search the collection interactively, much as they would with a commercial search engine.
- Programming API: a Java client API that enables developers and researchers to run searches with different configurations and to retrieve documents directly by ID from within their own programs. The API provides three main functions:
  - search: issues a query against the collection with a specified configuration (e.g., ranking function, number of returned results). It returns a JSON-formatted string that can be parsed into an array of results using the parseResults function (not shown in the figure).
  - retrieveSingleDoc: returns the document with the given ID.
  - retrieveBatchOfDocs: writes the content of the requested documents, in a compressed format, to the destination file specified by the user.
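To make the shape of these three functions concrete, here is a minimal Java sketch. It is not the real iArabicWeb16 client: the interface name `CollectionClient`, the parameter lists, the in-memory stand-in `InMemoryClient`, the `"BM25"` configuration value, and the use of gzip as the compressed format are all assumptions made for illustration. The actual signatures are described in the paper cited below and in the client distribution.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.GZIPOutputStream;

// Small demo that exercises the hypothetical client end to end.
class ClientDemo {
    public static void main(String[] args) throws IOException {
        CollectionClient client = new InMemoryClient(Map.of(
                "doc-1", "arabic web search",
                "doc-2", "information retrieval"));

        // "BM25" is only an illustrative configuration value.
        System.out.println(client.search("search", "BM25", 10));
        System.out.println(client.retrieveSingleDoc("doc-2"));

        Path destination = Files.createTempFile("docs", ".gz");
        client.retrieveBatchOfDocs(List.of("doc-1", "doc-2"), destination);
        System.out.println("Batch written to " + destination);
    }
}

// Hypothetical interface mirroring the three functions described above.
interface CollectionClient {
    // Returns a JSON string describing the matching documents.
    String search(String query, String rankingFunction, int numResults);

    // Returns the content of the document with the given ID, or null if unknown.
    String retrieveSingleDoc(String docId);

    // Writes the requested documents to the destination file in compressed form.
    void retrieveBatchOfDocs(List<String> docIds, Path destination) throws IOException;
}

// In-memory stand-in for the remote service, so the sketch runs offline.
final class InMemoryClient implements CollectionClient {
    private final Map<String, String> docs = new LinkedHashMap<>();

    InMemoryClient(Map<String, String> initialDocs) {
        docs.putAll(initialDocs);
    }

    @Override
    public String search(String query, String rankingFunction, int numResults) {
        // Naive substring match; rankingFunction is accepted but ignored here.
        // IDs are assumed to need no JSON escaping.
        StringBuilder json = new StringBuilder("[");
        int count = 0;
        for (Map.Entry<String, String> entry : docs.entrySet()) {
            if (count == numResults) {
                break;
            }
            if (entry.getValue().contains(query)) {
                if (count > 0) {
                    json.append(",");
                }
                json.append("{\"id\":\"").append(entry.getKey()).append("\"}");
                count++;
            }
        }
        return json.append("]").toString();
    }

    @Override
    public String retrieveSingleDoc(String docId) {
        return docs.get(docId);
    }

    @Override
    public void retrieveBatchOfDocs(List<String> docIds, Path destination) throws IOException {
        // gzip is an assumption; the text only says "a compressed format".
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(destination))) {
            for (String id : docIds) {
                String content = docs.get(id);
                if (content != null) {
                    out.write((id + "\t" + content + "\n").getBytes(StandardCharsets.UTF_8));
                }
            }
        }
    }
}
```

Keeping the three operations behind one interface means a program written against it could swap the in-memory stand-in for a real remote client without changing the calling code.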
For details, check our LREC/OSACT3 paper:
Khaled Yasser, Reem Suwaileh, Abdelrahman Shouman, Yassmine Barkallah, Mucahid Kutlu, and Tamer Elsayed. iArabicWeb16: Making a Large Web Collection More Accessible for Research. Proceedings of the 3rd Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT 2018) at LREC 2018, pp. 75-79, Miyazaki, Japan, May 2018.
Get access to iArabicWeb16 here.