ClueWeb12 Contextual Suggestion Subcollection

ClueWeb12 CS (Contextual Suggestion Subcollection): contains 30,144 documents. This subcollection was created by issuing a variety of queries targeted to the Contextual Suggestion track on a commercial search engine; returned results that were part of ClueWeb12 were included in the subcollection. The file below is a JSON file that lists the documents in the subcollection. It pairs each context name (city-state) with an array of document objects. Each document object contains a URL and the corresponding ClueWeb12 DocID.
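
As a rough illustration of working with this structure, the sketch below counts documents per context and builds a DocID-to-URL lookup. It assumes the JSON parses to an object whose keys are context names and whose values are arrays of document objects. The field names "url" and "docid", the context name format, and the sample entries are all hypothetical placeholders, not taken from the actual file, so check them against the real data before use.

```python
import json

# Hypothetical field names; the real file may use different keys.
URL_KEY = "url"
DOCID_KEY = "docid"


def docs_per_context(subcollection):
    """Return a dict mapping each context name to its number of documents."""
    return {context: len(docs) for context, docs in subcollection.items()}


def docid_to_url(subcollection, url_key=URL_KEY, docid_key=DOCID_KEY):
    """Build a flat DocID -> URL lookup across all contexts."""
    lookup = {}
    for docs in subcollection.values():
        for doc in docs:
            lookup[doc[docid_key]] = doc[url_key]
    return lookup


# Made-up sample in the assumed shape (not real subcollection entries).
SAMPLE_JSON = """
{
  "Boston-MA": [
    {"url": "http://example.com/a", "docid": "clueweb12-0000tw-00-00001"},
    {"url": "http://example.com/b", "docid": "clueweb12-0000tw-00-00002"}
  ],
  "Austin-TX": [
    {"url": "http://example.com/c", "docid": "clueweb12-0000tw-00-00003"}
  ]
}
"""

sample = json.loads(SAMPLE_JSON)
counts = docs_per_context(sample)
lookup = docid_to_url(sample)
```

For the real file, replace the sample with `json.load` on the opened file. If the top level turns out to be a list of pairs rather than an object, convert it to a dict first.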

A set of WARC files (one per context) containing the documents in this subcollection is also available here: http://lemurproject.org/clueweb12/related-data.php.