TREC 2016 Contextual Suggestion Guidelines

The Contextual Suggestion Track investigates search techniques for complex information needs that are highly dependent on context and user interests.

Background

According to a report from The Second Strategic Workshop on Information Retrieval in Lorne (published in the SIGIR Forum, June 2012): “Future information retrieval systems must anticipate user needs and respond with information appropriate to the current context without the user having to enter an explicit query... In a mobile context such a system might take the form of an app that recommends interesting places and activities based on the user’s location, personal preferences, past history, and environmental factors such as weather and time...  In contrast to many traditional recommender systems, these systems must be open domain, ideally able to make suggestions and synthesize information from multiple sources...”

For example, imagine a group of information retrieval researchers with a November evening to spend in beautiful Gaithersburg, Maryland.  A contextual suggestion system might recommend a beer at the Dogfish Head Alehouse (www.dogfishalehouse.com), dinner at the Flaming Pit (www.flamingpitrestaurant.com), or even a trip into Washington on the metro to see the National Mall (www.nps.gov/nacc).  The goal of the Contextual Suggestion track is to provide a venue for the evaluation of such systems.

Timeline

Task Summary

For this task participants are asked to develop a system that is able to make suggestions for a particular person (based upon their profile) with a particular context. Details of what the profiles and contexts contain are given below. Each profile corresponds to a single user, and indicates that user’s preference with respect to each example suggestion.  For example, one suggestion might be to have a beer at the Dogfish Head Alehouse, and the profile might include a negative preference with respect to this suggestion.  Each training suggestion includes a title, description, and an associated URL.  Each context corresponds to a particular geographical location (a city).  For example, the context might be Gaithersburg, Maryland.

For each profile/context pairing, participants should return a ranked list of up to 50 suggestions. Each suggestion should be appropriate to the profile (based on the user’s preferences) and the context (according to the location). Profiles correspond to the stated preferences of real individuals, who will judge the proposed suggestions. Users are primarily recruited through crowdsourcing sites.

Participants will return results either by participating in the Phase 1 experiment or by submitting during the Phase 2 / batch experiment.

Contexts

Each context will consist of a mandatory city name, indicating the city in which the trip will occur, plus several pieces of optional data about the trip.

Profiles

Profiles consist of a list of attractions the user has previously rated. For each attraction the profile will include:

A rating:

Additionally, the user may annotate the attraction with tags that indicate why the user likes that particular attraction. For example, the user may add the tag "Shopping" to an attraction. You can see the full set of possible tags here: https://github.com/akdh/cst-tools/blob/master/tags.csv

The profile may optionally also include the user's age and gender.

Collection

The collection consists of a set of attractions. For each attraction there is:

The collection is available for download here: http://145.100.59.205:8095/TREC2016_CS_Collection.zip

Duplicate documents: Note that there are duplicate documents in the collection. In these cases you may return any of the duplicate attraction IDs, as long as the ID is valid for your request (i.e., it is in the correct context).

Additional data

A crawl of (almost) all the pages in the collection is provided.  In order to have access to the data designated as the TREC CS Web Corpus, organizations (who have not already done so) must first fill in a data release Organizational Application Form. The signed form must be scanned and sent by email to data@list.uva.nl. On receipt of the form, you will be sent information on how to download the data.

Access to the data by an individual person is to be controlled by that person's organization. The organization may only grant access to people working under its control, i.e. its own members, consultants to the organization, or individuals providing service to the organization. All Individual Application Forms must be signed by a person authorized by your organization for such signatures. The organization must keep the individual's form on file for every person involved at its site.

Suggestions

Suggestions will consist of a ranked list of up to 50 attractions you think the user is interested in based upon the provided context and user profile. You will provide the attraction IDs from the collection described previously.

Phase 1 Experiment

The Phase 1 Experiment is a collection-based task similar to the TREC 2015 Contextual Suggestion Track's Live Experiment.  The main change is that we do not require you to set up and register a live server; instead, we will distribute a set of profiles and contexts and collect responses in a batch fashion, as was done in the track until 2014.

Phase 1 Suggestions

The Phase 1 suggestions will consist of a ranked list of up to 50 attractions you think the user is interested in based upon the provided context and user profile. You will provide the attraction IDs from the TREC CS Web Corpus released as additional data.

The Phase 1 response file (Phase 1 submission) must contain a list of JSON objects separated by newlines. Each response must follow the response.json format. Specifically, each line of the Phase 1 response file has an id, which is the request ID available in Phase1_requests.json, and an ordered sequence of suggestion IDs. You should also include your team and run IDs in the JSON responses. To ensure that you have a valid submission, you can use the following command:

python Phase1_validate.py Phase1_requests.json Phase1_submission
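As a rough illustration of the newline-delimited format described above, the sketch below builds one response line. The request ID, attraction IDs, and team/run values are made up, and the exact field names are assumptions; the response.json format distributed with the track is authoritative.

```python
import json

# Hypothetical response for one request. Field names ("groupid", "runid",
# "suggestions") are illustrative assumptions, not the official schema.
response = {
    "id": 101,                       # request ID from Phase1_requests.json
    "groupid": "MyTeam",             # your team ID
    "runid": "MyTeam-run1",          # your run ID
    "suggestions": [412, 87, 1593],  # attraction IDs, best first (up to 50)
}

# The submission file holds one such JSON object per line.
line = json.dumps(response)
```

Writing one `json.dumps` result per line (rather than a single JSON array) matches the "JSON objects separated by newlines" requirement and lets the validator process the file line by line.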

Phase 2 Experiment

The Phase 2 experiment is a reranking task similar to the TREC 2015 Contextual Suggestion Track's Batch Experiment.  A file with the set of suggestion requests that were made during the Phase 1 experiment will be made available. You will provide responses in the same format as responses in the Phase 1 experiment. The attractions suggested during the Phase 2 experiment must have been suggested during the Phase 1 experiment.  In addition to the suggestion requests, the attraction IDs returned by participants in the Phase 1 experiment will be provided.
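The subset constraint above (every Phase 2 attraction must have been suggested in Phase 1) can be checked before submission. The following is a minimal sketch, assuming you have loaded two mappings from request ID to attraction ID lists; the variable names and data layout are illustrative, not part of the track's tooling.

```python
def valid_rerank(phase1_ids, phase2_ids):
    """Return True iff every Phase 2 suggestion for every request was
    among the attraction IDs suggested for that request in Phase 1."""
    return all(
        set(ids) <= set(phase1_ids.get(request, []))
        for request, ids in phase2_ids.items()
    )
```

For example, reordering a subset of the Phase 1 IDs for a request passes the check, while introducing an attraction ID never seen in Phase 1 fails it.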

Judging

Suggestions made during the Phase 1 experiment will be judged by crowdsourced users who supply both the user profile and final judgments.

If you want to see how the assessors will view results, you can render suggestion responses using this tool: https://github.com/akdh/cst-tools/tree/master#render-files

Evaluation

Three metrics will be used: NDCG@5 (the primary metric), P@5, and MRR.