TREC 2015 Contextual Suggestion Track Guidelines
The Contextual Suggestion Track investigates search techniques for complex information needs that are highly dependent on context and user interests.
Background
According to a report from the Second Strategic Workshop on Information Retrieval in Lorne (published in the SIGIR Forum, June 2012): “Future information retrieval systems must anticipate user needs and respond with information appropriate to the current context without the user having to enter an explicit query... In a mobile context such a system might take the form of an app that recommends interesting places and activities based on the user’s location, personal preferences, past history, and environmental factors such as weather and time... In contrast to many traditional recommender systems, these systems must be open domain, ideally able to make suggestions and synthesize information from multiple sources...”
For example, imagine a group of information retrieval researchers with a November evening to spend in beautiful Gaithersburg, Maryland. A contextual suggestion system might recommend a beer at the Dogfish Head Alehouse (www.dogfishalehouse.com), dinner at the Flaming Pit (www.flamingpitrestaurant.com), or even a trip into Washington on the metro to see the National Mall (www.nps.gov/nacc). The goal of the Contextual Suggestion track is to provide a venue for the evaluation of such systems.
Timeline
Contexts available: January 12
Pre-TREC task deadline: March 30
Collection released: May 15
Guidelines finalized: May 29
Live experiment testing start: June 8
Live experiment start: June 22
Scoring period start: July 3
Live experiment end: July 17
Batch experiment profiles released: July 20
Batch experiment result deadline: August 23 (extended from August 7)
Results available: September
Task Summary
For this task participants are asked to develop a system that is able to make suggestions for a particular person (based upon their profile) in a particular context. Details of what the profiles and contexts contain are given below. Each profile corresponds to a single user, and indicates that user’s preference with respect to each example suggestion. For example, one suggestion might be to have a beer at the Dogfish Head Alehouse, and the profile might include a negative preference with respect to this suggestion. Each training suggestion includes a title, description, and an associated URL. Each context corresponds to a particular geographical location (a city). For example, the context might be Gaithersburg, Maryland.
For each profile/context pairing, participants should return a ranked list of up to 50 suggestions. Each suggestion should be appropriate to the profile (based on the user’s preferences) and the context (according to the location). Profiles correspond to the stated preferences of real individuals, who will judge the proposed suggestions. Users are primarily recruited through crowdsourcing sites.
Participants will return results either by setting up a server and participating in the live experiment or by submitting during the batch experiment.
Contexts
Each context consists of a mandatory city name, identifying the city in which the trip will occur, plus several optional pieces of data about the trip.
A city the user is located in, which consists of:
An ID
A city - The name of the city
A state - The name of the US state the city is in
A latitude and longitude - These are available for convenience and do not represent the exact user location but are analogous to the city name.
A trip type (optionally), which is one of:
Business
Holiday
Other
A trip duration (optionally), which is one of:
Night out
Day trip
Weekend trip
Longer
The type of group the person is travelling with (optionally), which is one of:
Travelling alone (Alone)
Travelling with a group of friends (Friends)
Travelling with family (Family)
Travelling with another group (Other)
The season the trip will occur in (optionally), which is one of:
Winter
Summer
Autumn
Spring
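Putting the fields above together, a context might look like the following Python dictionary. This is an illustrative sketch only: the concrete values are made up, and the field names (`id`, `city`, `state`, `lat`, `long`, `trip_type`, `duration`, `group`, `season`) are assumptions for demonstration; the authoritative layout is given by the track's JSON request schema.

```python
# Illustrative context record. Field names and values are assumptions for
# demonstration; the real layout is defined by the track's request schema.
example_context = {
    "id": 160,
    "city": "Gaithersburg",
    "state": "Maryland",
    "lat": 39.14,      # analogous to the city name, not an exact location
    "long": -77.20,
    # Optional trip metadata; any of these may be absent.
    "trip_type": "Holiday",   # Business | Holiday | Other
    "duration": "Night out",  # Night out | Day trip | Weekend trip | Longer
    "group": "Friends",       # Alone | Friends | Family | Other
    "season": "Autumn",       # Winter | Summer | Autumn | Spring
}

# The optional enumerations described above:
TRIP_TYPES = {"Business", "Holiday", "Other"}
DURATIONS = {"Night out", "Day trip", "Weekend trip", "Longer"}
GROUPS = {"Alone", "Friends", "Family", "Other"}
SEASONS = {"Winter", "Summer", "Autumn", "Spring"}
```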
Profiles
Profiles consist of a list of attractions the user has previously rated. For each attraction the profile will include:
A rating:
4: Strongly interested
3: Interested
2: Neither interested nor uninterested
1: Uninterested
0: Strongly uninterested
-1: No rating given
Additionally, the user may annotate an attraction with tags that indicate why they like it. For example, the user may add the tag "Shopping" to an attraction. The full set of possible tags is available here: https://github.com/akdh/cst-tools/blob/master/tags.csv
The profile may optionally also include the user's age and gender.
Note that a profile may contain no attractions if the user has never rated anything.
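As a sketch, a profile with one rated attraction might look like the dictionary below. The field names (`preferences`, `attraction_id`, `rating`, `tags`, `age`, `gender`) and all values are assumptions for illustration; consult the track's request schema for the real layout.

```python
# Hypothetical profile; field names are assumptions for illustration only.
example_profile = {
    "id": 700,
    "age": 31,           # optional
    "gender": "female",  # optional
    "preferences": [     # may be empty if the user has never rated anything
        {
            "attraction_id": "TRECCS-00000001-160",  # made-up example ID
            "rating": 4,
            "tags": ["Shopping"],  # optional, from the published tag list
        },
    ],
}

# The rating scale described above:
RATING_LABELS = {
    4: "Strongly interested",
    3: "Interested",
    2: "Neither interested nor uninterested",
    1: "Uninterested",
    0: "Strongly uninterested",
    -1: "No rating given",
}
```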
Collection
The collection consists of a set of attractions. For each attraction there is:
An attraction ID, which contains three parts separated by dashes (-)
The string TRECCS
An eight-digit number
A three-digit number corresponding to that attraction's context ID
A context (city) ID which indicates which city this attraction is in
A URL with more information about the attraction
A title
The collection is available for download here: http://plg.uwaterloo.ca/~adeanhal/trec2015/collection_2015.csv.gz
Duplicate documents: Note that there are duplicate documents in the collection. In these cases you may return any of the duplicate attraction IDs, as long as the one you return is valid for your request (i.e., it is in the right context).
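The three-part ID layout above can be checked and decomposed with a small helper. This is a sketch under the stated format; the concrete example ID in the test is made up.

```python
def parse_attraction_id(attraction_id):
    """Split an ID like 'TRECCS-00000123-001' into its three parts.

    Follows the format described above: the string TRECCS, an
    eight-digit number, and a three-digit context ID, joined by dashes.
    Returns (prefix, number, context_id) with the numbers as ints.
    """
    prefix, number, context_id = attraction_id.split("-")
    if prefix != "TRECCS":
        raise ValueError("attraction IDs must start with TRECCS")
    if len(number) != 8 or not number.isdigit():
        raise ValueError("second part must be an eight-digit number")
    if len(context_id) != 3 or not context_id.isdigit():
        raise ValueError("third part must be a three-digit context ID")
    return prefix, int(number), int(context_id)
```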
Suggestions
Suggestions will consist of a ranked list of up to 50 attractions you think the user is interested in based upon the provided context and user profile. You will provide the attraction IDs from the collection described previously.
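Two mechanical constraints on a response follow from the above: every suggested attraction must belong to the requested context, and the list is capped at 50. A minimal sketch of enforcing both, assuming the ID format described in the Collection section (context ID as a zero-padded three-digit suffix):

```python
def make_suggestion_list(candidate_ids, context_id, limit=50):
    """Keep only attractions from the requested context, capped at `limit`.

    Assumes each attraction ID ends in a zero-padded three-digit context
    ID, per the collection format; ranking of `candidate_ids` is assumed
    to have been done already by the participant's system.
    """
    suffix = "-%03d" % context_id
    in_context = [a for a in candidate_ids if a.endswith(suffix)]
    return in_context[:limit]
```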
(Approximately) Live Experiment
For the live experiment you will have to set up and register a server that can respond to suggestion requests. Each suggestion request will include a profile and a context, and your server must return a list of attraction IDs from the collection.
An example of a suggestion request and response, how to send requests to your server, the JSON schema used for responses, and the validation script used for responses is available here: https://gist.github.com/akdh/564cbb5f9c6c92c99d8a.
An example server built in Python and the JSON schema used for requests is available here: https://gist.github.com/akdh/65c6391eb85b553fcf12.
Once you have set up your service and are ready to start receiving requests, you can register it here: https://docs.google.com/forms/d/1a1kxj_EPvyOmB3Mx7YK3wOz1UQ_rUx8sbhuMTWGxR00/viewform?usp=send_form#start=invite
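For orientation, the request/response flow looks roughly like the stub below, written against Python's standard library. This is not the track's example server (see the gists linked above for that, and for the authoritative JSON schemas); the field names `id`, `profile`, `location`, and `suggestions` are assumptions used purely to illustrate the shape of a handler.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class SuggestionHandler(BaseHTTPRequestHandler):
    """Sketch of a live-experiment endpoint: accept a JSON suggestion
    request via POST and answer with a JSON ranked list.

    Field names here are assumptions; the real schemas are in the
    gists linked above. A real system would rank attractions using
    the request's profile and context instead of a canned list.
    """

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length) or b"{}")
        response = {
            "id": request.get("id"),            # echo the request ID back
            "suggestions": ["TRECCS-00000001-160"],  # canned placeholder
        }
        body = json.dumps(response).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet; remove to log requests


if __name__ == "__main__":
    HTTPServer(("", 8080), SuggestionHandler).serve_forever()
```

Whatever framework you use, remember that your server must stay reachable for the whole live-experiment window.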
Batch Experiment
A file containing a set of suggestion requests made during the live experiment will be made available. You will provide responses in the same format as in the live experiment. The attractions suggested during the batch experiment must have been suggested during the live experiment. In addition to the suggestion requests, the attraction IDs returned by participants in the live experiment will be provided.
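Because batch suggestions are restricted to attractions that appeared in live-experiment responses, a batch run needs a final filtering pass. A minimal sketch, assuming you have the set of live-suggested IDs for the request at hand:

```python
def restrict_to_live(batch_ranking, live_suggested_ids):
    """Drop attractions that were never suggested during the live run,
    preserving the batch ranking order.

    `live_suggested_ids` is the pool of attraction IDs returned by
    participants during the live experiment for this request (the track
    distributes these alongside the batch request file).
    """
    allowed = set(live_suggested_ids)
    return [a for a in batch_ranking if a in allowed]
```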
Judging
Suggestions made during the live experiment will be judged by crowdsourced users who supply both the user profile and final judgments.
If you want to see how the assessors will view results, you can render suggestion responses using this tool: https://github.com/akdh/cst-tools/tree/master#render-files
Evaluation
Three metrics will be used: P@5 (the main metric), MRR, and a modified version of time-biased gain (TBG), as used in previous years.
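The first two metrics are straightforward to compute locally against your own relevance guesses; the sketch below shows standard definitions of P@5 and per-query reciprocal rank (MRR is the mean of the latter over all profile/context pairs). The track's TBG modification is not reproduced here.

```python
def precision_at_k(ranking, relevant, k=5):
    """Fraction of the top-k suggested attractions judged relevant.

    `ranking` is an ordered list of attraction IDs; `relevant` is the
    set of IDs judged relevant for this profile/context pair.
    """
    return sum(1 for a in ranking[:k] if a in relevant) / float(k)


def reciprocal_rank(ranking, relevant):
    """1/rank of the first relevant suggestion, or 0.0 if none appears.

    MRR is the mean of this value across all profile/context pairs.
    """
    for rank, attraction in enumerate(ranking, start=1):
        if attraction in relevant:
            return 1.0 / rank
    return 0.0
```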