Guidelines

TREC 2012 Contextual Suggestion Track

The Contextual Suggestion Track investigates search techniques for complex information needs that are highly dependent on context and user interests.

Track Organizers:

Track Web Page:  http://sites.google.com/site/treccontext/

Mailing list: Send an email to listproc (at) nist.gov with a body consisting of the single line: subscribe trec-context <FirstName> <LastName>

Background

According to a report from the Second Strategic Workshop on Information Retrieval in Lorne (submitted to SIGIR Forum, 2012): “Future information retrieval systems must anticipate user needs and respond with information appropriate to the current context without the user having to enter an explicit query... In a mobile context such a system might take the form of an app that recommends interesting places and activities based on the user’s location, personal preferences, past history, and environmental factors such as weather and time...  In contrast to many traditional recommender systems, these systems must be open domain, ideally able to make suggestions and synthesize information from multiple sources...”

For example, imagine a group of information retrieval researchers with a November evening to spend in beautiful Gaithersburg, Maryland.  A contextual suggestion system might recommend a beer at the Dogfish Head Alehouse (www.dogfishalehouse.com), dinner at the Flaming Pit (www.flamingpitrestaurant.com), or even a trip into Washington on the metro to see the National Mall (www.nps.gov/nacc).  The goal of the Contextual Suggestion track is to provide a venue for the evaluation of such systems.

Task Summary

As input to the task, participants will be given a set of profiles, a set of example suggestions, and a set of contexts.  Details of all file formats are given in separate sections below.  Each profile corresponds to a single user, and indicates that user's preference with respect to each example suggestion.  For example, one suggestion might be to have a beer at the Dogfish Head Alehouse, and the profile might include a negative preference with respect to this suggestion.  Each example suggestion includes a title, description, and an associated URL.  Each context corresponds to a particular geotemporal location, including city, day of the week, time of day, and season.  For example, the context might be Gaithersburg, Maryland, on a weekday evening in the fall.  Note that (at least for this year) we are keeping the geotemporal context very coarse-grained to help simplify the task.

For each profile/context pairing, participants should return a ranked list of 50 proposed suggestions.  Each suggestion should be appropriate to the profile (based on the user’s preferences) and the context (according to the geotemporal location). The description of the suggestion may be tailored to reflect the preferences of that user. Profiles correspond to the stated preferences of real individuals, who will return to judge proposed suggestions in the fall. For this year, all users are university undergraduate and graduate students in their twenties. For the purposes of this experiment, you can assume they are of legal drinking age at the location specified by the context.  Contexts are generated randomly.  You may assume that the user has up to five hours available to follow a suggestion and has access to appropriate transportation (e.g., a car).
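As a rough sketch, the expected shape of a system is a loop over profile/context pairings, each producing a ranked list of 50 suggestions.  All names in the Python sketch below are illustrative, not part of the track's formats; rank_candidates stands in for whatever method a participant actually uses.

  from itertools import product

  def rank_candidates(profile, context):
      """Hypothetical stand-in for a participant's suggestion system.
      Returns candidate suggestions ordered best-first."""
      return []  # a real system would draw and rank candidates from the open web

  profiles = {1: {}, 2: {}}   # keyed by profile number (see Profiles below)
  contexts = {1: {}, 2: {}}   # keyed by context number (see Contexts below)

  run = {}
  for p, c in product(profiles, contexts):
      run[(p, c)] = rank_candidates(profiles[p], contexts[c])[:50]  # 50 per pairing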

Timeline

Example Suggestions

Up to 50 example suggestions will be distributed as a single file, formatted as follows:

  <example number="1">
    <title>Dogfish Head Alehouse</title>
    <description>Craft Brewed Ales and tasty wood grilled food</description>
    <url>http://www.dogfishalehouse.com/</url>
  </example>

  <example number="2">
    <title>The Flaming Pit</title>
    <description>
      The Flaming Pit Restaurant and Piano Lounge, home of Tyrone DeMonke.
    </description>
    <url>http://www.flamingpitrestaurant.com/</url>
  </example>

The title field provides a name for the suggestion, with the description providing additional information.  The URL is typically a homepage or similar key page related to the suggestion, and may also be used to obtain more information about it.  For the TREC 2012 track, all example suggestions will be taken from the Toronto, Canada, area.
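The file can be read with standard XML tooling once the example elements are wrapped under a single root.  The Python sketch below assumes a file named examples.xml; the file name and the wrapping step are assumptions based on the excerpt above, not part of the track specification.

  import xml.etree.ElementTree as ET

  def load_examples(path="examples.xml"):
      with open(path, encoding="utf-8") as f:
          # Wrap the <example> elements in a synthetic root so the
          # fragment parses as a well-formed document.
          root = ET.fromstring("<examples>" + f.read() + "</examples>")
      examples = {}
      for ex in root.findall("example"):
          examples[int(ex.get("number"))] = {
              "title": ex.findtext("title", "").strip(),
              "description": ex.findtext("description", "").strip(),
              "url": ex.findtext("url", "").strip(),
          }
      return examples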

Profiles

Profiles will be distributed as a single file, formatted as follows:

  <profile number="1">
    <example number="1" initial="1" final="1"/>
    <example number="2" initial="0" final="-1"/>
  </profile>

  <profile number="2">
    <example number="1" initial="0" final="1"/>
    <example number="2" initial="-1" final="1"/>
  </profile>

  etc.

Each profile corresponds to a single user.  For each example suggestion the profile indicates that user’s initial and final preference with respect to that example, where the initial preference is based on the title and description only, and the final preference is based on the content of the page and linked pages.  Preference values may be: -1 to indicate a negative preference, 1 to indicate a positive preference, or 0 to indicate either indifference or no response.
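Read the same way as the examples file, a profile reduces to a map from example number to the user's initial and final preference values.  The sketch below again assumes a hypothetical file name and the wrapped-root step described above.

  import xml.etree.ElementTree as ET

  def load_profiles(path="profiles.xml"):
      with open(path, encoding="utf-8") as f:
          root = ET.fromstring("<profiles>" + f.read() + "</profiles>")
      profiles = {}
      for prof in root.findall("profile"):
          prefs = {}
          for ex in prof.findall("example"):
              prefs[int(ex.get("number"))] = {
                  "initial": int(ex.get("initial")),  # from title and description only
                  "final": int(ex.get("final")),      # after viewing the page and linked pages
              }
          profiles[int(prof.get("number"))] = prefs
      return profiles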

Contexts

Contexts will be distributed as a single file, formatted as follows:

  <context number="1">
    <city>Portland</city>
    <state>Oregon</state>
    <lat>45.5</lat>
    <long>-122.7</long>
    <day>weekday</day>
    <time>evening</time>
    <season>fall</season>
  </context>

  etc.

All contexts are cities or towns within the conterminous United States. The lat and long fields are for convenience and are intended to be synonymous with the city and state information.  The day field will be one of “weekday” or “weekend”.   The time field will be one of “morning”, “afternoon”, or “evening”, where morning can be interpreted as approximately 9:00 AM, afternoon can be interpreted as approximately 1:00 PM, and evening can be interpreted as approximately 6:00 PM.  The season field will be one of “spring”, “summer”, “winter”, or “fall”.  You may assume that the day is not a holiday or other special occasion.  As stated above, you may assume that the user has up to five hours available to follow a suggestion and has access to appropriate transportation.
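Contexts can be read the same way as the other input files; the sketch below also checks the field vocabularies described above (the file name is an assumption).

  import xml.etree.ElementTree as ET

  DAYS = {"weekday", "weekend"}
  TIMES = {"morning", "afternoon", "evening"}    # ~9:00 AM, ~1:00 PM, ~6:00 PM
  SEASONS = {"spring", "summer", "fall", "winter"}

  def load_contexts(path="contexts.xml"):
      with open(path, encoding="utf-8") as f:
          root = ET.fromstring("<contexts>" + f.read() + "</contexts>")
      contexts = {}
      for ctx in root.findall("context"):
          record = {
              "city": ctx.findtext("city"),
              "state": ctx.findtext("state"),
              "lat": float(ctx.findtext("lat")),
              "long": float(ctx.findtext("long")),
              "day": ctx.findtext("day"),
              "time": ctx.findtext("time"),
              "season": ctx.findtext("season"),
          }
          assert record["day"] in DAYS and record["time"] in TIMES
          assert record["season"] in SEASONS
          contexts[int(ctx.get("number"))] = record
      return contexts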

Suggestions

Suggestions are returned as a single run file, formatted as follows:

<context2012 groupid="waterloo" runid="watcs12a">
  <suggestion profile="1" context="1" rank="1">
    <title>Deschutes Brewery Portland Public House</title>
    <description>
      Deschutes Brewery's distinct Northwest brew pub in Portland's Pearl
      District has become a convivial gathering spot for beer and food lovers
      since its 2008 opening.
    </description>
    <url>http://www.deschutesbrewery.com</url>
  </suggestion>

  etc.
</context2012>

The groupid should indicate your official TREC group id; the runid should be a unique identifier for your group and the method used.  The file should contain 50 ranked suggestions for each combination of profile and context.  Missing suggestions will be automatically assigned a preference judgment of -1 and a contextual judgment of 0.  Title fields may contain up to 64 characters from tag to tag, including whitespace.  Description fields may contain up to 512 characters from tag to tag.  Descriptions and titles may be tailored to the preferences of each user; if two users are given the same suggestion, the descriptions and titles may differ.  The URL should identify the suggestion on the open web, and will be used to judge against both preferences and context.

Each group may submit up to two run files, which should use the same group id and different run ids.
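As a rough illustration, the sketch below serializes a run file in the format above and checks the title and description limits.  The helper is hypothetical; note that the limits are counted from tag to tag, so escaping reserved XML characters can lengthen a field beyond its raw character count.

  from xml.sax.saxutils import escape

  def write_run(path, groupid, runid, suggestions):
      """suggestions: iterable of (profile, context, rank, title, description, url) tuples."""
      with open(path, "w", encoding="utf-8") as f:
          f.write('<context2012 groupid="%s" runid="%s">\n' % (groupid, runid))
          for profile, context, rank, title, desc, url in suggestions:
              # Track limits: 64 characters for titles, 512 for descriptions,
              # counted from tag to tag (checked here before escaping).
              assert len(title) <= 64 and len(desc) <= 512
              f.write('  <suggestion profile="%d" context="%d" rank="%d">\n'
                      % (profile, context, rank))
              f.write('    <title>%s</title>\n' % escape(title))
              f.write('    <description>%s</description>\n' % escape(desc))
              f.write('    <url>%s</url>\n' % escape(url))
              f.write('  </suggestion>\n')
          f.write('</context2012>\n')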

Judging

Suggestions will be judged in terms of two separate criteria: appropriateness to the user's profile, judged by the users whose stated preferences the profiles reflect, and appropriateness to the geotemporal context, judged by NIST assessors.

UPDATE on July 2, 2012: We will report separate judgments for geographical context and temporal context.  NIST assessors will independently judge each suggestion in terms of its appropriateness to place and time, and separate judgment values will be available for both of these factors.

Not all suggestions will be judged.  However, for a given profile/context combination we will either judge all runs to a uniform depth, or we will not judge that combination at all.

Evaluation Measures

Since suggestions are ranked, standard ranked retrieval measures will be applied, including precision@k and nDCG.  Other measures will be developed as part of the track.  Because preference and context are judged independently, we will accept experimental runs that focus exclusively on one criterion or the other.  If you wish to restrict your run to one set of judging criteria, that option will be made available during the submission process.
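For concreteness, minimal implementations of the two named measures are sketched below.  The mapping from judgment values to gain values is an assumption for illustration; the track's official mapping may differ.

  import math

  def precision_at_k(gains, k):
      """Fraction of the top-k suggestions judged relevant (gain > 0).
      Missing suggestions count as nonrelevant."""
      return sum(1 for g in gains[:k] if g > 0) / k

  def ndcg(gains, k):
      """nDCG@k with the usual log2 rank discount."""
      def dcg(gs):
          return sum(g / math.log2(i + 2) for i, g in enumerate(gs[:k]))
      ideal = dcg(sorted(gains, reverse=True))
      return dcg(gains) / ideal if ideal > 0 else 0.0

  # Example: binary gains for a ranked list of five suggestions.
  print(precision_at_k([1, 0, 1, 0, 0], 5))  # 0.4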