Shared Task Challenge

Overview: The goals of CrowdScale 2013's Shared Task Challenge are to:

  • Promote community engagement and collective effort on a larger-scale task of interest
  • Enable comparable evaluation (shared data, task, and metrics) across participants, supporting more fine-grained comparative assessment of specific methods
  • Assess overall state-of-the-art performance in the field

Task: multi-class classification. Generate the best possible answer for each question, based on the judgments of five or more raters per question.

Prize money: Google is generously providing $1500 in prize funds to be awarded to the top-scoring participants. NOTE: to be eligible for prize money, you must 1) submit a top-scoring submission; 2) submit a paper by the deadline describing your system; *AND* 3) register for and attend the workshop. 

A single $750 prize will be awarded to the top-scoring submission on each dataset (CrowdFlower's and Google's). In the case of ties, prize money will be divided evenly among those tied.

Datasets. To help advance research on crowdsourcing at scale, CrowdFlower and Google are providing two new challenge datasets.

Data Format

"basic data" consists
simply of three comma-separated value (CSV) columns:
  1. question: the ID of the question/example to be answered
  2. rater: the ID of the rater providing an answer
  3. judgment: the rater's answer
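As a starting point, the basic data can be aggregated with a simple majority-vote baseline. The sketch below is illustrative only (it is not an official baseline); it assumes the CSV file begins with a `question,rater,judgment` header row.

```python
import csv
from collections import Counter, defaultdict

def majority_vote(basic_csv_path):
    """Aggregate raw judgments into one answer per question by majority vote.

    Assumes a header row of question,rater,judgment (hypothetical layout).
    Ties are broken arbitrarily.
    """
    votes = defaultdict(Counter)
    with open(basic_csv_path, newline="") as f:
        for row in csv.DictReader(f):
            votes[row["question"]][row["judgment"]] += 1
    # Pick the most frequent judgment for each question.
    return {q: counts.most_common(1)[0][0] for q, counts in votes.items()}
```

Stronger methods would weight raters by estimated reliability or exploit the extra structure in the full data, but majority vote gives a quick sanity check on the data format.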

"full data" provides more details about the regarding the structure and content of the questions, keyed by the same question ID. Note that the full data also includes the basic data; we have separated them above just to make it easier for people to get started. While one can participate completely in the shared task using only the basic data, we encourage participants to use additional information in the full data to achieve higher quality. 

"ground truth" provides the correct answers for a small, random sample of the questions. Participants are encouraged to tune their methods on this data before submitting their final answers. Ground truth questions will not be used in the final evaluation.

Evaluation Metric: answer quality will be scored by average recall (averaged over class categories) to determine prize money. Additional metrics (e.g., simple accuracy) will be reported for analysis but will not affect prize money distribution. For transparency, download our Matlab evaluation script (and let us know if you have any comments or corrections).
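The Matlab script above is the official scorer; for intuition, average recall is commonly computed as the mean, over classes, of the fraction of questions of that class answered correctly. A hedged re-implementation of that standard definition:

```python
from collections import defaultdict

def average_recall(truth, predicted):
    """Macro-averaged recall: for each gold class, the fraction of its
    questions answered correctly, averaged uniformly over classes.

    truth, predicted: dicts mapping question ID -> class label.
    (Illustrative only; the official Matlab script is authoritative.)
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for q, gold in truth.items():
        total[gold] += 1
        if predicted.get(q) == gold:
            correct[gold] += 1
    # Each class contributes equally, regardless of how frequent it is.
    return sum(correct[c] / total[c] for c in total) / len(total)
```

Because every class counts equally, a method that ignores rare classes is penalized more under average recall than under simple accuracy.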

Submitting Results: result submissions consist simply of two CSV columns: 

  1. question: the ID of the question/example to be answered
  2. answer: the best answer for the question
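Writing the result file is straightforward; the sketch below assumes a `question,answer` header row is acceptable (not confirmed by the task description — check with the organizers or the evaluation script).

```python
import csv

def write_submission(answers, out_path):
    """Write the two-column result file: question ID, chosen answer.

    answers: dict mapping question ID -> answer.
    The header row is an assumption about the expected format.
    """
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["question", "answer"])  # header row (assumption)
        for question, answer in sorted(answers.items()):
            writer.writerow([question, answer])
```

Remember to compress the resulting file (.zip, .gz, or .bz2) before emailing it, as described below.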

Each participating group may submit a single submission for official evaluation. Those collaborating may only submit once together, and not make separate submissions individually. The result file for each task (sentiment analysis and fact evaluation) should be compressed (.zip, .gz, or .bz2) and emailed to the organizers.

Submitting papers: Participants are expected to submit a paper describing their methods and *preliminary* results based on the released "ground truth" sample. Final results will be announced at the workshop. See the Call for Papers for additional details on paper format.

Important Dates
  • October 20: Result submissions due
  • October 27: Papers due
  • November 1: Papers posted on website
Software. Participants are not required to submit or disclose software to participate. Participants are welcome to use or extend any pre-existing software in preparing their submissions, such as:

Questions? Contact the organizers.

Related Shared Tasks