Data
This page hosts the data that the shared task papers collected using Amazon's Mechanical Turk. Each data archive should contain a readme describing its contents. For details on how a data set was created, see the associated workshop paper.
If you use this data in your work, please cite the corresponding workshop paper.
For an overview of the shared task and the workshop, see:
Chris Callison-Burch and Mark Dredze. Creating Speech and Language Data With Amazon's Mechanical Turk. Workshop on Creating Speech and Language Data With Mechanical Turk at NAACL-HLT, 2010.
Clustering dictionary definitions using Amazon Mechanical Turk
Gabriel Parent and Maxine Eskenazi
[Data]
Semi-supervised Word Alignment with Mechanical Turk
Qin Gao and Stephan Vogel
[Data]
Rating Computer-Generated Questions with Mechanical Turk
Michael Heilman and Noah A. Smith
[Data]
Opinion Mining of Spanish Customer Comments with Non-Expert Annotations on Mechanical Turk
Bart Mellebeek, Francesc Benavent, Jens Grivolla, Joan Codina, Marta R. Costa-Jussà and Rafael Banchs
[Data]
Using Mechanical Turk to Annotate Lexicons for Less Commonly Used Languages
Ann Irvine and Alexandre Klementiev
[Data]
MTurk Crowdsourcing: A Viable Method for Rapid Discovery of Arabic Nicknames?
Chiara Higgins, Elizabeth McGrath and Laila Moretto
[Data]
An Enriched MT Grammar for Under \$100
Omar Zaidan and Juri Ganitkevitch
[Data]
Evaluation of Commonsense Knowledge with Mechanical Turk
Jonathan Gordon, Benjamin Van Durme and Lenhart Schubert
[Data]
Cheap Facts and Counter-Facts
Rui Wang and Chris Callison-Burch
[Data]
The Wisdom of the Crowd’s Ear: Speech Accent Rating and Annotation with Amazon Mechanical Turk
Stephen Kunath and Steven Weinberger
[Data]
Crowdsourcing Document Relevance Assessment with Mechanical Turk
Catherine Grady and Matthew Lease
[Data]
Preliminary Experiments with Amazon's Mechanical Turk for Annotating Medical Named Entities
Meliha Yetisgen-Yildiz, Imre Solti, Fei Xia and Scott Halgrim
[Data]
Measuring Transitivity Using Untrained Annotators
Nitin Madnani, Jordan Boyd-Graber and Philip Resnik
[Data]
Non-Expert Evaluation of Summarization Systems is Risky
Dan Gillick and Yang Liu
[Data]
Non-Expert Correction of Automatically Generated Relation Annotations
Matthew R. Gormley, Adam Gerber, Mary Harper and Mark Dredze
[Data]
Using the Amazon Mechanical Turk to Transcribe and Annotate Meeting Speech for Extractive Summarization
Matthew Marge, Satanjeev Banerjee and Alexander Rudnicky
[Data]
Creating a Bi-lingual Entailment Corpus through Translations with Mechanical Turk: \$100 for a 10-day Rush
Matteo Negri and Yashar Mehdad
[Data]
Error Driven Paraphrase Annotation using Mechanical Turk
Olivia Buzek, Philip Resnik and Ben Bederson
[Data]
Shedding (a Thousand Points of) Light on Biased Language
Tae Yano, Philip Resnik and Noah A. Smith
[Data]
Corpus Creation for New Genres: A Crowdsourced Approach to PP Attachment
Mukund Jha, Jacob Andreas, Kapil Thadani, Sara Rosenthal and Kathleen McKeown
[Data]
Annotating Named Entities in Twitter Data with Crowdsourcing
Tim Finin, Will Murnane, Anand Karandikar, Nicholas Keller, Justin Martineau and Mark Dredze
[Data]