Data

This page contains the data collected using Mechanical Turk for the shared task papers. Each data archive should contain a readme describing its contents. For details on how a data set was created, see the associated workshop paper.

If you use this data in your work, please cite the corresponding workshop paper.

For an overview of the shared task and the workshop, see:

Chris Callison-Burch, Mark Dredze. Creating Speech and Language Data With Amazon's Mechanical Turk. Workshop on Creating Speech and Language Data With Mechanical Turk at NAACL-HLT, 2010.

Clustering dictionary definitions using Amazon Mechanical Turk

Gabriel Parent and Maxine Eskenazi

[Data]

Semi-supervised Word Alignment with Mechanical Turk

Qin Gao and Stephan Vogel

[Data]

Rating Computer-Generated Questions with Mechanical Turk

Michael Heilman and Noah A. Smith

[Data]

Opinion Mining of Spanish Customer Comments with Non-Expert Annotations on Mechanical Turk

Bart Mellebeek, Francesc Benavent, Jens Grivolla, Joan Codina, Marta R. Costa-Jussà and Rafael Banchs

[Data]

Using Mechanical Turk to Annotate Lexicons for Less Commonly Used Languages

Ann Irvine and Alexandre Klementiev

[Data]

MTurk Crowdsourcing: A Viable Method for Rapid Discovery of Arabic Nicknames?

Chiara Higgins, Elizabeth McGrath and Laila Moretto

[Data]

An Enriched MT Grammar for Under \$100

Omar Zaidan and Juri Ganitkevitch

[Data]

Evaluation of Commonsense Knowledge with Mechanical Turk

Jonathan Gordon, Benjamin Van Durme and Lenhart Schubert

[Data]

Cheap Facts and Counter-Facts

Rui Wang and Chris Callison-Burch

[Data]

The Wisdom of the Crowd’s Ear: Speech Accent Rating and Annotation with Amazon Mechanical Turk

Stephen Kunath and Steven Weinberger

[Data]

Crowdsourcing Document Relevance Assessment with Mechanical Turk

Catherine Grady and Matthew Lease

[Data]

Preliminary Experiments with Amazon's Mechanical Turk for Annotating Medical Named Entities

Meliha Yetisgen-Yildiz, Imre Solti, Fei Xia and Scott Halgrim

[Data]

Measuring Transitivity Using Untrained Annotators

Nitin Madnani, Jordan Boyd-Graber and Philip Resnik

[Data]

Non-Expert Evaluation of Summarization Systems is Risky

Dan Gillick and Yang Liu

[Data]

Non-Expert Correction of Automatically Generated Relation Annotations

Matthew R. Gormley, Adam Gerber, Mary Harper and Mark Dredze

[Data]

Using the Amazon Mechanical Turk to Transcribe and Annotate Meeting Speech for Extractive Summarization

Matthew Marge, Satanjeev Banerjee and Alexander Rudnicky

[Data]

Creating a Bi-lingual Entailment Corpus through Translations with Mechanical Turk: \$100 for a 10-day Rush

Matteo Negri and Yashar Mehdad

[Data]

Error Driven Paraphrase Annotation using Mechanical Turk

Olivia Buzek, Philip Resnik and Ben Bederson

[Data]

Shedding (a Thousand Points of) Light on Biased Language

Tae Yano, Philip Resnik and Noah A. Smith

[Data]

Corpus Creation for New Genres: A Crowdsourced Approach to PP Attachment

Mukund Jha, Jacob Andreas, Kapil Thadani, Sara Rosenthal and Kathleen McKeown

[Data]

Annotating Named Entities in Twitter Data with Crowdsourcing

Tim Finin, Will Murnane, Anand Karandikar, Nicholas Keller, Justin Martineau and Mark Dredze

[Data]