The GREC’10 data is derived from the GREC-People corpus, which (in its 2010 version) consists of 1,100 annotated introduction sections from Wikipedia articles in the category People, divided into training, development and test data.

We first manually annotated people mentions in the GREC-People texts by marking up the RE word strings and annotating them with coreference information, semantic category, syntactic category and function, and various supplements and dependents. The annotations cover nested references, plurals and coordinated REs, certain unnamed references, and indefinites. For full details see the GREC’10 documentation (Belz, 2010).

The manual annotations were then automatically checked and converted to XML format. The REF, REFEX and ALT-REFEX elements are the same as in the GREC-MSR annotations described above, with two differences: here, all alternative REs are collected in a single list appended at the end of the text, rather than attached to each individual reference, and references may be embedded to arbitrary depth.

The training, development and test data for the GREC-NEG task are exactly as described above, except that REF elements in the test data do not contain a selected REFEX element.
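The annotation layout described above can be sketched with the standard library as follows. This is a minimal illustration only: REF, REFEX and ALT-REFEX are the element names from the text, but the attributes, values and surrounding TEXT wrapper here are hypothetical placeholders, not the exact GREC’10 schema (for which see Belz, 2010).

```python
# Illustrative sketch of a GREC-style annotated text: REFs in the body,
# one shared ALT-REFEX list appended at the end. Attribute names and
# values are assumptions for illustration, not the real schema.
import xml.etree.ElementTree as ET

SAMPLE = """
<TEXT ID="1">
  <REF ID="1.1" ENTITY="1">
    <REFEX ID="1.1.1">Ada Lovelace</REFEX>
  </REF>
  <REF ID="1.2" ENTITY="1"/>
  <ALT-REFEX>
    <REFEX ID="a1" ENTITY="1">Ada Lovelace</REFEX>
    <REFEX ID="a2" ENTITY="1">she</REFEX>
  </ALT-REFEX>
</TEXT>
"""

def alt_refexes_for(root, entity):
    """Return all alternative RE strings for one entity, read from the
    single ALT-REFEX list appended at the end of the text."""
    alt = root.find("ALT-REFEX")
    return [rx.text for rx in alt.findall("REFEX")
            if rx.get("ENTITY") == entity]

root = ET.fromstring(SAMPLE)
print(alt_refexes_for(root, "1"))  # → ['Ada Lovelace', 'she']
```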
The GREC-NEG Task is to select one REFEX from the ALT-REFEX list for each REF in each TEXT in the test sets, including any embedded REFs. The aim is to select REs which make the text fluent, clear and coherent.
We provide a software tool which computes the following metrics:
(i) REG08-Type Precision: the proportion of REFEXs selected by a participating system which match the reference REFEXs;
(ii) REG08-Type Recall: the proportion of target REFEXs for which a participating system has produced a match;
(iii) String Accuracy: the proportion of word strings selected by a participating system that match those in the reference texts.
[geval package]: the software tool which computes the above metrics
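The three metrics can be sketched as follows. This is a minimal illustration assuming system and reference selections are given as a dict per REF id and as parallel string lists; the actual geval tool operates on the GREC XML files directly, and its matching criteria may differ in detail.

```python
# Hedged sketch of the GREC'10 evaluation metrics. Input formats
# (dicts keyed by REF id, parallel string lists) are assumptions
# for illustration; geval itself reads the GREC XML files.

def reg08_type_scores(system, reference):
    """REG08-Type precision and recall.

    `system` and `reference` map REF ids to the selected REFEX type;
    a selection counts as a match when the types agree.
    """
    matches = sum(1 for ref_id, sel in system.items()
                  if reference.get(ref_id) == sel)
    precision = matches / len(system) if system else 0.0
    recall = matches / len(reference) if reference else 0.0
    return precision, recall

def string_accuracy(system_strings, reference_strings):
    """Proportion of selected RE word strings matching the reference,
    compared pairwise (one selected string per reference string)."""
    hits = sum(1 for s, r in zip(system_strings, reference_strings)
               if s.strip().lower() == r.strip().lower())
    return hits / len(system_strings) if system_strings else 0.0
```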
W09-2817 [bib]: Anja Belz; Eric Kow; Jette Viethen
The GREC Named Entity Generation Challenge 2009: Overview and Evaluation Results
W09-2818 [bib]: Benoit Favre; Bernd Bohnet
ICSI-CRF: The Generation of References to the Main Subject and Named Entities Using Conditional Random Fields
W09-2821 [bib]: Charles Greenbacker; Kathleen McCoy
UDel: Extending Reference Generation to Multiple Entities
W09-2822 [bib]: Constantin Orasan; Iustin Dornescu
WLV: A Confidence-based Machine Learning Method for the GREC-NEG’09 Task
W10-4226: Anja Belz; Eric Kow
The GREC Challenges 2010: Overview and Evaluation Results
W10-4227: Guillaume Bouchard
Named Entity Generation Using Sampling-based Structured Prediction
W10-4229: Amitava Das; Tanik Saikh; Tapabrata Mondal; Sivaji Bandyopadhyay
JU_CSE_GREC10: Named Entity Generation at GREC 2010
W10-4230: Benoit Favre; Bernd Bohnet
The UMUS System for Named Entity Generation at GREC 2010
W10-4231: Charles Greenbacker; Nicole Sparks; Kathleen McCoy; Che-Yu Kuo
UDel: Refining a Method of Named Entity Generation
[PDF] Charles Greenbacker; Kathleen McCoy
Feature selection for reference generation as informed by psycholinguistic research. In Proceedings of the CogSci 2009 Workshop on Production of Referring Expressions (PRE-Cogsci'09).