This task, organised in 2008 and 2009, combined TUNA-AS and TUNA-R: it required participants to develop a method which, given an input DOMAIN, outputs an identifying description of the target referent as a WORD-STRING (such systems therefore needed to select the ATTRIBUTEs and realise them, though not necessarily in two separate steps).
The training and development data consisted of the same TUNA corpus instances used for TUNA-AS and TUNA-R, and the test set was the same as for TUNA-R.
The same evaluation software used for TUNA-AS and TUNA-R was provided for this task. For the WORD-STRINGs output by the TUNA-R and TUNA-REG tasks, the software compared human and peer outputs in terms of (i) string-edit (Levenshtein) distance; and (ii) string accuracy, that is, the proportion of peer output strings identical to the corresponding human-produced descriptions.
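The two automatic measures above are straightforward to compute. The following is a minimal sketch (not the actual evaluation software distributed with the task; the function names are illustrative): a standard dynamic-programming edit distance, and exact-match accuracy over paired peer/human outputs.

```python
def levenshtein(a, b):
    """Classic edit distance between two sequences (here, strings)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def string_accuracy(peer_outputs, human_outputs):
    """Proportion of peer strings identical to the human description."""
    pairs = list(zip(peer_outputs, human_outputs))
    return sum(p == h for p, h in pairs) / len(pairs)
```

For example, `levenshtein("kitten", "sitting")` is 3, and a system that matches one of two human descriptions exactly gets a string accuracy of 0.5.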
Other metrics: As in TUNA-R, we also computed BLEU and NIST scores to compare peer and human outputs. In a lab-based experiment with human participants, peer and human corpus outputs were compared in terms of (i) the time it took to read the description; (ii) the time it took to identify the target referent given the description; and (iii) the identification error rate. In a further experiment, system-generated and human-produced outputs were judged by linguists for their fluency and adequacy, as described in (Gatt and Belz, 2010).
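To make the corpus-similarity scoring concrete, here is a minimal single-reference BLEU sketch in the spirit of Papineni et al. (2002): the geometric mean of modified n-gram precisions with a brevity penalty. This is a simplification for illustration, not the official BLEU/NIST scoring tools used in the evaluation.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, hypothesis, max_n=4):
    """Single-reference sentence BLEU: geometric mean of modified
    n-gram precisions (n = 1..max_n) times a brevity penalty."""
    ref, hyp = reference.split(), hypothesis.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # clip each hypothesis n-gram count by its count in the reference
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # no smoothing in this sketch
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An output identical to the human description scores 1.0; in practice, smoothed variants (as in standard toolkits) are used to avoid zeroing out short descriptions with no 4-gram overlap.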
All software and detailed documentation for the TUNA-REG shared task can be found in the TUNA-REG'09 participants' pack: [TUNA-REG'09 pack]. Descriptions of the human evaluation methods can be found in the TUNA'08 and TUNA'09 results reports (see below).
W08-1131: Albert Gatt; Anja Belz; Eric Kow
The TUNA Challenge 2008: Overview and Evaluation Results
W08-1132: Bernd Bohnet
The Fingerprint of Human Referring Expressions and their Surface Realization with Graph Transducers (IS-FP, IS-GT, IS-FP-GT)
W08-1133: Giuseppe Di Fabbrizio; Amanda J. Stent; Srinivas Bangalore
Referring Expression Generation Using Speaker-based Attribute Selection and Trainable Realization (ATTR)
W08-1134: Pablo Gervás; Raquel Hervás; Carlos León
NIL-UCM: Most-Frequent-Value-First Attribute Selection and Best-Scoring-Choice Realization
W08-1136: John D. Kelleher; Brian Mac Namee
Referring Expression Generation Challenge 2008 DIT System Descriptions (DIT-FBI, DIT-TVAS, DIT-CBSR, DIT-RBR, DIT-FBI-CBSR, DIT-TVAS-RBR)
W08-1137: Josh King
OSU-GP: Attribute Selection Using Genetic Programming
W08-1138: Emiel Krahmer; Mariët Theune; Jette Viethen; Iris Hendrickx
GRAPH: The Costs of Redundancy in Referring Expressions
W09-0629 [bib]: Albert Gatt; Anja Belz; Eric Kow
The TUNA-REG Challenge 2009: Overview and Evaluation Results
W09-0630 [bib]: Ivo Brugman; Mariët Theune; Emiel Krahmer; Jette Viethen
Realizing the Costs: Template-Based Surface Realisation in the GRAPH Approach to Referring Expression Generation
W09-0631 [bib]: Bernd Bohnet
Generation of Referring Expression with an Individual Imprint
W09-0632 [bib]: Raquel Hervás; Pablo Gervás
Evolutionary and Case-Based Approaches to REG: NIL-UCM-EvoTAP, NIL-UCM-ValuesCBR and NIL-UCM-EvoCBR
W09-0633 [bib]: Diego Jesus de Lucena; Ivandré Paraboni
USP-EACH: Improved Frequency-based Greedy Attribute Selection
W09-0634 [bib]: Kotaro Funakoshi; Philipp Spanger; Mikio Nakano; Takenobu Tokunaga
A Probabilistic Model of Referring Expressions for Complex Objects