This task, organised in 2008 and 2009, combined TUNA-AS and TUNA-R: it required participants to develop a method which, given an input DOMAIN, outputs an identifying description of the target referent as a WORD-STRING (such systems therefore needed to select the ATTRIBUTEs and realise them, though not necessarily in two separate steps).
The training and development data consisted of the same TUNA corpus instances used for TUNA-AS and TUNA-R, and the test set was the same as for TUNA-R.
The same evaluation software used for TUNA-AS and TUNA-R was provided for this task. For the WORD-STRINGs output by the TUNA-R and TUNA-REG tasks, the software compared human and peer outputs in terms of (i) string-edit (Levenshtein) distance; and (ii) string accuracy, that is, the proportion of peer output strings identical to the corresponding human-produced descriptions.
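The two automatic measures above are straightforward to compute. The following is a minimal sketch (not the actual evaluation software distributed with the task; the function names are illustrative): a standard dynamic-programming edit distance, and exact-match accuracy over paired peer/human outputs.

```python
def levenshtein(a, b):
    """Classic edit distance between two sequences (here, strings)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def string_accuracy(peer_outputs, human_outputs):
    """Proportion of peer strings identical to the human description."""
    pairs = list(zip(peer_outputs, human_outputs))
    return sum(p == h for p, h in pairs) / len(pairs)
```

For example, `levenshtein("kitten", "sitting")` is 3, and a system that matches one of two human descriptions exactly gets a string accuracy of 0.5.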
Other metrics: As in TUNA-R, we also computed BLEU and NIST scores to compare peer and human outputs. In a lab-based experiment with human participants, peer and human corpus outputs were compared in terms of (i) the time it took to read the description; (ii) the time it took to identify the target referent given the description; and (iii) the identification error rate. In a further experiment, system-generated and human-produced outputs were judged by linguists for their fluency and adequacy, as described in (Gatt and Belz, 2010).
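To make the corpus-similarity scoring concrete, here is a minimal single-reference BLEU sketch in the spirit of Papineni et al. (2002): the geometric mean of modified n-gram precisions with a brevity penalty. This is a simplification for illustration, not the official BLEU/NIST scoring tools used in the evaluation.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, hypothesis, max_n=4):
    """Single-reference sentence BLEU: geometric mean of modified
    n-gram precisions (n = 1..max_n) times a brevity penalty."""
    ref, hyp = reference.split(), hypothesis.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # clip each hypothesis n-gram count by its count in the reference
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # no smoothing in this sketch
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An output identical to the human description scores 1.0; in practice, smoothed variants (as in standard toolkits) are used to avoid zeroing out short descriptions with no 4-gram overlap.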
All software and detailed documentation for the TUNA-REG shared task can be found in the TUNA-REG'09 participants' pack: [TUNA-REG'09 pack]. Descriptions of the human evaluation methods can be found in the TUNA'08 and TUNA'09 results reports (see below).
W08-1131: Albert Gatt; Anja Belz; Eric Kow
The TUNA Challenge 2008: Overview and Evaluation Results
W08-1132: Bernd Bohnet
The Fingerprint of Human Referring Expressions and their Surface Realization with Graph Transducers (IS-FP, IS-GT, IS-FP-GT)
W08-1133: Giuseppe Di Fabbrizio; Amanda J. Stent; Srinivas Bangalore
Referring Expression Generation Using Speaker-based Attribute Selection and Trainable Realization (ATTR)
W08-1134: Pablo Gervás; Raquel Hervás; Carlos León
NIL-UCM: Most-Frequent-Value-First Attribute Selection and Best-Scoring-Choice Realization
W08-1136: John D. Kelleher; Brian Mac Namee
Referring Expression Generation Challenge 2008 DIT System Descriptions (DIT-FBI, DIT-TVAS, DIT-CBSR, DIT-RBR, DIT-FBI-CBSR, DIT-TVAS-RBR)
W08-1137: Josh King
OSU-GP: Attribute Selection Using Genetic Programming
W08-1138: Emiel Krahmer; Mariët Theune; Jette Viethen; Iris Hendrickx
GRAPH: The Costs of Redundancy in Referring Expressions
W09-0629 [bib]: Albert Gatt; Anja Belz; Eric Kow
The TUNA-REG Challenge 2009: Overview and Evaluation Results
W09-0630 [bib]: Ivo Brugman; Mariët Theune; Emiel Krahmer; Jette Viethen
Realizing the Costs: Template-Based Surface Realisation in the GRAPH Approach to Referring Expression Generation
W09-0631 [bib]: Bernd Bohnet
Generation of Referring Expression with an Individual Imprint
W09-0632 [bib]: Raquel Hervás; Pablo Gervás
Evolutionary and Case-Based Approaches to REG: NIL-UCM-EvoTAP, NIL-UCM-ValuesCBR and NIL-UCM-EvoCBR
W09-0633 [bib]: Diego Jesus de Lucena; Ivandré Paraboni
USP-EACH: Improved Frequency-based Greedy Attribute Selection
W09-0634 [bib]: Kotaro Funakoshi; Philipp Spanger; Mikio Nakano; Takenobu Tokunaga
A Probabilistic Model of Referring Expressions for Complex Objects