This shared task, organised in 2007 and 2008, focused on selecting the content for the description of a target entity. Peers were required to develop a method that takes as input a DOMAIN, in which one ENTITY is the target, and returns an ATTRIBUTE-SET consisting of a subset of the target's attributes that distinguishes it from the other objects in the domain.
The TUNA-AS shared task data was derived from the TUNA Corpus; representations were modified and only singular descriptions were included. The data was randomly divided into 60% training, 20% development and 20% test sets, each containing exemplars from both the furniture and people domains.
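The selection task can be illustrated with a toy full-brevity search in the spirit of Dale and Reiter (1995): brute-force the smallest attribute subset that rules out every distractor in the domain. This is only an illustrative sketch with invented furniture-domain attributes, not any participating system's method:

```python
from itertools import combinations

def distinguishes(attrs, target, distractors):
    """True if every distractor differs from the target on at least one attribute in attrs."""
    return all(any(d.get(k) != target[k] for k in attrs) for d in distractors)

def full_brevity(target, distractors):
    """Return a smallest ATTRIBUTE-SET that singles out the target (brute-force search)."""
    keys = list(target)
    for n in range(1, len(keys) + 1):
        for combo in combinations(keys, n):
            if distinguishes(combo, target, distractors):
                return set(combo)
    return set(keys)  # target not distinguishable; fall back to all attributes

# Toy furniture domain: a target entity plus two distractors (values are invented).
target = {"type": "chair", "colour": "red", "size": "large"}
distractors = [{"type": "chair", "colour": "blue", "size": "large"},
               {"type": "desk", "colour": "red", "size": "small"}]
print(full_brevity(target, distractors))  # a minimal 2-attribute set
```

Note that minimal distinguishing sets need not be unique; the search simply returns the first one found, so real systems add further criteria (e.g. attribute preference orders) to choose among them.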
For this task, we developed a Java program that, given a corpus of TUNA-AS instances, computes:
Two coefficients, Dice and MASI (Passonneau, 2006), that assess the degree of overlap between the ATTRIBUTE-SETs generated by a peer system and those produced by a human;
Accuracy, the proportion of peer outputs that were identical to model human outputs; and
Minimality, the proportion of peer outputs which contain no more attributes than necessary to distinguish the target (Dale and Reiter, 1995).
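As a rough sketch of the two overlap coefficients (not the challenge's actual Java evaluation tool), Dice and MASI can be computed over attribute sets as follows; the example attribute values are invented:

```python
def dice(a, b):
    """Dice coefficient: 2*|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b))

def masi(a, b):
    """MASI (Passonneau, 2006): Jaccard weighted by a monotonicity factor."""
    a, b = set(a), set(b)
    jaccard = len(a & b) / len(a | b)
    if a == b:
        m = 1.0      # identical sets
    elif a <= b or b <= a:
        m = 2 / 3    # one set subsumes the other
    elif a & b:
        m = 1 / 3    # overlap, but neither subsumes the other
    else:
        m = 0.0      # disjoint
    return jaccard * m

peer = {"type:chair", "colour:red"}
human = {"type:chair", "colour:red", "size:large"}
print(round(dice(peer, human), 3))  # 0.8
print(round(masi(peer, human), 3))  # 0.444
```

MASI penalises non-identical sets more heavily than Dice: here the peer output is a strict subset of the human one, so Jaccard (2/3) is scaled by the subsumption weight 2/3.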
Other metrics: For this task, we also ran a laboratory-based experiment in which both peer and human ATTRIBUTE-SETs (realised as strings using purpose-built software) were compared in terms of the time it took human subjects to read the descriptions and identify the entity being described. This measure consists of (i) the subjects' reaction times and (ii) the proportion of identification errors they made in the experiments.
All software can be found in the TUNA'08 participants' pack: [TUNA'08 pack]. Descriptions of the human evaluation methods can be found in the TUNA-AS'07 and TUNA'08 results reports (see below).
Detailed documentation for the TUNA-AS shared task can be found in the TUNA'08 participants' pack: [TUNA'08 pack].
The 2007 Shared Task results report and participants' reports can be found in the [UCNLG+MT Proceedings].
The following papers are from the 2008 edition of the Shared Task:
W08-1131: Albert Gatt; Anja Belz; Eric Kow
The TUNA Challenge 2008: Overview and Evaluation Results
W08-1132: Bernd Bohnet
The Fingerprint of Human Referring Expressions and their Surface Realization with Graph Transducers (IS-FP, IS-GT, IS-FP-GT)
W08-1133: Giuseppe Di Fabbrizio; Amanda J. Stent; Srinivas Bangalore
Referring Expression Generation Using Speaker-based Attribute Selection and Trainable Realization (ATTR)
W08-1134: Pablo Gervás; Raquel Hervás; Carlos León
NIL-UCM: Most-Frequent-Value-First Attribute Selection and Best-Scoring-Choice Realization
W08-1135: Diego Jesus de Lucena; Ivandré Paraboni
USP-EACH Frequency-based Greedy Attribute Selection for Referring Expressions Generation
W08-1136: John D. Kelleher; Brian Mac Namee
Referring Expression Generation Challenge 2008 DIT System Descriptions (DIT-FBI, DIT-TVAS, DIT-CBSR, DIT-RBR, DIT-FBI-CBSR, DIT-TVAS-RBR)
W08-1137: Josh King
OSU-GP: Attribute Selection Using Genetic Programming
W08-1138: Emiel Krahmer; Mariët Theune; Jette Viethen; Iris Hendrickx
GRAPH: The Costs of Redundancy in Referring Expressions
W08-1139: Sibabrata Paladhi; Sivaji Bandyopadhyay
JU-PTBSGRE: GRE Using Prefix Tree Based Structure