This shared task, organised in 2007 and 2008, focused on selecting the content for the description of a target entity. Peers were required to develop a method that takes as input a DOMAIN, in which one ENTITY is the target, and returns an ATTRIBUTE-SET consisting of a subset of the target's attributes that distinguishes it from the other objects in the domain.
The TUNA-AS shared task data was derived from the TUNA Corpus; representations were modified and only singular descriptions were included. The data was randomly divided into 60% training, 20% development and 20% test sets, each containing exemplars from both the furniture and people domains.
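The selection task can be illustrated with a toy full-brevity search in the spirit of Dale and Reiter (1995): brute-force the smallest attribute subset that rules out every distractor in the domain. This is only an illustrative sketch with invented furniture-domain attributes, not any participating system's method:

```python
from itertools import combinations

def distinguishes(attrs, target, distractors):
    """True if every distractor differs from the target on at least one attribute in attrs."""
    return all(any(d.get(k) != target[k] for k in attrs) for d in distractors)

def full_brevity(target, distractors):
    """Return a smallest ATTRIBUTE-SET that singles out the target (brute-force search)."""
    keys = list(target)
    for n in range(1, len(keys) + 1):
        for combo in combinations(keys, n):
            if distinguishes(combo, target, distractors):
                return set(combo)
    return set(keys)  # target not distinguishable; fall back to all attributes

# Toy furniture domain: a target entity plus two distractors (values are invented).
target = {"type": "chair", "colour": "red", "size": "large"}
distractors = [{"type": "chair", "colour": "blue", "size": "large"},
               {"type": "desk", "colour": "red", "size": "small"}]
print(full_brevity(target, distractors))  # a minimal 2-attribute set
```

Note that minimal distinguishing sets need not be unique; the search simply returns the first one found, so real systems add further criteria (e.g. attribute preference orders) to choose among them.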
For this task, we developed a Java program that, given a corpus of TUNA-AS instances, computes:
Two coefficients, Dice and MASI (Passonneau, 2006), that assess the degree of overlap between the ATTRIBUTE-SETs generated by a peer system and those produced by a human;
Accuracy, the proportion of peer outputs that were identical to model human outputs; and
Minimality, the proportion of peer outputs which contain no more attributes than necessary to distinguish the target (Dale and Reiter, 1995).
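As a rough sketch of the two overlap coefficients (not the challenge's actual Java evaluation tool), Dice and MASI can be computed over attribute sets as follows; the example attribute values are invented:

```python
def dice(a, b):
    """Dice coefficient: 2*|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b))

def masi(a, b):
    """MASI (Passonneau, 2006): Jaccard weighted by a monotonicity factor."""
    a, b = set(a), set(b)
    jaccard = len(a & b) / len(a | b)
    if a == b:
        m = 1.0      # identical sets
    elif a <= b or b <= a:
        m = 2 / 3    # one set subsumes the other
    elif a & b:
        m = 1 / 3    # overlap, but neither subsumes the other
    else:
        m = 0.0      # disjoint
    return jaccard * m

peer = {"type:chair", "colour:red"}
human = {"type:chair", "colour:red", "size:large"}
print(round(dice(peer, human), 3))  # 0.8
print(round(masi(peer, human), 3))  # 0.444
```

MASI penalises non-identical sets more heavily than Dice: here the peer output is a strict subset of the human one, so Jaccard (2/3) is scaled by the subsumption weight 2/3.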
Other metrics: For this task, we also ran a laboratory-based experiment in which both peer and human ATTRIBUTE-SETs (realised as strings using purpose-built software) were compared in terms of the time it took human subjects to read the descriptions and identify the entity being described. This measure consists of (i) the subjects' reaction times and (ii) the proportion of identification errors they made in the experiments.
All software can be found in the TUNA'08 participants' pack: [TUNA'08 pack]. Descriptions of the human evaluation methods can be found in the TUNA-AS'07 and TUNA'08 results reports (see below).
Detailed documentation for the TUNA-AS shared task can be found in the TUNA'08 participants' pack: [TUNA'08 pack].
The 2007 Shared Task results report and participants' reports can be found in the [UCNLG+MT Proceedings].
The following papers are from the 2008 edition of the Shared Task:
W08-1131: Albert Gatt; Anja Belz; Eric Kow
The TUNA Challenge 2008: Overview and Evaluation Results
W08-1132: Bernd Bohnet
The Fingerprint of Human Referring Expressions and their Surface Realization with Graph Transducers (IS-FP, IS-GT, IS-FP-GT)
W08-1133: Giuseppe Di Fabbrizio; Amanda J. Stent; Srinivas Bangalore
Referring Expression Generation Using Speaker-based Attribute Selection and Trainable Realization (ATTR)
W08-1134: Pablo Gervás; Raquel Hervás; Carlos León
NIL-UCM: Most-Frequent-Value-First Attribute Selection and Best-Scoring-Choice Realization
W08-1135: Diego Jesus de Lucena; Ivandré Paraboni
USP-EACH Frequency-based Greedy Attribute Selection for Referring Expressions Generation
W08-1136: John D. Kelleher; Brian Mac Namee
Referring Expression Generation Challenge 2008 DIT System Descriptions (DIT-FBI, DIT-TVAS, DIT-CBSR, DIT-RBR, DIT-FBI-CBSR, DIT-TVAS-RBR)
W08-1137: Josh King
OSU-GP: Attribute Selection Using Genetic Programming
W08-1138: Emiel Krahmer; Mariët Theune; Jette Viethen; Iris Hendrickx
GRAPH: The Costs of Redundancy in Referring Expressions
W08-1139: Sibabrata Paladhi; Sivaji Bandyopadhyay
JU-PTBSGRE: GRE Using Prefix Tree Based Structure