Enrique Amigó Cabrera

Bio

Ph.D. in Computer Science (UNED). Profesor Contratado Doctor (assistant professor) at the Departamento de Lenguajes y Sistemas Informáticos, UNED. Member of the UNED group in Natural Language Processing and Information Retrieval.

Phone Number:

(+34) 91 398 8651

Postal Address:

E.T.S.I. Informática de la UNED

c/ Juan del Rosal, 16 (Ciudad Universitaria)

28040 Madrid

Office

2.07

RESEARCH INTERESTS

My main research interests are Textual Information Access and evaluation methodologies. Relevant topics include document retrieval, topic detection, clustering, classification, summarization, and translation.

SOFTWARE

Heterogeneity Based Ranking:

The heterogeneity property of text evaluation measures states that the probability of a real (i.e., human-assessed) similarity increase is directly related to the heterogeneity of the set of automatic similarity measures that corroborate that increase. This script implements a method for combining similarity measures based on the heterogeneity principle. The method is completely unsupervised (it does not use any kind of human assessment of the quality of the measures being combined) and yields top-performing combined similarity measures in multiple tasks, such as Document Clustering, Textual Entailment, Semantic Text Similarity, and automatic Machine Translation and Summarization evaluation.
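
The exact algorithm is described in the associated publications; purely as an illustration, the Python sketch below shows a simple unsupervised rank combination in the same spirit, scoring each text pair by the fraction of measures that corroborate a similarity increase over the other pairs. The function names and data layout are hypothetical, not taken from the released script.

    # Illustrative sketch only: unsupervised combination of similarity
    # measures by corroboration counting (not the exact HBR algorithm).
    def combined_ranking(pairs, measures):
        # pairs: list of (text_a, text_b); measures: similarity functions.
        sims = [[m(a, b) for (a, b) in pairs] for m in measures]

        def support(i, j):
            # Fraction of measures scoring pair i strictly above pair j.
            return sum(s[i] > s[j] for s in sims) / len(sims)

        n = len(pairs)
        scores = [sum(support(i, j) for j in range(n) if j != i) / (n - 1)
                  for i in range(n)]
        # Indices of pairs, from most to least corroborated similarity.
        return sorted(range(n), key=lambda i: scores[i], reverse=True)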

Unanimous Improvement Ratio:

Many Artificial Intelligence tasks cannot be evaluated with a single quality criterion, and some sort of weighted combination is needed to produce system rankings. A problem with weighted combination measures is that slight changes in the relative weights may produce substantial changes in the system rankings. This software implements the Unanimous Improvement Ratio (UIR), a measure that complements standard metric combination criteria (such as van Rijsbergen's F-measure) and indicates how robust the measured differences are to changes in the relative weights of the individual metrics. UIR is meant to elucidate whether a perceived difference between two systems is an artifact of how the individual metrics are weighted.
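
As a rough illustration, the sketch below implements one common reading of UIR: count the test cases where system A matches or improves on system B across all metrics simultaneously, do the same in the other direction, and normalize the difference by the number of test cases. The data layout is hypothetical, not taken from the released software.

    # Illustrative sketch of the Unanimous Improvement Ratio (UIR).
    # scores_a[i] and scores_b[i] are metric tuples for test case i,
    # e.g. (precision, recall).
    def uir(scores_a, scores_b):
        a_ge_b = sum(all(x >= y for x, y in zip(va, vb))
                     for va, vb in zip(scores_a, scores_b))
        b_ge_a = sum(all(y >= x for x, y in zip(va, vb))
                     for va, vb in zip(scores_a, scores_b))
        return (a_ge_b - b_ge_a) / len(scores_a)

    a = [(0.8, 0.6), (0.7, 0.7), (0.9, 0.5)]
    b = [(0.7, 0.5), (0.7, 0.6), (0.9, 0.6)]
    print(uir(a, b))  # 0.33: A unanimously beats B on one more case than B beats A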

Reliability and Sensitivity (extended BCubed):

Some key Information Access tasks -- Document Retrieval, Clustering, Filtering, etc. -- can be seen as instances of a generic "document organization" problem that establishes priority and relatedness relationships between documents. This software implements two complementary evaluation measures -- Reliability and Sensitivity -- for the generic document organization task, derived from a proposed set of formal constraints (properties that any suitable measure must satisfy).

For each of the tasks subsumed under the document organization problem, Reliability and Sensitivity satisfy more formal constraints than previous evaluation metrics. In addition, their most characteristic feature is strictness: in order to reach high Reliability and Sensitivity values, a system must also achieve high values on all standard evaluation measures.
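
Reliability and Sensitivity extend the BCubed family of measures; as a point of reference, the sketch below shows standard BCubed precision and recall for the plain clustering case (the extension to priority and relatedness relationships is described in the associated paper). Variable names are illustrative.

    # Illustrative sketch: standard BCubed precision/recall for clustering.
    # system and gold map each item to a cluster id / category label.
    def bcubed(system, gold):
        items = list(system)
        p = r = 0.0
        for e in items:
            same_cluster = [o for o in items if system[o] == system[e]]
            same_category = [o for o in items if gold[o] == gold[e]]
            correct = [o for o in same_cluster if gold[o] == gold[e]]
            p += len(correct) / len(same_cluster)   # item-wise precision
            r += len(correct) / len(same_category)  # item-wise recall
        return p / len(items), r / len(items)

    system = {"d1": 1, "d2": 1, "d3": 2}
    gold = {"d1": "a", "d2": "a", "d3": "a"}
    print(bcubed(system, gold))  # (1.0, 0.555...)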