Ph.D. in Computer Science (UNED) and assistant professor at the Departamento de Lenguajes y Sistemas Informáticos, UNED. Member of the UNED research group in Natural Language Processing and Information Retrieval.
I am currently an Assistant Professor at Universidad Nacional de Educación a Distancia (UNED), Spain, and a member of its research group in natural language processing and information retrieval. My interests so far have focused on two lines of research. Firstly, the axiomatisation of evaluation metrics on the basis of measurement theory, making contributions in classification, ranking, diversity, clustering, automatic summarisation, etc. Secondly, the extension of information theory over continuous spaces, making contributions in document representation, similarity metrics, unsupervised fusion of rankings and distributional compositional models.
(I) Evaluation metrics:
My studies on evaluation metrics for multiple IA tasks are grounded in axiomatic methodologies and measurement theory. The most relevant contributions in this line are:
On the nature of information access evaluation metrics: a unifying framework. (E. Amigó and S. Mizzaro, IRJ 2020): This paper provides a uniform, general formal account of evaluation metrics for ranking, classification, clustering, and other information access problems. The approach extends Measurement Theory, modelling the notion of measurement closeness at different scales.
An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results. (E. Amigó, J. Gonzalo, S. Mizzaro and J. Carrillo-de-Albornoz, ACL 2020) We propose a new metric for Ordinal Classification, Closeness Evaluation Measure (CEM), that is rooted in Measurement Theory and Information Theory. The results indicate that the proposed metric captures quality aspects from different traditional tasks simultaneously.
A comparison of filtering evaluation metrics based on formal constraints (E. Amigó, J. Gonzalo, F. Verdejo and D. Spina, IRJ 2019). A study leading to a typology of measures for Document Filtering, based on a set of three (mutually exclusive) formal properties that help to understand the fundamental differences between measures and to determine which ones are more appropriate depending on the application scenario.
An Axiomatic Analysis of Diversity Evaluation Metrics: Introducing the Rank-Biased Utility Metric (E. Amigó, D. Spina and J. Carrillo-de-Albornoz, SIGIR 2018). We define a constraint-based axiomatic framework to study the suitability of existing diversity evaluation metrics. The analysis informed the definition of Rank-Biased Utility (RBU). Our experiments show that the proposed metric captures quality criteria reflected by different metrics, making it suitable in the absence of knowledge about particular features of the scenario under study.
A general evaluation measure for document organization tasks. (E. Amigó, J. Gonzalo, F. Verdejo, SIGIR 2013) A set of five axioms for IR evaluation metrics, and the definition of Reliability and Sensitivity: a metric that can be applied to any mixture of ranking, clustering and filtering tasks. A high score according to the harmonic mean of Reliability and Sensitivity ensures a high score with any of the most popular evaluation metrics in all the Document Retrieval, Clustering and Filtering datasets used in our experiments.
Combining Evaluation Metrics via the Unanimous Improvement Ratio and its Application to Clustering Tasks. (E. Amigó, J. Gonzalo, J. Artiles, F. Verdejo, JAIR 2011). A measure based on Conjoint Measurement Theory that indicates how robust the measured differences are to changes in the relative weights of the individual metrics (e.g. Precision and Recall) in metric combination functions. The empirical results confirm the validity and usefulness of the metric for the Text Clustering problem.
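As an illustration of the idea behind the Unanimous Improvement Ratio, here is a minimal Python sketch. It assumes the commonly cited form of the measure: the number of test cases in which system a is at least as good as system b on every metric, minus the number in which b is at least as good as a on every metric, divided by the total number of test cases. The function name `uir` and the input layout (one tuple of metric values per test case) are hypothetical choices made for this example, not taken from the paper.

```python
def uir(scores_a, scores_b):
    """Unanimous Improvement Ratio sketch (assumed formulation).

    scores_a, scores_b: lists of equal length; each element is a tuple of
    metric values (e.g. (precision, recall)) for one test case.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")

    # Test cases where a is at least as good as b on every metric.
    a_unanimous = sum(
        all(x >= y for x, y in zip(case_a, case_b))
        for case_a, case_b in zip(scores_a, scores_b)
    )
    # Test cases where b is at least as good as a on every metric.
    b_unanimous = sum(
        all(y >= x for x, y in zip(case_a, case_b))
        for case_a, case_b in zip(scores_a, scores_b)
    )
    # Full ties are counted on both sides and therefore cancel out.
    return (a_unanimous - b_unanimous) / len(scores_a)


# Toy data: (precision, recall) per test case for two hypothetical systems.
system_a = [(0.8, 0.6), (0.5, 0.9), (0.7, 0.7), (0.6, 0.7)]
system_b = [(0.7, 0.5), (0.6, 0.8), (0.7, 0.7), (0.5, 0.6)]
uir_ab = uir(system_a, system_b)
```

A value close to 1 or -1 suggests that the difference between the two systems does not depend on how the individual metrics are weighted; a value near 0 suggests that it does.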
An Evaluation Framework for Aggregated Temporal Information Extraction (Enrique Amigó, Javier Artiles and Heng Ji, SIGIR 2011) This paper focusses on the representation and evaluation of temporal information about a certain event or entity. Given that the resulting temporal information can be vague, an evaluation framework must capture and compare the temporal uncertainty of system outputs and human-assessed gold-standard data. In this paper, we define a temporal representation, formal constraints and an evaluation metric. The task setting and the evaluation measure presented here have been introduced in the TAC 2011 Knowledge Base Population evaluation for the Temporal Slot Filling task.
A comparison of extrinsic clustering evaluation metrics based on formal constraints. (E. Amigó, J. Gonzalo, J. Artiles, F. Verdejo, IRJ 2009) A few intuitive formal constraints on clustering metrics which shed light on which aspects of the quality of a clustering are captured by different metric families. Our analysis of a wide range of metrics shows that only BCubed satisfies all formal constraints. We also extend the analysis and the BCubed metric to overlapping clustering.
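For readers unfamiliar with BCubed, the following is a minimal Python sketch of BCubed precision and recall for the non-overlapping case only. It assumes the standard definition: the precision of an item is the fraction of items in its cluster that share its gold category, the recall of an item is the fraction of items in its gold category that share its cluster, and both are averaged over all items. The function name `bcubed` and the dictionary-based input format are hypothetical choices for illustration; the overlapping extension mentioned above is not covered.

```python
from collections import Counter


def bcubed(clusters, categories):
    """BCubed precision and recall for non-overlapping clusterings.

    clusters:   dict mapping item -> system cluster label
    categories: dict mapping item -> gold category label
    """
    items = list(clusters)
    if not items or set(items) != set(categories):
        raise ValueError("both mappings must cover the same non-empty item set")

    cluster_size = Counter(clusters[i] for i in items)
    category_size = Counter(categories[i] for i in items)
    # Number of items sharing both a cluster and a category.
    joint_size = Counter((clusters[i], categories[i]) for i in items)

    precision = sum(
        joint_size[(clusters[i], categories[i])] / cluster_size[clusters[i]]
        for i in items
    ) / len(items)
    recall = sum(
        joint_size[(clusters[i], categories[i])] / category_size[categories[i]]
        for i in items
    ) / len(items)
    return precision, recall


# Toy example: four items, two system clusters, two gold categories.
system = {"a": 1, "b": 1, "c": 1, "d": 2}
gold = {"a": "x", "b": "x", "c": "y", "d": "y"}
bcubed_precision, bcubed_recall = bcubed(system, gold)
```

In this toy example the large cluster mixes two categories, which lowers precision, while category "y" is split across two clusters, which lowers recall.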
(II) Observational Information Theory
In this research line, we have defined a generalization of Shannon's information content for continuous feature values called Observational Information Quantity (OIQ). The following papers describe its implications in document representation, heterogeneous feature aggregation, similarity axiomatics, ranking effectiveness and ranking fusion.
A Formal Account of Effectiveness Evaluation and Ranking Fusion. (E. Amigó, F. Giner, S. Mizzaro, D. Spina, ICTIR 2018) In this paper the observational information framework is used to formalize: (i) system effectiveness as an information theoretic similarity between system outputs and human assessments, and (ii) ranking fusion as an information quantity measure. As a result, the proposed effectiveness metric improves on popular metrics in terms of formal constraints. In addition, our empirical experiments suggest that it captures quality aspects from traditional metrics, while the reverse is not true. Our work also advances the understanding of the theoretical foundations of the empirically known phenomenon of effectiveness increase when combining retrieval system outputs in an unsupervised manner.
Integrating learned and explicit document features for reputation monitoring in social media. (F. Giner, E. Amigó, F. Verdejo, KAIS 2020) In this paper, we define the OIQ-based representation and its application to the aggregation of quantitative and discrete features in the context of on-line reputation management. The approach makes it possible to integrate, without supervision, intrinsic features (words, n-grams) with quantitative features based on training data (proximity to classes or clusters).
On the foundations of similarity in information access. (E. Amigó, J. Gonzalo, F. Giner and F. Verdejo, IRJ 2019) In this paper we show how axiomatic explanations of similarity from other fields do not completely fit the notion of similarity function in Information Access problems. On the basis of the observational information framework, we propose a new set of formal constraints for similarity functions. Based on these formal constraints, we introduce a new parameterized similarity function, the information contrast model (ICM), which generalizes both pointwise mutual information and Tversky’s linear contrast model. Unlike previous similarity models, ICM satisfies the proposed formal constraints for a certain range of values of its parameters.
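To make the relation between ICM and pointwise mutual information concrete, the block below sketches the form in which ICM is usually written. This is an assumption based on the description above, not a quotation from the paper: the symbols $I(\cdot)$, $\alpha_1$, $\alpha_2$ and $\beta$ are illustrative notation, with $I(X) = -\log P(X)$ the information content of a feature set $X$ and $X \cup Y$ the joint occurrence of the features of both objects.

```latex
\mathrm{ICM}_{\alpha_1,\alpha_2,\beta}(X, Y)
  = \alpha_1\, I(X) + \alpha_2\, I(Y) - \beta\, I(X \cup Y),
\qquad I(X) = -\log P(X)

% Special case \alpha_1 = \alpha_2 = \beta = 1:
\mathrm{ICM}_{1,1,1}(X, Y)
  = I(X) + I(Y) - I(X \cup Y)
  = \log \frac{P(X \cup Y)}{P(X)\, P(Y)}
  = \mathrm{PMI}(X, Y)
```

Under this reading, pointwise mutual information is recovered when all three parameters equal 1, and other parameter settings weight the individual and joint information quantities differently.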
I have combined my research career with musical projects.
Get in touch at [firstname.lastname@example.org]