About the project: CIPROM/2023/066 “Statistical learning methods based on archetypal analysis for complex and big data with applications (ARCHETYPE)”
Principal Researcher: Irene Epifanio López at Universitat Jaume I and ValgrAI.
Team: Guillermo Ayala Gallego (Universitat de València); Vicente J. Bolós Lacave (Universitat de València); Ismael Cabero Fayos (Universitat Jaume I); Juan Domingo Esteve (Universitat de València); M. Victoria Ibáñez Gual (Universitat Jaume I); Amelia Simó Vidal (Universitat Jaume I).
Postdoctoral researcher: Aleix Alcacer (Universitat Jaume I).
Collaborators: Ximo Gual-Arnau, Marina Martínez-Garcia and Noelia Ventura-Campos (Universitat Jaume I), Rafael Benítez (Universitat de València), Rosa M. Crujeiras (Universidade de Santiago de Compostela), Adele Cutler (Utah State University), Daniel Fernández (Universitat Politècnica de Catalunya), Aurea Grané (Universidad Carlos III), Sebastian Mair (Linköping University), Morten Mørup (Technical University of Denmark) and Francesco Palumbo (Università degli Studi di Napoli Federico II).
Funding awarded: 484,590 euros.
Project duration: from 1st September 2024 to 31st August 2028
Funding agency: PROMETEO programme from Conselleria d’Educació, Cultura, Universitats i Ocupació de la Generalitat Valenciana
Our group recently defined a new unsupervised statistical learning technique for continuous multivariate data, called biarchetype analysis (biAA). It extends archetype analysis to find the archetypes of observations and features simultaneously. The idea of this exploratory tool is to represent observations and features by instances of pure types (biarchetypes) that are easy to interpret because they are mixtures of observations and features. In turn, the observations and features are expressed as mixtures of the biarchetypes, which also helps to understand the structure of the data. BiAA offers advantages over biclustering, especially in terms of interpretability: biarchetypes are extreme instances, which favors human understanding, as opposed to the centroids returned by biclustering.
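The two-sided mixture structure described above can be illustrated with a minimal NumPy sketch. It assumes a formulation in which the biarchetypes are Z = B X C and the data are approximated as X ≈ A Z D, where A and B have rows summing to 1 and C and D have columns summing to 1. The matrix names, the sizes and the random weights are illustrative assumptions, not the project's software. The sketch does not fit the model; it only shows how biarchetypes and the reconstruction are formed from mixtures.

```python
import numpy as np


def random_stochastic(rows, cols, rng, axis):
    """Random non-negative matrix whose entries sum to 1 along `axis`."""
    m = rng.random((rows, cols))
    return m / m.sum(axis=axis, keepdims=True)


def biarchetypes(X, B, C):
    """Z = B X C: each biarchetype mixes observations (via B) and features (via C)."""
    return B @ X @ C


def reconstruct(A, Z, D):
    """X_hat = A Z D: observations and features expressed as mixtures of biarchetypes."""
    return A @ Z @ D


rng = np.random.default_rng(0)
n, m = 6, 5   # observations, features (hypothetical sizes)
k, c = 2, 3   # archetypes for observations and for features (hypothetical)

X = rng.normal(size=(n, m))

# Placeholder weights; a real biAA fit would optimize these (assumption).
A = random_stochastic(n, k, rng, axis=1)  # n x k, rows sum to 1
B = random_stochastic(k, n, rng, axis=1)  # k x n, rows sum to 1
C = random_stochastic(m, c, rng, axis=0)  # m x c, columns sum to 1
D = random_stochastic(c, m, rng, axis=0)  # c x m, columns sum to 1

Z = biarchetypes(X, B, C)        # k x c matrix of biarchetypes
X_hat = reconstruct(A, Z, D)     # n x m approximation of X
rss = float(np.sum((X - X_hat) ** 2))  # residual sum of squares to be minimized
```

Because every weight matrix is a set of convex combinations, each entry of Z stays within the range of the entries of X, which is what makes biarchetypes readable as mixtures of actual observations and features.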
Our goals are to extend biAA to missing data, functional data, tensor data, and nominal and ordinal data, and to define robust biAA and non-linear biAA by kernelization or deep learning. We also aim to extend classical archetype analysis to other types of data, such as mixed data and complex data (directional data, trees, etc.), and to propose methodologies for regularized archetypal analysis and fair AA. Further objectives are to propose methodologies based on archetype analysis for other kinds of statistical learning problems, such as detecting and visualizing outliers in high-dimensional functional data and classifying functions using local archetypal analysis (functional classwise archetypal analysis). The software developed will be freely and openly available.
These techniques will be applied to different machine learning problems, ranging from collaborative filtering to neuroimaging, computer vision, document analysis, community detection, etc., as well as to mining big data from the health and sustainability fields.
These tools are attractive because their results are easily interpretable, even for non-experts.
Purpose of the PROMETEO call from the Conselleria d’Educació, Cultura, Universitats i Ocupació de la Generalitat Valenciana: "awarding grants in order to identify and support excellent R&D&I groups in the Comunitat Valenciana, and to strengthen their international profile and knowledge transfer".