Projects
Current Projects
Early Detection of HEalth Risks by textual analysis of MEDical documents
Reference: PID2022-136522OB-C21
Duration: 2023-2026 (3 years)
Financing institution: Ministerio de Ciencia e Innovación
Call: R&D&I Projects within the State Programme for Knowledge Generation and Scientific and Technological Strengthening of the R&D&I System - 2022
Participating institutions: Universidad Nacional de Educación a Distancia (UNED), Universidad del País Vasco (UPV)
Coordinating institution: Universidad Nacional de Educación a Distancia (UNED)
Principal investigators at UNED: Juan Martínez Romo and Lourdes Araujo Serna
The EDHER-MED project is multidisciplinary, merging the fields of health and informatics with a focus on early detection. Our hypothesis is that the data available in various medical documents make it possible to develop advanced tools that identify indications of the health problems under consideration and alert physicians. We will also improve information representation models for the biomedical domain. In addition, we propose the enrichment of medical ontologies, in which Spanish is less represented, as well as the automatic generation of clinical argumentation to support or oppose a given hypothesis, the definition and refinement of argument units to facilitate explainability, and the automatic generation of timelines to predict future diagnoses or patient admissions. Specifically, we intend to study the following health problems:
- The early detection of mental health problems in children and adolescents, with special attention to suicide.
- The early detection of HIV, exploring different techniques to extract key indicators of HIV status.
- The characterization of rare diseases and their effects on the mental health and well-being of children.
- The automatic discovery of potential risk factors associated with cardiovascular complications.
Digital OBSERvatory of MENtal Health in social networks for Healthcare Institutions based on Language Technologies
Reference: TED2021-130398B-C21
Duration: 2022-2025 (3 years)
Financing institution: Ministerio de Ciencia e Innovación
Call: Strategic Projects Oriented to the Ecological Transition and the Digital Transition - 2021
Participating institutions: Universidad Nacional de Educación a Distancia (UNED), Universidad del País Vasco (UPV)
Coordinating institution: Universidad Nacional de Educación a Distancia (UNED)
Principal investigators at UNED: Juan Martínez Romo and Lourdes Araujo Serna
Social networks such as Twitter, Instagram, SnapChat or Facebook host thematic communities related to psychological disorders, where thousands of people enter to share their emotional state, to provide or seek help, or simply to chat. Parallel to this world of social networks is the group of psychiatry and psychology professionals whose work would be essential to help most of the people who participate in these communities. The proposal of this project is precisely to link these two worlds by building a mental health observatory that will allow health professionals to make a faster and more informed digital transition. The project focuses on issues such as depression, anxiety, the potential risk of suicide (subproject GELP) and situations of loneliness and isolation (subproject EDHIA), problems that are differentiated with respect to gender, a characteristic that we also want to analyze.
Métodos de la Lingüística Computacional para la legibilidad y simplificación automática en humanidades digitales
Reference: PID2020-116001RB-C32
Duration: 2021-2024 (3 years)
Financing institution: Ministerio de Ciencia e Innovación
Call: 2020 call for "R&D&I Projects" within the State Programme for Knowledge Generation and Scientific and Technological Strengthening of the R&D&I System and the State Programme for R&D&I Oriented to the Challenges of Society
Participating institutions in the coordinated project CLARA-NLP: UAM (PID2020-116001RB-C31, sub-area INF), Universidad Autónoma de Madrid, Facultad de Filosofía y Letras; UNED, ETSI Informática; CSIC (PID2020-116001RA-C33, sub-area INF), Instituto de Lengua, Literatura y Antropología (ILLA)
Principal investigator at UNED: Ana García Serrano
Principal investigator of the coordinated project: Antonio Moreno Sandoval (UAM)
Inclusive Memory (Erasmus KA)
Inclusive Museums for well-being and health through the creation of a new shared memory
Duration: 1/11/2021 – 1/11/2024 (36 months)
Project n. 2021-1-IT02-KA220-HED-000031991
Partners: UNIVERSITA DEGLI STUDI DI MODENA E REGGIO EMILIA (Italy, MODENA), UNED (Spain, Madrid), Zètema Progetto Cultura srl (Italy, Roma), UNIVERSIDADE ABERTA (Portugal, Lisboa), HASKOLI ISLANDS (Iceland, REYKJAVIK), INTER ALIA (Greece, ATHINA), INSTITUT CATALA DE LA SALUT (Spain, BARCELONA)
Funds: euros.
Principal investigator at UNED: Covadonga Rodrigo
Past Projects
Spektrum (Erasmus+ KA205)
2019-3-PL01-KA205-077866. Action 2 - strategic partnerships.
https://mnk.pl/article/spectrum-project
Duration: 16/03/2020 – 15/03/2022 (24 months)
Participants: Muzeum Narodowe w Krakowie, PL (E10206701); Universita degli studi Roma Tre, IT (E10208847); UNED, ES (E10208821); Faro. Vlaams Steunpunt voor cultureel erfgoed vzw, BE (E10212964); Outside in Pathways, UK (940634900)
Principal investigator at UNED: Covadonga Rodrigo
VIGICOVID
Duration: 1/10/2020 - 30/9/2021
Financing institution: Fondo Supera COVID-19 (CRUE - CSIC - Banco Santander)
Many biomedical researchers around the world are directing their efforts towards the study of COVID-19. This effort generates a large volume of scientific publications, at a pace that makes it difficult to acquire new knowledge effectively. Information systems are needed to assist biomedical experts in accessing, querying and analysing these publications. This is precisely the general objective of the VIGICOVID project.
MISMIS
Duration: 1/01/2019 - 31/12/2021
Financing institution: Ministerio de Ciencia, Innovación y Universidades, 2018 call for knowledge-generation R&D projects of the State Programme for Knowledge Generation and Scientific and Technological Strengthening of the R&D&I System
Reference: PGC2018-096212-B-C32
Disinformation and aggressiveness in social media: bias, controversy and veracity.
This project will design algorithms to identify relevant relationships between different diseases, or between diseases and other medical concepts, that help in making diagnoses. These relationships can be encoded as association rules (ARs), which represent the medical knowledge underlying the set of electronic health records (EHRs) stored in the clinical information repository. However, extracting and selecting high-probability ARs is not a simple process. This project continues the EXTRAE project (IMIENS2017), which made significant progress in the selection of relevant association rules. In particular, a new semi-supervised learning algorithm was developed that can deliver high-precision results with a very small amount of training data. This algorithm led to a system that was evaluated on a small set of medical data. As a continuation of that work, this new proposal will refine the algorithm and generalize it to work with a large and much more specific dataset (cohort), based on a concrete domain of medical knowledge and extracted from repositories specialized in secondary use (research), with real, coded data from collaborating hospitals.
Learning to Interact with Humans by Lifelong Interaction with Humans
Duration: 2018-2020
Financing institution: EU (CHIST-ERA 2016)
The LIHLITH project is a fundamental pilot research project which introduces a new lifelong learning framework for the interaction of humans and machines on specific domains. A Lifelong Learning system learns different tasks sequentially, over time, getting better at solving future related tasks based on past experience. LIHLITH will focus on human-computer dialogue, where each dialogue experience is used by the system to learn to better interact, based on the success (or failure) of previous interactions. The key insight is that the dialogue will be designed to produce a reward, allowing the chatbot system to know whether the interaction was successful or not. The reward will be used to train the domain and dialogue management modules of the chatbot, improving performance and reducing development cost, both on a single target domain and especially when moving to new domains.
In this project we aim to design algorithms that help identify relevant relationships between different diseases. This information is very useful for making new diagnoses, testing new treatments or drugs, or anticipating the possible course of a disease. Many diseases share one or more aspects, such as symptoms, course or treatment, but this does not always mean that they are related. We therefore propose a system able to detect relationships between diseases that can be considered significant. Significance will be given by a coincidence of aspects beyond chance, captured by defining an appropriate statistical model. Relationships between diseases can be established on the basis of different patterns, separately or jointly: co-occurrence, common symptoms, similar treatments, etc. These relationships can be encoded as association rules (ARs), which can be regarded as ways of representing the medical knowledge underlying the set of electronic health records (EHRs) stored in the clinical information repository. The project falls within the IMIENS call for grants for joint research projects between UNED research groups and the Instituto de Salud Carlos III.
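As an illustration of how an association rule between two diseases can be scored against chance, the sketch below computes support, confidence and lift over a handful of patient records. The records, function names and the use of lift are assumptions made for this example; the project description does not say which statistical model was actually used. A lift above 1 indicates that two conditions co-occur more often than they would if they were independent.

```python
from itertools import combinations

# Hypothetical toy records: each set lists the conditions coded for one patient.
records = [
    {"diabetes", "hypertension"},
    {"diabetes", "hypertension", "obesity"},
    {"asthma"},
    {"hypertension"},
    {"diabetes", "obesity"},
    {"asthma", "obesity"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    return sum(itemset <= r for r in records) / len(records)


def rule_metrics(antecedent, consequent, records):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s_a = support(antecedent, records)
    s_c = support(consequent, records)
    s_ac = support(antecedent | consequent, records)
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return s_ac, confidence, lift


def pairwise_rules(records, min_lift=1.0):
    """List single-item rules whose lift exceeds `min_lift`, strongest first."""
    items = sorted(set().union(*records))
    rules = []
    for a, b in combinations(items, 2):
        for x, y in ((a, b), (b, a)):
            s, c, l = rule_metrics({x}, {y}, records)
            if s > 0 and l > min_lift:
                rules.append((x, y, s, c, l))
    return sorted(rules, key=lambda r: -r[4])


if __name__ == "__main__":
    for x, y, s, c, l in pairwise_rules(records):
        print(f"{x} -> {y}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

Lift is only one simple way to capture "beyond chance"; with real clinical data a proper significance test and a correction for multiple comparisons would likely be needed as well.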
PLN.NET
Duration: 2016-2018
Financing institution: Ministerio de Economía, Industria y Competitividad
Thematic network of excellence funded by the Ministerio de Economía, Industria y Competitividad (reference TIN2016-81739-REDT) to create active communication forums among Natural Language Processing researchers, in which to reach common ground in the process of standardizing their services. The NLP&IR group is one of the members; for more information: https://gplsi.dlsi.ua.es/pln/node/32
VEMODALEN
Duration: 2016-2019
Financing institution: Ministerio de Economía y Competitividad
Call: 2015, Modality 1: R&D&I Projects, State Programme for Research, Development and Innovation Oriented to the Challenges of Society
For an average citizen of our digital era, the problem is no longer finding relevant information, but assimilating the massive amount of relevant available information at any moment in time. This is not possible without the help of a new generation of machines able to digest all relevant sources into a readable, personalized synthesis of the stream of relevant information. Such machines need to acquire two crucial, interdependent skills: (i) the ability to automatically discern when different texts convey approximately the same message; and (ii) the ability to discern the credibility of messages. Our goal is to address the challenge of computing both textual similarity and source authority in online media, focusing on three challenging tasks in three relevant application scenarios: identification and synthesis of controversy in the medical domain, generation of reputation profiles for companies and brands, and recommendation of instructional materials in e-learning environments.
MANTRA-MED
Modelado y AutoMatización de exTracción de Relaciones y cAtegorización de informes MEDicos para la recomendación de códigos CIE-10 (TIN2016-77820-C3-2-R)
Duration: 2017-2018
Financing institution: Ministerio de Economía y Competitividad
The automatic processing of Electronic Medical Records (EMRs) poses challenges for the field of Natural Language Processing (NLP) which to a great extent are related to the adaptation of existing techniques to the domain of medicine. On the other hand, tasks such as assigning diagnostic and procedure codes to EMRs, carried out manually by experts, raise the need to explore and propose Text Mining and Information Retrieval techniques which allow for automatic inference of the relevant codes from EMR descriptions.
The Alcorcon Foundation University Hospital (HUFA in Spanish), with which the sub-project will collaborate, is a public university hospital which is part of the Madrid Health Service (SERMAS in Spanish). Like all Madrid Health Service hospitals, it moved from the old CIE-9 discharge report coding scheme to the newer CIE-10 scheme on 1 January 2016. This change has resulted in a 75% decrease in coding team performance, even though these teams are made up of personnel trained for the task. There are commercial applications available which aid in assigning CIE-10 codes by using existing mappings between CIE-9 and CIE-10. Nevertheless, the greater detail and comprehensiveness of CIE-10, combined with the fact that there are combination codes present in CIE-9 with no corresponding code in CIE-10, makes this mapping impossible in a large number of cases. All hospitals would benefit from having a tool which is able to automatically assign codes to diagnoses and procedures directly from the free text found in medical reports. This health-sector problem will be the main focus and use case of this subproject.
We propose to study, adapt and develop NLP and unsupervised learning techniques, with which this group has a great deal of experience, in order to develop a tool which recommends and assigns CIE-10 codes to discharge reports. An unsupervised approach is imperative given the current limited availability of manually written records with which to train supervised systems. As records written in Spanish will be readily available, we will focus on this language, although the methods can be applied to other languages, and it is expected that they will be validated by the work done with other languages in the coordinated project.
The development of this tool encompasses research challenges in several diverse fields: anonymization of reports, lexical normalization within the domain, disambiguation of domain acronyms, representation of the documents, identification of concepts/expressions, extraction of relationships, structured information retrieval and unsupervised learning. The use of unsupervised learning techniques will be studied in order to categorize discharge reports with CIE-10 codes, assessing data modeling by means of distributed representations with deep learning algorithms and Information Retrieval techniques. Likewise, statistical models will be applied in order to identify the underlying relationships among reports written with CIE-10 codes. This knowledge base of relationships will make it possible to recommend codes for new reports. The ideal method for combining the different code recommendation algorithms will be analyzed by studying techniques based on automatic and heuristic learning.
Museología e integración social: la difusión del patrimonio artístico y cultural del Museo del Prado a colectivos con especial accesibilidad (invidentes, sordos y reclusos)
Duration: 2016-2018
Financing institution: 2015 call for R&D Activity Programmes among research groups of the Comunidad de Madrid, organized by the Dirección General de Universidades e Investigación of the Consejería de Educación, Juventud y Deporte of the Comunidad de Madrid (S2015/HUM3494)
The work is structured around three focal points: the first will detect the specific needs and interests of the different groups; the second will deal with the design and creation of applications, systems and virtual exhibitions adapted for these three groups, based on virtual thematic tours or visits of the Museo del Prado; finally, the third will seek to invigorate an international network that connects the social projection of museology with its application to making culture accessible to specific groups, all through the development of new technologies.
The concern for the heritage of the Community of Madrid, especially the art collection of the Museo del Prado, leads us to consider the museum as a "cultural artifact" that goes beyond its research and conservation functions and seeks to bring the museum to viewers, whatever their diversity and condition. The aim is to let them share in the contact with artistic reality and to invite them not only to contemplate a work of art directly but also to interact with the institution and its collections, in order to overcome the barrier of the sacredness of works of art and the elitist character that the nineteenth-century perception of traditional collections can convey.
EXTracción de RElaciones entre Conceptos Médicos en fuentes de información heterogéneas
Duration: 2014-2017
Financing institution: MINECO (TIN2013-46616-C2-2-R)
The overall objective of this project is to address the generation of techniques and tools to allow efficient and intelligent access to the contents of medical documents of multilingual nature such as i) general scientific documents, ii) medical records and iii) general information on the Internet. The project will demonstrate, through a series of use cases, the benefits of the application of language technology in the health sector, using advanced Natural Language Processing techniques such as information retrieval applied to large amounts of resources about medical information on the Internet.
Voxpopuli
Duration: 2014-2016
Financing institution: Ministerio de Economía y Competitividad (TIN2013-4709-C3-1P)
Online Reputation Management has recently become a fundamental aspect of Public Relations for organizations, personalities and entities in general. The very reason why the online dimension of reputation is now essential (it is the biggest, richest and most up-to-date source of information, opinions and attitudes around any entity) is also the reason why a manual analysis of information streams in media and social networks is not viable. Automatic processing of online information crucially depends on advances in many research fields (data structures and algorithms for real-time Natural Language Processing, Opinion Mining, Textual Synthesis, Novelty Detection and Recommendation, multimedia search, social network analysis, etc.) that, up to now, have paid little attention to the online reputation scenario. For instance, opinion mining has focused on product reviews, and its results are not applicable to the (much more complex) problem of evaluating how the content of information streams in social networks may affect the reputation of a company. The project aims to create a new generation of online reputation monitoring systems, able to understand, process, aggregate and synthesize, in real time, facts, opinions and attitudes around an entity, to present such information in multiple dimensions, and to interact with reputation experts so that they can accomplish their task better and faster. Our research will range from fundamental problems such as textual similarity or data structures for real-time Natural Language Processing to prototype validation with reputation experts. Besides algorithms and prototypes, we will also create and distribute test collections to evaluate all relevant technologies in the reputation management scenario.
Readers: Evaluation And DEvelopment of Reading Systems
Duration: 2013 - 2015
Financing institution: EU (CHIST-ERA 2011) + Mineco (PCIN-2013-002-C02-01)
The READERS project proposes new unsupervised computational models to automatically extract background knowledge after reading large amounts of unstructured text. This knowledge will be in the form of classes, categorized entities and predicates whose arguments are typified by probability distributions over classes. Classes themselves will be automatically organized into taxonomies related to the predicates in which they participate.
LiMoSINe
Linguistically Motivated Semantic Aggregation Engines
Duration: 2011-2014
Financing institution: European Commission, FP7-ICT
The LiMoSINe vision is to transition access to online information from a document-centric search paradigm focused on returning disconnected atomic pieces to a truly semantic aggregation paradigm. In this new paradigm, machines will understand a user's intent, discover and organize facts, identify opinions, experiences and trends, all from inherently multilingual online sources and open knowledge repositories. LiMoSINe's aggregation engines will automatically organize search results in semantically meaningful ways.
ELIAS
Evaluating Information Access Systems
Duration: 2011-2016
Financing institution: European Science Foundation
ELIAS will define a new measurement paradigm for the evaluation of search engines based on so-called living laboratories. This paradigm involves (i) exploitation of novel market places and forums where large numbers of users are recruited into early stage evaluation experiments to test a particular aspect of an information access system; and (ii) using operational systems as experimental platforms on which to conduct user-based experiments at scale.
The automatic encyclopedia of people and organizations.
Duration: 2010-2012
Financing institution: MICINN (TIN2010-21128-C02)
The main goal of the project is to develop algorithms, techniques and systems able to mine and aggregate information relative to people and organizations from unstructured and structured web sources, such as social networks, blogs, news, semantic web data, and websites in general.
Mejorando el Acceso, el Análisis y la Visibilidad de la Información y los Contenidos Multilingüe y Multimedia en Red para la Comunidad de Madrid
Duration: 2010-2013
Financing institution: Regional Government of Madrid (S2009/TIC-1542)
Improving access, analysis and visibility of multilingual and multimedia Web contents.
Buscamedia
Duration: 2009-2012
Financing institution: CDTI (CEN-20091026)
Development of a true Multimedia Semantic Search Engine.
Financing institution: Sub-contracts by Grupo ALMA
Summary: Online reputation management
Quantitative Evaluation of Academic Websites Visibility
Duration: 2008-2010
Financing institution: CICYT (TIN 2007-67581-C02-01)
Automated classification of academic websites by topic and language, in order to build rankings from them. The main goal of the project is to improve the accessibility and visibility of academic information on the World Wide Web.
Evaluation Best Practice and Collaboration for Multilingual Information Access
Financing institution: European Commission
TrebleCLEF supports the development and consolidation of expertise in the multidisciplinary research area of multilingual information access (MLIA) and disseminates this knowhow to the application communities through a set of complementary activities.
Text-Mess (subproyecto INES)
Duration: 2007-2009
Financing institution: CICYT (TIN2006-15265-C06-02)
Multilingual/Multimedia Access To Cultural Heritage
Duration: 2006-2009
Financing institution: European Commission, 6FP (STREP 033104)
MultiMatch plans to develop a multilingual search engine specifically designed for access, organisation and personalised presentation of cultural heritage information.
MAVIR
Mejorando el acceso y visibilidad de la información multilingüe en red para la Comunidad de Madrid
Duration: 2006-2009
Financing institution: Comunidad de Madrid, IV PRICIT, (S-0505/TID/0267)
MAVIR is a research network formed by a multidisciplinary team of scientists, engineers, linguists and documentalists to carry out an integrating effort in research, training and technology transfer.
Quality Labelling of Medical Web Content using Multilingual Information Extraction.
Duration: 2006-2008
Financing institution: European Commission (EC Programme: Public Health 61383)
SWIISA
Speech Web and Images Interactive Search Assistants
Duration: 2006-2007
Financing institution: UNED
Study of the application of interactive assistants to three lines of work: cross-lingual search over images, over the Web and over automatic transcriptions produced by speech recognizers.
R2D2 (subproyecto Syembra)
Recuperación de Respuestas en Documentos Digitalizados
Duration: 2003-2006
Financing institution: CICYT (TIC2003-07158-C04)
Evaluation of cross-lingual answer retrieval systems.
RIBIDI
Recuperación de Información en Bibliotecas Digitales
Duration: 2001-2004
Financing institution: CYTED VII.19
Ibero-American cooperation in research and development of technologies for information retrieval and digital libraries.
Cross-Language Evaluation Forum
Duration: 2001-2003
Financing institution: European Commission, 5FP (IST-2000-31002)
Evaluation of Cross-Language Information Retrieval Systems for European Languages
ETB
European Schools Treasury Browser
Duration: 2000-2002
Financing institution: European Commission, 5FP (IST Programme)
Access to meta-information about educational resources and new technologies in Europe.
DELOS: a Network of Excellence on Digital Libraries
Duration: 2000-2002
Financing institution: European Commission, IST Programme
The main objective of DELOS is to coordinate a joint programme of activities of the major European teams working in digital library related areas.
News Agencies Multilingual Information Categorization
Duration: 1999-2002
Financing institution: European Commission, 5FP (IST-1999-12392)
NAMIC's main objective is to develop and bring to a marketable stage advanced NLP technologies for multilingual news customization and broadcasting through distributed services.
EuroWordnet
Duration: 1996-1999
Financing institution: European Commission, 4FP (Telematics, LE 4003)
The project aimed to build a multilingual lexical database with semantic relations between words in eight European languages (Spanish, English, Italian, Dutch, French, German, Estonian and Czech). Every monolingual wordnet is linked to the others via an InterLingual Index derived from WordNet 1.5.
A project under the auspices of ELSNET and ACO*HUM excellence networks to develop 6 specialization courses around Natural Language Processing and Speech Recognition and synthesis. Our task was to develop an open distance learning course on Natural Language Processing and Information Retrieval.
Multilingual named-entity recognition, hyperlinking, phrase extraction, summarization and semantic indexing for information access on a digital news archive.
RILE
Servidor de Recursos para el Desarrollo de la Ingeniería Lingüística en Español
Duration: 1999-2000
Financing institution: M.I.N.E.R.
The goal of RILE is to develop a pilot for a server with resources, tools and information related to the development of applications in the field of Language Engineering for Spanish.
Duration: 1996-1999
Financing institution: CICyT (TIC96-1243-C03-01)
Development and integration of Language Engineering resources and tools for Spanish, Catalan, Basque and English and demonstration of such tools in a multilingual search engine with NLP capabilities.
The goal was to explore the utility of constructing a multilingual lexical knowledge base from machine-readable versions of conventional dictionaries by exploring the utility of machine readable textual corpora as a source of lexical information not coded in conventional dictionaries, and by adding dictionary publishing partners to exploit the lexical database and corpus extraction software developed by the projects for conventional lexicography.