Projects
Current Projects
CLARA-HD
Métodos de la Lingüística Computacional para la legibilidad y simplificación automática en humanidades digitales
PID2020-116001RB-C32
Duration: 2021-2024 (3 years)
Financing institution: Ministerio de Ciencia e Innovación, 2020 call for «R&D&I Projects» within the State Programme for Knowledge Generation and Scientific and Technological Strengthening of the R&D&I System and the State Programme for R&D&I Oriented to the Challenges of Society.
Participating institutions in the coordinated project CLARA-NLP: UAM (PID2020-116001RB-C31, Sub-area INF), Universidad Autónoma de Madrid, Facultad de Filosofía y Letras; UNED, ETSI Informática; CSIC (PID2020-116001RA-C33, Sub-area INF), Instituto de Lengua, Literatura y Antropología (ILLA).
Principal investigator at UNED: Ana García Serrano.
Principal investigator of the coordinated project: Antonio Moreno Sandoval (UAM)
Inclusive Memory (Erasmus KA)
Inclusive Museums for well-being and health through the creation of a new shared memory
Duration: 1/11/2021 – 1/11/2024 (36 months)
Project n. 2021-1-IT02-KA220-HED-000031991
Partners: UNIVERSITA DEGLI STUDI DI MODENA E REGGIO EMILIA (Italy, MODENA), UNED (Spain, Madrid), Zètema Progetto Cultura srl (Italy, Roma), UNIVERSIDADE ABERTA (Portugal, Lisboa), HASKOLI ISLANDS (Iceland, REYKJAVIK), INTER ALIA (Greece, ATHINA), INSTITUT CATALA DE LA SALUT (Spain, BARCELONA).
Funds: euros.
Principal investigator of the project at UNED: Covadonga Rodrigo
Spektrum (Erasmus+ KA205)
2019-3-PL01-KA205-077866. Action 2 - strategic partnerships.
https://mnk.pl/article/spectrum-project
Duration: 16/03/2020 – 15/03/2022 (24 months)
Participants: Muzeum Narodowe w Krakowie, Poland, PL (E10206701); Universita degli studi Roma Tre IT (E10208847); UNED ES (E10208821); Faro. Vlaams Steunpunt voor cultureel erfgoed vzw BE (E10212964); Outside in Pathways, UK (940634900).
Principal investigator of the project at UNED: Covadonga Rodrigo
Duration: 1/10/2020 - 30/9/2021
Financing institution: Fondo Supera COVID-19 (CRUE - CSIC - Banco Santander)
MISMIS
Duration: 1/01/2019 - 31/12/2021
Financing institution: Ministerio de Ciencia, Innovación y Universidades, 2018 call for knowledge-generation R&D projects of the State Programme for Knowledge Generation and Scientific and Technological Strengthening of the R&D&I System. Reference PGC2018-096212-B-C32.
EXTRAE II
Duration: 2019-2021
Financing institution: IMIENS
Learning to Interact with Humans by Lifelong Interaction with Humans
Duration: 2018-2020
Financing institution: EU (CHIST-ERA 2016)
The LIHLITH project is a fundamental pilot research project which introduces a new lifelong learning framework for the interaction of humans and machines on specific domains.
A Lifelong Learning system learns different tasks sequentially, over time, getting better at solving future related tasks based on past experience. LIHLITH will focus on human-computer dialogue, where each dialogue experience is used by the system to learn to interact better, based on the success (or failure) of previous interactions. The key insight is that the dialogue will be designed to produce a reward, allowing the chatbot system to know whether the interaction was successful or not. The reward will be used to train the domain and dialogue management modules of the chatbot, improving performance and reducing development cost, both on a single target domain and especially when moving to new domains.
Past Projects
EXTRAE
Duration: 2018-2019
Financing institution: IMIENS
Duration: 2016-2018
Financing institution: Ministerio de Economía, Industria y Competitividad
VEMODALEN
Duration: 2016-2019
Financing institution: Ministerio de Economía y Competitividad
Call: 2015, Modality 1: R&D&I Projects, of the State Programme for Research, Development and Innovation Oriented to the Challenges of Society.
MANTRA-MED
Modelado y AutoMatización de exTracción de Relaciones y cAtegorización de informes MEDicos para la recomendación de códigos CIE-10 (TIN2016-77820-C3-2-R)
Duration: 2017-2018
Financing institution: Ministerio de Economía y Competitividad
The Alcorcon Foundation University Hospital (HUFA in Spanish), with which the sub-project will collaborate, is a public university hospital which is part of the Madrid Health Service (SERMAS in Spanish). Like all Madrid Health Service hospitals, it moved from the old CIE-9 discharge report coding scheme to the newer CIE-10 scheme on 1 January 2016. This change has resulted in a 75% decrease in coding team performance, even though these teams are made up of personnel trained for the task. There are commercial applications available which aid in assigning CIE-10 codes by using the existing mapping between CIE-9 and CIE-10. Nevertheless, the greater detail and comprehensiveness of CIE-10, combined with the fact that there are combination codes present in CIE-9 with no corresponding code in CIE-10, make this mapping impossible in a large number of cases. All hospitals would benefit from having a tool which is able to automatically assign codes to diagnoses and procedures directly from the free text found in medical reports. This health-sector problem will be the main focus and use case of this subproject.
We propose to study, adapt and develop NLP and unsupervised learning techniques, with which this group has a great deal of experience, in order to develop a tool which recommends and assigns CIE-10 codes to discharge reports. An unsupervised approach is imperative given the currently limited availability of manually coded records with which to train supervised systems. As records written in Spanish will be readily available, we will focus on this language, although the methods can be applied to other languages, and it is expected that they will be validated by the work done with other languages in the coordinated project.
The development of this tool encompasses research challenges in several diverse fields: anonymization of reports, lexical normalization within the domain, disambiguation of domain acronyms, representation of the documents, identification of concepts/expressions, extraction of relationships, structured information recovery and unsupervised learning. The use of unsupervised learning techniques will be studied in order to categorize discharge reports with CIE-10 codes, assessing data modeling by means of distributed representations with deep learning algorithms and Information Retrieval techniques. Likewise, statistical models will be applied in order to identify the underlying relationships among reports written with CIE-10 codes. This knowledge base of relationships will make it possible to recommend codes for new reports. The ideal method for combining the different code recommendation algorithms will be analyzed by studying techniques based on automatic and heuristic learning.
Museología e integración social: la difusión del patrimonio artístico y cultural del Museo del Prado a colectivos con especial accesibilidad (invidentes, sordos y reclusos)
Duration: 2016-2018
Financing institution: 2015 call for R&D Activity Programmes among research groups of the Comunidad de Madrid, organized by the Dirección General de Universidades e Investigación of the Consejería de Educación, Juventud y Deporte, Comunidad de Madrid (S2015/HUM3494)
EXTracción de RElaciones entre Conceptos Médicos en fuentes de información heterogéneas
Duration: 2014-2017
Financing institution: MINECO (TIN2013-46616-C2-2-R)
Voxpopuli
Duration: 2014-2016
Financing institution: Ministerio de Economía y Competitividad (TIN2013-4709-C3-1P)
Readers: Evaluation And DEvelopment of Reading Systems
Duration: 2013-2015
Financing institution: EU (CHIST-ERA 2011) + MINECO (PCIN-2013-002-C02-01)
Linguistically Motivated Semantic Aggregation Engines
Duration: 2011-2014
Financing institution: European Commission, FP7-ICT
Evaluating Information Access Systems
Duration: 2011-2016
Financing institution: European Science Foundation
The automatic encyclopedia of people and organizations.
Duration: 2010-2012
Financing institution: MICINN (TIN2010-21128-C02)
Mejorando el Acceso, el Análisis y la Visibilidad de la Información y los Contenidos Multilingüe y Multimedia en Red para la Comunidad de Madrid
Duration: 2010-2013
Financing institution: Regional Government of Madrid (S2009/TIC-1542)
Financing institution: Sub-contracts by Grupo ALMA
Summary: Online Reputation Managing
Quantitative Evaluation of Academic Websites Visibility
Duration: 2008-2010
Financing institution: CICYT (TIN 2007-67581-C02-01)
Evaluation Best Practice and Collaboration for Multilingual Information Access
Financing institution: European Commission
Text-Mess (INES subproject)
Duration: 2007-2009
Financing institution: CICYT (TIN2006-15265-C06-02)
Multilingual/Multimedia Access To Cultural Heritage
Duration: 2006-2009
Financing institution: European Commission, 6FP (STREP 033104)
Mejorando el acceso y visibilidad de la información multilingüe en red para la Comunidad de Madrid
Duration: 2006-2009
Financing institution: Comunidad de Madrid, IV PRICIT (S-0505/TID/0267)
Quality Labelling of Medical Web Content using Multilingual Information Extraction.
Duration: 2006-2008
Financing institution: European Commission (EC Programme: Public Health 61383)
SWIISA
Speech Web and Images Interactive Search Assistants
Duration: 2006-2007
Financing institution: UNED
R2D2 (Syembra subproject)
Recuperación de Respuestas en Documentos Digitalizados
Duration: 2003-2006
Financing institution: CICYT (TIC2003-07158-C04)