Research Projects and Actions

January 2020 - December 2022. The MT4All CEF action aimed to provide bilingual resources (bilingual dictionaries and machine translation systems) for under-resourced languages in fields of public interest at the EU level, such as e-Health and e-Justice.

June 2020 - August 2022. PRINCIPLE stands for Providing Resources in Irish, Norwegian, Croatian and Icelandic for Purposes of Language Engineering. PRINCIPLE produced high-quality curated language resources (LRs) to improve translation quality in the eJustice and eProcurement Digital Service Infrastructures via domain-specific Machine Translation (MT) systems (CEF eTranslation engines).


May 2017 - April 2019 (Research Fellow). H2020 Marie Skłodowska-Curie COFUND Action postdoctoral programme.

4. enetCollect - European Network for Combining Language Learning with Crowdsourcing Techniques

April 2017 - April 2019. enetCollect is a large international network funded as a COST Action. It aims to lay the groundwork for a Research and Innovation trend that combines the well-established domain of Language Learning with recent and successful crowdsourcing approaches. In doing so, enetCollect aims to unlock a crowdsourcing potential available for all languages and to trigger an innovation breakthrough in the production of language learning material, such as lesson or exercise content, and of language-related datasets, such as NLP language resources.

5. INTERACT - International Network on Crisis Translation

April 2017 - present. INTERACT is an H2020 Marie Skłodowska-Curie Research and Innovation Staff Exchange Network aimed at researching translation in crisis scenarios. It brings together a unique combination of actors from the social sciences, the humanities, technology development and humanitarian response to collaborate and to educate each other.

6. PARSEME - PARSing and Multiword Expressions

September 2013 - April 2017. The IC1207 COST Action, PARSEME, is an interdisciplinary scientific network devoted to the role of multi-word expressions (MWEs) in parsing. It gathers interdisciplinary experts (linguists, computational linguists, computer scientists, psycholinguists, and industry practitioners) from 29 countries that have signed the Memorandum of Understanding. It represents 28 languages and 6 dialects from 9 language families. It covers different parsing frameworks: CCG (Combinatory Categorial Grammar), DG (Dependency Grammar), GG (Generative Grammar), HPSG (Head-driven Phrase Structure Grammar), LFG (Lexical Functional Grammar), TAG (Tree Adjoining Grammar), ... It addresses different methodologies (symbolic, probabilistic and hybrid parsing) and language technology applications (machine translation, information retrieval, ...).

7. EXPERT - EXPloiting Empirical appRoaches to Translation

April 2015 - September 2016. EXPERT aims to train young researchers, namely Early Stage Researchers (ESRs) and Experienced Researchers (ERs), to promote the research, development and use of hybrid language translation technologies. The overall objective of EXPERT is to provide innovative research and training in the field of translation memory and machine translation technologies to 15 Marie Curie Fellows.

Nov. 2014 - April 2015. CLARINO is a Norwegian infrastructure project jointly funded by the Research Council of Norway and a consortium of Norwegian universities and research institutions. The ultimate aim is to make existing and future language resources easily accessible for researchers and to bring eScience to humanities disciplines.

9. DASISH - Data Service Infrastructure for the Social Sciences and the Humanities

Apr. 2014 - Dec. 2014. DASISH brings together all five ESFRI research infrastructure initiatives in the social sciences and humanities (SSH): CLARIN, DARIAH, CESSDA, ESS and SHARE. The goal of DASISH is to identify areas of cross-fertilization and synergy in the infrastructure development that all five communities are undertaking as of the beginning of 2012, and to work on concrete joint activities related to data, such as data access, data sharing, data quality, and data archiving. Synergy can also be achieved by working together on solutions regarding legal and ethical aspects.

10. CLARA - Initial Training Network for Common Language Resources and their Applications

May 2011 - Nov. 2013. Early Stage Researcher (ESR). Marie Curie programme of the EU 7th Framework Programme (Marie Curie Initial Training Network 7FP-ITN-238405). The objective of the CLARA network is to train a new generation of experts in linguistics who can develop research methods for the construction, use and application of language resources. The scientific objectives of CLARA are to explore in greater depth the creation of linguistic models based on real data, which are then analysed with statistical and machine-learning tools, and the hybridisation of techniques and methods of analysis.

11. CLARIN - Common Language Resources and Technologies Infrastructure

Sept. 2008 - July 2010. CLARIN is committed to establishing an integrated and interoperable research infrastructure of language resources and their technology. It aims to overcome the current fragmentation by offering a stable, persistent, accessible and extendable infrastructure, thereby enabling eHumanities. In Spain it was co-funded by the EU 7th Framework Programme (FP7-INFRASTRUCTURES-2007-1-212230), the Spanish Ministerio de Educación y Ciencia (CAC-2007-23) and the Spanish Ministerio de Ciencia e Innovación (ICTS-2008-11 and ACI2009-0995).

12. FLaReNeT - Fostering Language Resources Network

Sept. 2008 - July 2010. FLaReNeT is a networking organization that aims to devise and promote consensual recommendations concerning the future development, deployment and use of language resources (LRs). FLaReNeT will indicate best practices and best policies for coordinating future actions and projects. The major activities of the Network will be to survey, analyse and classify LRs and relevant standards, together with their organisational and economic models, and to discuss with major stakeholders and players new common strategies for a capillary deployment and use of LRs in real-world products. It was funded by the eContentplus programme of the European Union (ECP2007LANG617001).

13. TACOC - Traducció Automàtica de Codi Obert per al català

May 2006 - Dec. 2006. Development of machine translation modules for Catalan-French, Catalan-Aranese and Catalan-English for Apertium, the open-source shallow-transfer machine translation platform developed by the Transducens group at the University of Alicante. Funded by DURSI, Generalitat de Catalunya (Catalan Government).

14. LIRICS - Linguistic Infrastructure for Interoperable Resources and Systems 

Jan. 2005 - June 2005. The key objective of LIRICS was to provide the European content and language industries with a common and stable set of formats, in the form of ISO standards, enabling interoperability and reuse of multilingual language resources, digital content and language engineering software. Funded by the e-content programme of the European Union (EDC-22236).