Ongoing Projects

Linking Linguistics to Data Science - LL2DS


Date: November 2022 - February 2024

The project Linking Linguistics to Data Science (LL2DS), submitted to the Erasmus Mundus Design Measures programme (ERASMUS-EDU-2022-EMJM-DESIGN), aims to design a fully integrated Erasmus Mundus Joint Master Degree (EMJMD) that stands out among programmes worldwide for its novel approach to linking Linguistics to Data.


NOVA University of Lisbon (Portugal)

Università Cattolica del Sacro Cuore of Milan (Italy)

University of Zaragoza (Spain)

University of Galway (Ireland)

National Institute for Research in Digital Science and Technology – INRIA (France)

Team: Rute Costa (coordinator), Sara Carvalho, Raquel Amaro, Marco Passarotti, Jorge Garcia, John McCrae, Laurent Romary.

MORDigital - Digitisation of Diccionario da Lingua Portugueza by António de Morais Silva

FCT project: PTDC/LLT-LIN/6841/2020

Date: November 2022 - February 2024

The main goal of MORDigital is to encode selected editions of the Diccionario da Lingua Portugueza by António de Morais Silva (MOR), first published in 1789. MORDigital aims to promote accessibility to cultural heritage while fostering reusability, and to contribute towards a greater presence of lexicographic digital content in Portuguese by means of open tools and standards. MOR represents a great legacy: it marks the beginning of Portuguese dictionaries and served as a model for all subsequent lexicographic production throughout the 19th and 20th centuries. MORDigital follows a new paradigm in Lexicography, which results from the convergence between Lexicography, Terminology, Computational Linguistics, and Ontologies as an integral part of Digital Humanities and Linked (Open) Data, aligned with the FAIR principles. In the Portuguese context, this research fills a gap with regard to searchable online retrodigitised dictionaries built on current standards and methodologies that promote data sharing and harmonisation, namely TEI Lex-0, LMF and Ontolex-Lemon. The team will further ensure the connection to other existing systems and lexical resources, particularly in the Portuguese-speaking world.

Team: Rute Costa, Ana Salgado, Sara Carvalho, Bruno Almeida, Raquel Silva, Margarida Ramos, Laurent Romary, Toma Tasovac, Mohamed Khemakhem, Fahad Khan, Jorge Garcia, Filomena Gonçalves.



This project responds to the current situation created by the COVID-19 epidemic and its impact on university teaching.

Moving teaching to the online environment requires new ways of working with students, in particular a greater share of independent research and study activities. To make this shift, students need to be motivated by the increased social impact of their activities.

Therefore, this project proposes that each student develop an electronic publication, with an ISBN, in which they present basic issues and questions for discussion.

Students from the participating universities will comment on the content and method of processing the topic, proposing revisions.

The resulting next-books are then made available to the media and literary portals for reading by the general public.

This working method emphasizes international cooperation in motivating students to go deeper into the subjects studied and strengthens their foreign language skills.


The aim of this project is to adapt the scientific language of gene therapy into language that the general public can understand. This adaptation will be carried out through a partnership between NOVA University Lisbon and the company Pfizer Portugal, which will enable collaboration between the project team and the Portuguese Hemophilia Association. The project is based at the School of Social Sciences and Humanities at NOVA University Lisbon (NOVA FCSH), and the coordinating research unit is the NOVA Institute of Philosophy (IFILNOVA), through the responsible researcher Maria Grazia Rossi and the researcher Dima Mohammed. The Linguistics Research Centre of NOVA University Lisbon (NOVA CLUNL) and the NOVA Institute of Communication (ICNOVA) will collaborate on the project. It will also have the support of two institutions outside NOVA FCSH: the NOVA Faculty of Science and Technology (NOVA FCT) and the Collaborative Laboratory Value for Health – CoLAB. The adaptation is intended to be implemented in several documents, such as a brochure on gene therapy and a website.

FUNDING: Pfizer Portugal | 2021-2022

Com @Rehab Project - Communication for interactive rehabilitation in virtual reality - by researchers from the NOVA School of Science and Technology, the School of Social and Human Sciences of NOVA and Nova Medical School.

The Com @Rehab project develops the Digital Communication Module (MCD Rehab) of VR4Pandemic, which currently includes three components: i) a glove with haptic feedback and biosensors, ii) a Virtual Reality (VR) game with several levels of difficulty and iii) a platform that analyses physiological parameters in real time. The purpose of the project is to contribute to the rehabilitation of post-COVID patients in hospital and/or home environments. The researchers intend to focus on developing the communicative skills of the various agents involved while enhancing technological literacy and interactivity with technology, thereby helping to improve human-human and human-machine interaction.

ELEXIS – European Lexicographic Infrastructure

Horizon 2020 – ID 731015


A large number of European institutions, publishers, universities and communities have been developing dictionaries and/or dictionary data.

Although they face similar problems in producing these resources and making them available, cooperation on a larger European scale has long been limited. ELEXIS aims to harmonize these efforts and develop tools which can be used by everyone. This will reduce the cost and time needed to update existing resources or develop new ones. It will also help everyone work towards the same standards and increase the quality of the resources.

Digital Edition of the Vocabulário Ortográfico da Língua Portuguesa - VOLP-1940


The main objective of the project Vocabulário Ortográfico da Língua Portuguesa (VOLP-1940) is the digitization of the Vocabulário Ortográfico da Língua Portuguesa, the first orthographic vocabulary of the Portuguese language endorsed by the Academy of Sciences of Lisbon (Academia das Ciências de Lisboa), published in 1940 by the National Press of Lisbon. The text will be processed according to the TEI Guidelines, with the purpose of making it available online.

European network for Web-centred linguistic data science - NexusLinguarum

COST Action CA18209


The main aim of this Action is to promote synergies across Europe between linguists, computer scientists, terminologists, and other stakeholders in industry and society, in order to investigate and extend the area of linguistic data science. We understand linguistic data science as a subfield of the emerging “data science”, which focuses on the systematic analysis and study of the structure and properties of data at a large scale, along with methods and techniques to extract new knowledge and insights from it.

MOCOLANG-O - MOdélisation COnceptuelle des troubles (du LANGage et de la communication) en Orthophonie


MOCOLANG-O aims to build and test an operational ontological model (OWL) of concepts relating to disorders in Speech and Language Therapy (affecting language, communication and oromotor skills). It is interdisciplinary (SLT, linguistics, terminology and description logic) and takes temporality as its entry point. This ontological resource is called TemPO (TEMporalité, Pathologie orthophonique, Ontologie).


Project financed by the pôle scientifique CLCS (Connaissance, Langage, Communication, Sociétés) of the Université de Lorraine, by the laboratoire ATILF-UMR7118 and by the Fédération Nationale des Orthophonistes.


The OrthoDef project aims to expand and validate ontological and lexical resources in the discipline of speech and language therapy (SLT), some of which were created during a previous project called MOCOLANG-O. These include a terminological and ontological resource about concepts of pathology in SLT, a trilingual core set of diagnostic terms, and a bilingual (Fr-Eng) corpus of SLT articles collected from the ISTEX database.

COVID-19 Collaborative Glossary


The COVID-19 Collaborative Glossary comprises the terminology used by official healthcare agencies, healthcare professionals and scientists, as well as the media and social media.

In the current context, it is essential to provide access to organized terminological information on the disease in clear, easy-to-understand language. The methodology used is oriented towards the popularization of definitions, thus contributing to health literacy. The glossary is under permanent construction. We intend to follow the evolution of the pandemic from a terminological point of view and update the resource in real time.