Speakers & abstracts

The following speakers will give a presentation at the third SpeechTechday, Feb 10, 2025.


Academic speakers


Cristian Tejedor Garcia (RU)

Title: The Transformation of Language and Speech Models in Artificial Intelligence Research

Summary: Transformer-based speech and language models are revolutionizing research in Artificial Intelligence (AI). In speech, models such as Wav2Vec 2.0, Whisper, and T5 excel in automatic speech recognition (ASR), text-to-speech (TTS), and speaker identification, while in language, models such as GPT and BERT dominate tasks such as question answering (Q&A), summarization, and sentiment analysis. These models are increasingly being utilized in healthcare and education, specifically for the early diagnosis of neurodegenerative diseases in elderly populations and the detection of reading difficulties in children. By analyzing speech patterns, pauses, and language use, these models can assist in identifying markers for Parkinson’s or Alzheimer’s disease, offering non-invasive, cost-effective, and scalable tools for early intervention. 

However, a significant limitation arises regarding data privacy and the potential harms of foundation models, as these models are often trained on massive datasets scraped from the internet, which can inadvertently include sensitive or biased information. This poses challenges in ensuring ethical use, avoiding harmful outputs, and maintaining compliance with data protection regulations, especially when deploying these models in sensitive or regulated research areas. This presentation discusses the scalability and adaptability of these models, which make them indispensable tools for advancing both fundamental and applied research, and stresses that responsible usage is critical to mitigate these risks.

Wietse de Vries (RU Groningen)

Title: Challenges in Speech Technology for Minority Languages in the Netherlands

Short summary: Developing speech technology for minority languages is not just a matter of training models on smaller datasets. Minority languages often have more dialectal variation and do not always have a single standardized written form. These languages therefore pose additional challenges beyond being low-resource. Specifically for Frisian and Low Saxon language varieties, we work on collecting data together with local communities and volunteers.

Moreover, we work on novel modeling techniques that can handle highly diverse minority languages. Language technology should be used to help preserve local language varieties rather than forcing people to use the standard form of a majority language.

Zhengjun Yue (TU Delft)

Title: Challenges and recommendations for Dutch atypical speech data collection, annotation, sharing, and usage

Short summary: The quality of speech datasets is important for advancing speech technology, particularly for atypical speech, where data scarcity and variability pose significant challenges. This talk discusses the challenges encountered in collecting, annotating, sharing, and using Dutch atypical speech datasets, based on insights from several collaborative research projects in which TU Delft is involved, covering stuttered and disordered child speech, personalized dysarthric speech, and mock medical conversations. Common issues such as therapist dominance in recordings, unsuitability of clinical data for downstream speech tasks, and lack of standardized protocols are discussed alongside practical recommendations to address them.

Emphasis is placed on creating high-quality datasets for specific speech-related tasks like speech recognition, speech analysis, and assessment, with a call for collaboration and knowledge-sharing to advance research and clinical applications in Dutch speech technology.

Khiet Truong (University of Twente)

Title and summary to follow