Advances in speech and voice technologies are creating new opportunities to support healthcare research and practice. For example, they are being used to support clinical documentation and healthcare workflows through ambient voice technology, to enable voice banking and preservation for people at risk of losing their voice, and to measure and monitor health through speech analysis.
Despite this growing interest, developing and deploying speech technologies in healthcare presents unique challenges. Systems must be designed for real clinical contexts, requiring collaboration across disciplines and careful consideration of how speech is captured, processed, and utilised within healthcare environments. Researchers and developers must also address practical requirements such as regulation, governance, and integration into healthcare systems.
This tutorial offers an overview of the emerging landscape of speech technologies in healthcare and introduces practical methods for developing and evaluating speech-enabled health applications. It will combine expert-led talks with a practical session exploring feature extraction, modelling approaches, and bias analysis using real datasets.
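The practical session's exact materials are not specified here, but the sketch below illustrates the kind of pipeline it covers: extracting acoustic features from speech, fitting a simple model, and checking performance across demographic groups. The tool choices (librosa, scikit-learn) and the synthetic stand-in data are assumptions for illustration only, not the tutorial's actual materials.

```python
# Illustrative sketch: feature extraction, modelling, and a basic bias check.
# Synthetic data stands in for the real datasets used in the session.
import librosa
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-in "recordings": 1 s of noise at 16 kHz, with binary
# health labels and a demographic group attribute per recording.
n = 60
audio = [rng.standard_normal(16000) for _ in range(n)]
labels = rng.integers(0, 2, size=n)
groups = rng.choice(["A", "B"], size=n)

def extract_features(y, sr=16000):
    """Summarise a recording as mean MFCCs, a common baseline feature set."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape (13, frames)
    return mfcc.mean(axis=1)                            # shape (13,)

X = np.stack([extract_features(y) for y in audio])
X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, labels, groups, test_size=0.3, random_state=0, stratify=labels)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Bias analysis: compare test accuracy across demographic groups.
for g in np.unique(g_te):
    mask = g_te == g
    print(g, accuracy_score(y_te[mask], clf.predict(X_te[mask])))
```

With random features and labels the accuracies hover around chance; the point of the per-group comparison is that, on real data, a gap between groups flags a model whose errors fall unevenly across the population.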
The tutorial targets PhD students, researchers new to speech and health, and experienced researchers seeking an overview of current developments. It aims to promote cross-disciplinary understanding and the development of robust, clinically relevant speech technologies.
Dr. Nicholas (Nick) Cummins is a Senior Lecturer in Speech Analysis and Responsible AI in Health at King’s College London. He leads the Voice and Speech Processing for Health group, whose research focuses on integrating speech technologies into clinical research and practice. In addition to his academic work, he consults for the private sector, bridging research and real-world applications. Nick holds a PhD in Electrical Engineering from UNSW Australia and has held postdoctoral roles in Germany. His work combines speech processing, machine learning, and patient involvement to create reliable speech-based tools to support health innovation. He is the co-lead Area Chair for "Speech and Language Processing for Health" at Interspeech 2026 and a steering committee member of the UK and Ireland Speech (UKIS) organisation.
Dr. Ning Ma is a Senior Lecturer in Medical Computing, jointly appointed by the University of Sheffield and Sheffield Teaching Hospitals NHS Foundation Trust. His research focuses on speech and hearing technologies, machine learning, and healthcare, with a particular emphasis on analysing sounds such as speech, breathing, snoring, and coughing for early detection and monitoring of disease. Ning is Principal Investigator on several UKRI-funded projects and has authored more than 70 peer-reviewed journal and conference papers. He served on the Technical Programme Committee for ISCA INTERSPEECH 2024 and 2025, including as the Lead Area Chair for Speech, Voice, and Hearing Disorders. He is also Co-Director for Healthcare Data and AI at the Insigneo Institute and Sheffield Theme Lead for Machine Learning at the N8 Centre of Excellence in Computationally Intensive Research.
Dr. Melanie Jouaiti is an Assistant Professor in Computer Science and an IDAI affiliate at the University of Birmingham, UK. She was previously a research associate at Imperial College London, cross-appointed with the UK Dementia Research Institute, and before that at the University of Waterloo, Canada. Her work has consistently focused on assistive and healthcare technologies, with a current emphasis on speech in neurodegeneration. She is the co-lead Area Chair for "Speech and Language Processing for Health" at Interspeech 2026 and a member of the steering committee of the UK and Ireland Speech (UKIS) community.
The tutorial will be held on the 24th of June at the KCL Denmark Hill Campus.
Address: Social, Genetic and Developmental Psychiatry Centre, Memory Lane, London SE5 8AF