Join our network to engage in seminars and discussions that advance discovery and collaboration around cutting-edge research in speech and language health signals
January 26, 2026 · 11:00 AM (Eastern Time)
Ting Dang, PhD
Senior Lecturer in Digital Innovation
School of Computing and Information Systems
The University of Melbourne, Australia
Dr. Ting Dang is a Senior Lecturer in the School of Computing and Information Systems at the University of Melbourne. She previously held research positions at Nokia Bell Labs, the University of Cambridge, and the University of New South Wales (UNSW), where she also completed her PhD. Her research centers on human-centered AI for health, speech and audio processing, affective computing, and wearable sensing. She has served as an Associate Editor for IEEE Transactions on Affective Computing, IEEE Pervasive Computing, and Computer Speech & Language (Elsevier); as an Area Chair for ICASSP and INTERSPEECH; and as a Senior Program Committee member for AAAI. She has played key roles in organizing international conferences and workshops, including INTERSPEECH, MobiSys, ICASSP, UbiComp, and ACII. Her work has received multiple Best/Top Paper Awards at leading conferences and has attracted media attention, including coverage by the BBC. In 2025, she received the Rising Star Award in STEM from Women of Color. Her research has led to industry prototypes and patents, driving innovation in both academia and industry.
Speech and Audio Intelligence for Health: Sensing, Reasoning, and Prediction
Speech and audio signals are rich biomarkers for health monitoring and disease prediction. In this talk, I will explore the intersection of Speech AI, especially the latest advances in Generative AI, and their transformative potential in health analytics. The first part of the talk will delve into one of the most nuanced facets of human communication: emotion. I will present our recent research on ambiguity-aware emotion recognition, which leverages large audio-language models (LALMs) to interpret subtle emotional cues in speech. Rather than reducing human emotions to simplified categories, these models embrace the inherent complexity and variability of our expressions. I will also discuss novel techniques for enhancing the reasoning abilities of LALMs at inference time, enabling richer, contextually aware understanding through the integration of prior knowledge.
The second part will shift to real-world applications, specifically the integration of speech and voice analysis into mobile and wearable devices for continuous health monitoring. I will showcase approaches for collecting speech and audio data from personal devices and designing the associated data pipelines, as well as advanced modeling techniques for forecasting health trends. These methods extend health monitoring beyond clinical settings, supporting the daily assessment of conditions such as respiratory disease, the tracking of physiological parameters like heart rate, and hearing screening.
Meetings are held on Zoom for 60 minutes, about once a month—typically on Mondays at 8 AM PT / 11 AM ET / 4 PM London / 5 PM Berlin / 11 PM Beijing.
Join our mailing list to receive updates and the Zoom link 📙!
For questions, speaker suggestions, or paper proposals for the journal club, reach out to Jingyao Wu 📧 or Ahmed Yousef 📧.
This network was co-founded in 2022 by Daniel Low, Tanya Talkar, Daryush Mehta, Satrajit Ghosh and Tom Quatieri as the Harvard-MIT Speech and Language Biomarker Interest Group.
It seeks to bring together researchers and students from around the world to share novel research, receive feedback, discuss papers, and kickstart collaborations.
Jingyao Wu, PhD, MIT
Ahmed Yousef, PhD, MGH & Harvard Medical School
Daniel Low, PhD, Child Mind Institute & Harvard University
Fabio Catania, PhD, MIT
Nick Cummins, PhD, King's College London
Hamzeh Ghasemzadeh, PhD, University of Central Florida
Rahul Brito, Harvard & MIT
Tanya Talkar, PhD, Linus Health
Daryush Mehta, PhD, MGH & Harvard Medical School
Satrajit Ghosh, PhD, MIT McGovern Institute for Brain Research
Thomas Quatieri, PhD, MIT Lincoln Laboratory
Curated materials (tools, datasets, readings) to support exploration and innovation in the field.
Audio
senselab: a Python package that simplifies building pipelines for digital biometric analysis of speech and voice (see the sketch after this list).
Riverst: a multimodal avatar for interacting with users and collecting audio and video data.
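To give a flavor of what such a speech feature pipeline involves, here is a minimal sketch using librosa rather than senselab's own API (which may differ); the file path and feature choices are illustrative placeholders, not a prescribed protocol.

```python
# Minimal sketch of a speech feature pipeline using librosa
# (not senselab's actual API). "sample.wav" is a placeholder path.
import librosa
import numpy as np

# Load a mono recording at 16 kHz, a common rate for speech models.
y, sr = librosa.load("sample.wav", sr=16000, mono=True)

# Frame-level features commonly used as simple vocal measures.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # spectral shape
f0, voiced_flag, voiced_prob = librosa.pyin(         # fundamental frequency
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
)
rms = librosa.feature.rms(y=y)                       # intensity proxy

# Summarize over time into one fixed-length vector per recording.
features = np.concatenate([
    mfcc.mean(axis=1),                 # 13 mean MFCCs
    [np.nanmean(f0), np.nanstd(f0)],   # pitch stats (NaN = unvoiced frames)
    [rms.mean()],                      # mean energy
])
print(features.shape)  # (16,)
```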
Text
Quick spaCy-based text metrics (see the sketch after this list): https://github.com/HLasse/TextDescriptives and https://github.com/novoic/blabla
Suicide Risk Lexicon, lexicon building with LLMs, and semantic similarity: https://github.com/danielmlow/construct-tracker
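For a sense of how the spaCy-based metrics work, here is a minimal sketch assuming TextDescriptives v2's pipeline-component API; the example sentence and the chosen metric group are arbitrary.

```python
# Minimal sketch using TextDescriptives as a spaCy pipeline component
# (assumes textdescriptives v2; pip install textdescriptives).
import spacy
import textdescriptives as td

nlp = spacy.load("en_core_web_sm")
# Register one metric group; "textdescriptives/all" would add every group.
nlp.add_pipe("textdescriptives/descriptive_stats")

doc = nlp("I have been feeling tired, and my speech feels slow.")

# Collect the computed metrics into a one-row pandas DataFrame.
df = td.extract_df(doc)
print(df.filter(like="token_length").T)  # e.g., mean/median/std token length
```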
Audio and text
Many voice and speech datasets: Alden Blatter, Hortense Gallois, Samantha Salvi Cruz, Yael Bensoussan, Bridge2AI Voice Consortium, Maria Powell, Jean-Christophe Bélisle-Pipon. (2025). “Global Voice Datasets Repository Map.” Voice Data Governance. https://map.b2ai-voice.org/.
Bridge2AI Voice Dataset: https://b2ai-voice.org/the-b2ai-voice-database/
Meta's Seamless Interaction dataset, 4,000+ hours of human interactions for AI research: https://github.com/facebookresearch/seamless_interaction
Audio
CLAC: A Speech Corpus of Healthy English Speakers
Many speech datasets: https://github.com/jim-schwoebel/allie/tree/master/datasets#speech-datasets
Many audio visual datasets: https://github.com/krantiparida/awesome-audio-visual#datasets
Text
Many text datasets: https://lit.eecs.umich.edu/downloads.html
Many text datasets: https://github.com/niderhoff/nlp-datasets
Audio
Ramanarayanan, V., Lammert, A. C., Rowe, H. P., Quatieri, T. F., & Green, J. R. (2022). Speech as a biomarker: Opportunities, interpretability, and challenges. Perspectives of the ASHA Special Interest Groups, 7(1), 276-283.
Low, D. M., Bentley, K. H., & Ghosh, S. S. (2020). Automated assessment of psychiatric disorders using speech: A systematic review. Laryngoscope Investigative Otolaryngology.
Cummins, N., Scherer, S., Krajewski, J., Schnieder, S., Epps, J., & Quatieri, T. F. (2015). A review of depression and suicide risk assessment using speech analysis. Speech Communication. link
Patel, R. R., Awan, S. N., Barkmeier-Kraemer, J., Courey, M., Deliyski, D., Eadie, T., ... & Hillman, R. (2018). Recommended protocols for instrumental assessment of voice: American Speech-Language-Hearing Association expert panel to develop a protocol for instrumental assessment of vocal function. American journal of speech-language pathology, 27(3), 887-905.
Text
Mihalcea, R., Biester, L., Boyd, R. L., Jin, Z., Perez-Rosas, V., Wilson, S., & Pennebaker, J. W. (2024). How developments in natural language processing help us in understanding human behaviour. Nature Human Behaviour, 8(10), 1877-1889.
Stade, E. C., Stirman, S. W., Ungar, L. H., Boland, C. L., Schwartz, H. A., Yaden, D. B., ... & Eichstaedt, J. C. (2024). Large language models could change the future of behavioral healthcare: a proposal for responsible development and evaluation. NPJ Mental Health Research, 3(1), 12.
Low, D., Mair, P., Nock, M., & Ghosh, S. Text Psychometrics: Assessing Psychological Constructs in Text Using Natural Language Processing. PsyArXiv. link
Most recordings are available upon request.
15-Dec-2025 | Generating and investigating laryngeal biosignals | Andreas Kist (Friedrich-Alexander-Universität Erlangen-Nürnberg)
10-Nov-2025 | Speech as a modality for the characterization and adaptation of neurodiversity | Mark Hasegawa-Johnson (University of Illinois Urbana-Champaign)
6-Oct-2025 | From Noise to Signal: Individual Variability in Voice Fatigue Subtyping | Mark Berardi (University of Iowa)
19-May-2025 | Building your research team: Who should be in the room where it happens? | Maria Powell (Vanderbilt University Medical Center)
5-May-2025 | Clinical theory and dimensions of speech markers: Psychosis as a case study | Lena Palaniyappan (Professor of Psychiatry, McGill)
7-Apr-2025 | Toward generalizable machine learning models in speech, language, and hearing sciences: Estimating sample size and reducing overfitting | Hamzeh Ghasemzadeh (Massachusetts General Hospital – Harvard Medical School)
17-Mar-2025 | Exploring the Mechanistic Role of Cognition in the Relationship between Major Depressive Disorder and Acoustic Features of Speech | Lauren White (King’s College London)
3-Mar-2025 | Speech as a Biomarker for Disease Detection | Catarina Botelho (INESC-ID, University of Lisbon)
17-Feb-2025 | Exploring Intraspeaker Variability in Vocal Hyperfunction Through Spatiotemporal Indices of RFF | Jenny Vojtech (Boston University)
20-Jan-2025 | Clinically meaningful speech-based endpoints in clinical trials | Julie Liss (Arizona State University)
3-Dec-2024 | Revealing Confounding Biases: A Novel Benchmarking Approach for Aggregate-Level Performance Metrics in Health Assessments | Roseline Polle (Thymia)
18-Nov-2024 | The interplay between signal processing and AI to achieve enhanced and trustworthy interaction systems | Ingo Siegert (Otto-von-Guericke-University Magdeburg)
4-Nov-2024 | Remote Voice Monitoring System for Patients with Heart Failure | Fan Wu (ETH Zurich)
6-May-2024 | Building Speech-Based Affective Computing Solutions by Leveraging the Production and Perception of Human Emotions | Carlos Busso (UT Dallas)
1-Apr-2024 | Parkinson's speech | Juan Ignacio Godino-Llorente (Universidad Politécnica de Madrid)
4-Mar-2024 | Modelling individual and cross-cultural variation in the mapping of emotions to speech prosody | Pol van Rijn (Max Planck Institute for Empirical Aesthetics)
5-Feb-2024 | Speech Analysis for Intent and Session Quality Assessment in Motivational Interviews | Mohammad Soleymani (USC)
20-Nov-2023 | High speed videoendoscopy | Maryam Naghibolhosseini (Michigan State University)
16-Oct-2023 | Estimation of parameters of the phonatory system from voice | Zhaoyan Zhang (UCLA Head and Neck Surgery)
18-Sep-2023 | Democratizing speaker diarization with pyannote | Hervé Bredin (Institut de Recherche en Informatique de Toulouse) and Marvin Lavechin (Meta AI, ENS)
7-Aug-2023 | Overview of Zero-Shot Multi-speaker TTS Systems | Edresson Casanova (Coqui)
17-Jul-2023 | Considerations for Identifying Biomarkers of Spoken Language Outcomes for Neurodevelopmental Conditions | Karen Chenausky (Harvard Medical School & Massachusetts General Hospital)
15-May-2023 | The Potential of smartphone voice recordings to monitor depression severity | Nicholas Cummins (King's College London)
1-May-2023 | Reading group session: Introductory overview of self-supervised learning, transformers, and attention | Daniel Low (Harvard & MIT)
20-Mar-2023 | Accuracy of Acoustic Measures of Voice via Telepractice Videoconferencing Platforms | Hasini Weerathunge (Boston University)
6-Mar-2023 | Casual discussion on audio quality control and preprocessing | Daniel Low (Harvard & MIT)
26-Jan-2023 | Developing speech-based clinical analytics models that generalize: Why is it so hard and what can we do about it? | Visar Berisha (Arizona State University)
12-Jan-2023 | Using knockoffs for controlled predictive biomarker identification | Kostas Sechidis (Novartis)
15-Dec-2022 | Provide ideas and feedback on the protocol for a large-scale data collection effort (N=5k) on mental health and voice from the NIH Bridge2AI | Daniel Low (Harvard & MIT)
1-Dec-2022 | Inferring neuropsychiatric conditions from language: how specific are transformers and traditional ML pipelines in a multi-class setting? | Lasse Hansen (Aarhus University) & Roberta Rocca (Aarhus University)
17-Nov-2022 | Meet and greet/intros
3-Nov-2022 | Speech and Voice-based Detection of Mental and Neurological Disorders: Traditional vs Deep Representation and Explainability | Björn Schuller (Imperial College London)
20-Oct-2022 | What do machines hear? Overview of deep learning approaches for representing voice | Gasser Elbanna (EPFL & MIT)