Join our network to engage in seminars and discussions that advance discovery and collaboration around cutting-edge research in speech and language health signals
March 23, 2026 · 5:00 PM (Eastern Time)
Beena Ahmed, PhD
Associate Professor in Signal Processing
School of Electrical Engineering and Telecommunications
University of New South Wales, Australia
Associate Professor Beena Ahmed is the Co-Director of the Signals, Information, and Machine Intelligence Lab and the Technical Lead, Connected Health, at the Tyree Foundation Institute of Health Engineering at UNSW. She currently leads research projects on the recognition and assessment of children’s and disordered speech, and on mispronunciation detection in disordered and accented speech. She has received over $9 million in funding from local and international sources and has over 100 publications. She is also the founder of Say66, where she is translating her research into an automated speech therapy system for children with speech disorders. She has received multiple awards for her work in this area, including the 2020 Innovation Award from Speech Pathology Australia, a 2021 Women in AI Award, and a 2022 Telstra Digital Health Award.
From Clinical Speech to Language Learning: Mispronunciation Detection Across Domains
Accurate pronunciation is critical for intelligible spoken communication, yet it remains a major challenge for both individuals with speech disorders and second-language (L2) learners. In this talk, I will present my research on automatic mispronunciation detection and diagnosis (MDD) as a tool for supporting atypical and non-native speakers wanting to improve their spoken communication. I will focus on approaches that move beyond canonical, native-speaker-centric assumptions and instead account for the variability inherent in disordered speech and L2 pronunciation patterns.
I will discuss methods for detecting articulation errors using speech technologies grounded in phonetic theory and data-driven modelling, highlighting how these methods perform across different speaker populations. Attention will be paid to shared challenges across domains, such as sparse labelled data and high inter-speaker variability, as well as to domain-specific considerations that arise in clinical versus educational contexts. By comparing findings from disordered speech and non-native learner speech, this talk aims to highlight common methodological insights and opportunities for cross-fertilization between clinical speech processing and computer-assisted pronunciation training.
Meetings are held on Zoom for 60 minutes, about once a month—typically on Mondays at 8 AM PT / 11 AM ET / 4 PM London / 5 PM Berlin / 11 PM Beijing.
Join our mailing list to receive updates and the Zoom link 📙!
For questions, speaker suggestions, or paper proposals for the journal club, reach out to Jingyao Wu 📧 and Ahmed Yousef 📧
This network was co-founded in 2022 by Daniel Low, Tanya Talkar, Daryush Mehta, Satrajit Ghosh and Tom Quatieri as the Harvard-MIT Speech and Language Biomarker Interest Group.
It seeks to bring together researchers and students from around the world to share novel research, receive feedback, discuss papers, and kickstart collaborations.
Jingyao Wu, PhD, MIT
Ahmed Yousef, PhD, MGH & Harvard Medical School
Daniel Low, PhD, Child Mind Institute & Harvard University
Fabio Catania, PhD, MIT
Nick Cummins, PhD, King's College London
Hamzeh Ghasemzadeh, PhD, University of Central Florida
Rahul Brito, Harvard & MIT
Tanya Talkar, PhD, Linus Health
Daryush Mehta, PhD, MGH & Harvard Medical School
Satrajit Ghosh, PhD, MIT McGovern Institute for Brain Research
Thomas Quatieri, PhD, MIT Lincoln Laboratory
Curated materials (tools, datasets, readings) to support exploration and innovation in the field.
Audio
senselab: a Python package that simplifies building pipelines for digital biometric analysis on speech and voice.
Riverst: a multimodal avatar for interacting with users and collecting audio and video data.
Text
Quick spaCy-based text metrics: https://github.com/HLasse/TextDescriptives and https://github.com/novoic/blabla
Suicide Risk Lexicon, lexicon building with LLMs, and semantic similarity: https://github.com/danielmlow/construct-tracker
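The construct-tracker link above combines lexicons with semantic similarity. As a generic, self-contained sketch of the underlying idea — scoring how close a document embedding is to a lexicon prototype via cosine similarity — here is a toy example with made-up 3-dimensional vectors (this is not the package's actual API):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": a lexicon prototype and two documents.
lexicon_vec = [0.9, 0.1, 0.0]
doc_close = [0.8, 0.2, 0.1]   # semantically near the lexicon
doc_far = [0.0, 0.1, 0.9]     # semantically distant

sim_close = cosine_similarity(lexicon_vec, doc_close)
sim_far = cosine_similarity(lexicon_vec, doc_far)
print(sim_close > sim_far)  # True
```

In practice the vectors would come from a sentence-embedding model rather than being hand-written, but the ranking step is the same.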
Audio and text
Many voice and speech datasets: Alden Blatter, Hortense Gallois, Samantha Salvi Cruz, Yael Bensoussan, Bridge2AI Voice Consortium, Maria Powell, Jean-Christophe Bélisle-Pipon. (2025). “Global Voice Datasets Repository Map.” Voice Data Governance. https://map.b2ai-voice.org/.
Bridge2AI Voice Dataset https://b2ai-voice.org/the-b2ai-voice-database/
Facebook's large-scale multimodal dataset of 4,000+ hours of human interactions for AI research: https://github.com/facebookresearch/seamless_interaction
Audio
CLAC: A Speech Corpus of Healthy English Speakers
Many speech datasets: https://github.com/jim-schwoebel/allie/tree/master/datasets#speech-datasets
Many audio visual datasets: https://github.com/krantiparida/awesome-audio-visual#datasets
Text
Many text datasets: https://lit.eecs.umich.edu/downloads.html
Many text datasets: https://github.com/niderhoff/nlp-datasets
Audio
Ramanarayanan, V., Lammert, A. C., Rowe, H. P., Quatieri, T. F., & Green, J. R. (2022). Speech as a biomarker: Opportunities, interpretability, and challenges. Perspectives of the ASHA Special Interest Groups, 7(1), 276-283.
Low, D. M., Bentley, K. H., & Ghosh, S. S. (2020). Automated assessment of psychiatric disorders using speech: A systematic review. Laryngoscope Investigative Otolaryngology.
Cummins, N., Scherer, S., Krajewski, J., Schnieder, S., Epps, J., & Quatieri, T. F. (2015). A review of depression and suicide risk assessment using speech analysis. Speech Communication.
Patel, R. R., Awan, S. N., Barkmeier-Kraemer, J., Courey, M., Deliyski, D., Eadie, T., ... & Hillman, R. (2018). Recommended protocols for instrumental assessment of voice: American Speech-Language-Hearing Association expert panel to develop a protocol for instrumental assessment of vocal function. American Journal of Speech-Language Pathology, 27(3), 887-905.
Text
Mihalcea, R., Biester, L., Boyd, R. L., Jin, Z., Perez-Rosas, V., Wilson, S., & Pennebaker, J. W. (2024). How developments in natural language processing help us in understanding human behaviour. Nature Human Behaviour, 8(10), 1877-1889.
Stade, E. C., Stirman, S. W., Ungar, L. H., Boland, C. L., Schwartz, H. A., Yaden, D. B., ... & Eichstaedt, J. C. (2024). Large language models could change the future of behavioral healthcare: a proposal for responsible development and evaluation. NPJ Mental Health Research, 3(1), 12.
Low, D., Mair, P., Nock, M., & Ghosh, S. Text Psychometrics: Assessing Psychological Constructs in Text Using Natural Language Processing. PsyArXiv.
Most recordings are available upon request.
26-Jan-2026 | Speech and Audio Intelligence for Health: Sensing, Reasoning, and Prediction | Ting Dang (The University of Melbourne, Australia)
15-Dec-2025 | Generating and investigating laryngeal biosignals | Andreas Kist (Friedrich-Alexander-Universität Erlangen-Nürnberg)
10-Nov-2025 | Speech as a modality for the characterization and adaptation of neurodiversity | Mark Hasegawa-Johnson (University of Illinois Urbana-Champaign)
06-Oct-2025 | From Noise to Signal: Individual Variability in Voice Fatigue Subtyping | Mark Berardi (University of Iowa)
19-May-2025 | Building your research team: Who should be in the room where it happens? | Maria Powell (Vanderbilt University Medical Center)
5-May-2025 | Clinical theory and dimensions of speech markers: Psychosis as a case study | Lena Palaniyappan (Professor of Psychiatry, McGill)
7-Apr-2025 | Toward generalizable machine learning models in speech, language, and hearing sciences: Estimating sample size and reducing overfitting | Hamzeh Ghasemzadeh (Massachusetts General Hospital – Harvard Medical School)
17-Mar-2025 | Exploring the Mechanistic Role of Cognition in the Relationship between Major Depressive Disorder and Acoustic Features of Speech | Lauren White (King’s College London)
3-Mar-2025 | Speech as a Biomarker for Disease Detection | Catarina Botelho (INESC-ID, University of Lisbon)
17-Feb-2025 | Exploring Intraspeaker Variability in Vocal Hyperfunction Through Spatiotemporal Indices of RFF | Jenny Vojtech (Boston University)
20-Jan-2025 | Clinically meaningful speech-based endpoints in clinical trials | Julie Liss (Arizona State University)
03-Dec-2024 | Revealing Confounding Biases: A Novel Benchmarking Approach for Aggregate-Level Performance Metrics in Health Assessments | Roseline Polle (Thymia)
18-Nov-2024 | The interplay between signal processing and AI to achieve enhanced and trustworthy interaction systems | Ingo Siegert (Otto-von-Guericke-University Magdeburg)
4-Nov-2024 | Remote Voice Monitoring System for Patients with Heart Failure | Fan Wu (ETH Zurich)
6-May-2024 | Building Speech-Based Affective Computing Solutions by Leveraging the Production and Perception of Human Emotions | Carlos Busso (UT Dallas)
1-Apr-2024 | Parkinson's speech | Juan Ignacio Godino-Llorente (Universidad Politécnica de Madrid)
4-Mar-2024 | Modelling individual and cross-cultural variation in the mapping of emotions to speech prosody | Pol van Rijn (Max Planck Institute for Empirical Aesthetics)
5-Feb-2024 | Speech Analysis for Intent and Session Quality Assessment in Motivational Interviews | Mohammad Soleymani (USC)
20-Nov-2023 | High speed videoendoscopy | Maryam Naghibolhosseini (Michigan State University)
16-Oct-2023 | Estimation of parameters of the phonatory system from voice | Zhaoyan Zhang (UCLA Head and Neck Surgery)
18-Sep-2023 | Democratizing speaker diarization with pyannote | Hervé Bredin (Institut de Recherche en Informatique de Toulouse) and Marvin Lavechin (Meta AI, ENS)
7-Aug-2023 | Overview of Zero-Shot Multi-speaker TTS Systems | Edresson Casanova (Coqui)
17-Jul-2023 | Considerations for Identifying Biomarkers of Spoken Language Outcomes for Neurodevelopmental Conditions | Karen Chenausky (Harvard Medical School & Massachusetts General Hospital)
15-May-2023 | The Potential of smartphones voice recordings to monitor depression severity | Nicholas Cummins (King's College London)
1-May-2023 | Reading group session: Introductory overview of self-supervised learning, transformers, and attention | Daniel Low (Harvard & MIT)
20-Mar-2023 | Accuracy of Acoustic Measures of Voice via Telepractice Videoconferencing Platforms | Hasini Weerathunge (Boston University)
6-Mar-2023 | Casual discussion on audio quality control and preprocessing | Daniel Low (Harvard & MIT)
26-Jan-2023 | Developing speech-based clinical analytics models that generalize: Why is it so hard and what can we do about it? | Visar Berisha (Arizona State University)
12-Jan-2023 | Using knockoffs for controlled predictive biomarker identification | Kostas Sechidis (Novartis)
15-Dec-2022 | Provide ideas and feedback on the protocol for a large-scale data collection effort (N=5k) on mental health and voice from the NIH Bridge2AI | Daniel Low (Harvard & MIT)
1-Dec-2022 | Inferring neuropsychiatric conditions from language: how specific are transformers and traditional ML pipelines in a multi-class setting? | Lasse Hansen (Aarhus University) & Roberta Rocca (Aarhus University)
17-Nov-2022 | Meet and greet / intros
3-Nov-2022 | Speech and Voice-based Detection of Mental and Neurological Disorders: Traditional vs Deep Representation and Explainability | Björn Schuller (Imperial College London)
20-Oct-2022 | What do machines hear? Overview of deep learning approaches for representing voice | Gasser Elbanna (EPFL & MIT)