I am a tenure-track Assistant Professor in the College of Information Science at the University of Arizona.
I earned my Ph.D. in Electrical and Computer Engineering and B.S. in Electrical Engineering from the University of Illinois Urbana-Champaign, where I was fortunate to be advised by Professor Mark Hasegawa-Johnson. Following my Ph.D., I spent time as a visitor in the WAV Lab at the Language Technologies Institute at Carnegie Mellon University, under the supervision of Professor Shinji Watanabe. During my Ph.D., I interned at Meta AI Research and Amazon Web Services.
My research goal is to build interdisciplinary speech applications that support the early identification of developmental disorders in children, such as autism, delayed speech maturity, and language disorders. During my doctoral studies, I focused on developing emerging AI-powered clinical applications for early childhood (under 4 years old), with an emphasis on several core tasks: speaker diarization (identifying "who spoke when"), vocalization classification (identifying the type of vocalization produced by a given speaker) for infants/toddlers and adults, and phoneme recognition for toddlers. These tasks have been developed for a range of applications and social contexts, including daylong home recordings and clinical settings, as illustrated by the representative work below:
Monitoring infant psychological development
Li et al., "Analysis of Acoustic and Voice Quality Features for the Classification of Infant and Mother Vocalizations," Speech Communication, vol. 133, 2021, pp. 41-61.
Li et al., "Towards Robust Family-Infant Audio Analysis Based on Unsupervised Pretraining of Wav2vec 2.0 on Large-Scale Unlabeled Family Audio," Proc. Interspeech 2023, pp. 1035-1039, doi: 10.21437/Interspeech.2023-460.
Identifying children at risk of autism / assessing speech maturity of young children
Li et al., "Enhancing Child Vocalization Classification with Phonetically-Tuned Embeddings for Assisting Autism Diagnosis," Proc. Interspeech 2024, pp. 5163-5167, doi: 10.21437/Interspeech.2024-540.
Phoneme recognition of toddlers
Li, et al. "Analysis of Self-Supervised Speech Models on Children’s Speech and Infant Vocalizations," 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Seoul, Korea, Republic of, 2024, pp. 550-554, doi: 10.1109/ICASSPW62465.2024.10626416.s
Review of methods and challenges in analysis of naturalistic recordings in early childhood
Li, et al. "Automated Analysis of Naturalistic Recordings in Early Childhood: Applications, Challenges, and Opportunities", IEEE Signal Processing Magazine, 2025
Looking ahead, I aim to improve the performance of these core tasks and to explore novel AI applications of children's speech processing in healthcare and education.
Academic family tree: me – Hasegawa-Johnson – Stevens – Beranek – Hunt – Chaffee – Pierce – Macfarlane – Tait – Hopkins – Sedgwick – Jones – Postlethwaite – Whisson – Taylor – Smith – Cotes – Newton.