I am a postdoctoral researcher at KTH Royal Institute of Technology in Sweden. I received my PhD in Computational Linguistics from McGill University in 2021. My interests span Natural Language Processing and Speech Processing. My postdoctoral research focuses on the semantic (text) and acoustic (speech) aspects of dialogue, particularly turn-taking prediction.
During my PhD, I worked on a range of topics from speech perception to tone modelling. In particular, I worked on bridging speech technologies (ASR) with human speech perception, and on learning interpretable tone representations from large speech corpora (in submission). Early in my PhD, I also worked on statistical modelling of human speech perception of multidimensional tonal systems.
We recently had two papers accepted to Findings of ACL 2023 and ICPhS 2023.
Response-conditioned Turn-taking Prediction (text-based): we built a transformer-based GPT model for turn prediction. We combine turn-taking prediction and response ranking into a one-stage process, which yields consistent improvements over the traditional two-stage process.
Investigating the turn-holding effects of fillers (audio-based): we used a voice activity prediction model trained with a CPC encoder to analyze the effects of the fillers "uh" and "um". We investigated their prosodic and lexical effects using survival analysis.