Key publications & projects include:
Discriminating Form and Meaning in Multilingual Models (text) — EMNLP 2025 [pdf]
Multimodal grounding for multilingual speech learning (speech + vision) — manuscript under review [preprint]
PhD work on bilingual learning simulations and emergent structure from unsupervised speech input [pdf] (see chapter 4)
Key publications & projects include:
Toward Machine Interpreting — EMNLP 2025 [pdf]
Speech evaluation taxonomy — manuscript under review [preprint]
Prosody evaluation benchmarks (ProsAudit, EmphAssess) [pdf ProsAudit] [pdf EmphAssess]
Committee member, Zero Speech Challenge 2021
Below is an overview of earlier projects that laid the foundation for my current research on multilingual learning and human-aligned evaluation.
The approach is based on self-supervised models that learn language directly from raw speech. The STELA framework also makes it possible to generate comparable developmental learning curves at the phonetic and lexical levels (a sketch of the idea is given below).
More info on the project coming soon.
Overview of how learning simulations (like STELA) compare with infants.
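As a rough illustration (not the actual STELA code), the sketch below shows how such developmental curves could be generated: the same self-supervised pipeline is trained on increasing amounts of raw speech, and each resulting model is scored at the phonetic and lexical levels. The trainer, the two evaluation functions, and the training-set sizes are hypothetical placeholders.

```python
# Hypothetical sketch: developmental learning curves from nested training sets.
from typing import Dict, List

HOURS_OF_SPEECH = [100, 400, 1600, 3200]  # example training-set sizes, in hours

def train_model(hours: int):
    """Placeholder for training the self-supervised pipeline on `hours` of raw audio."""
    raise NotImplementedError

def abx_error(model) -> float:
    """Placeholder for the phonetic-level score (ABX phone discrimination error)."""
    raise NotImplementedError

def spot_the_word_accuracy(model) -> float:
    """Placeholder for the lexical-level score (spot-the-word accuracy)."""
    raise NotImplementedError

def learning_curves(hours_grid: List[int]) -> Dict[str, List[float]]:
    """Score one model per training-set size; each list is one learning curve."""
    curves: Dict[str, List[float]] = {"phonetic": [], "lexical": []}
    for hours in hours_grid:
        model = train_model(hours)
        curves["phonetic"].append(abx_error(model))
        curves["lexical"].append(spot_the_word_accuracy(model))
    return curves
```

Plotting each list against the amount of training speech gives curves that can be put side by side with what is known about infants at comparable amounts of language exposure.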
ProsAudit has been integrated into the Zero Resource Speech Challenge language modelling track.
EmphAssess focuses on the transfer of emphasis in speech-to-speech models.
From there stems another question: what is language similarity? Can models capture it automatically, and what kind of typology do they end up capturing?
I presented a paper at Speech Prosody 2022 where we ran a pilot study on capturing language typology using i-vectors. I am also looking at the effect of language similarity when modelling various speech-related cognitive processes (language discrimination and separation, the language familiarity effect, language learning...).
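To give a concrete flavour of this line of work (an illustrative sketch, not the pipeline from the Speech Prosody 2022 paper), one can average utterance-level i-vectors per language and cluster the resulting centroids into a similarity tree. The i-vectors are assumed to be precomputed; the random arrays below merely stand in for them.

```python
# Illustrative sketch: derive a language typology from precomputed utterance-level
# i-vectors by averaging them per language and clustering the language centroids.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
# Stand-in data: in practice these would be i-vectors extracted from speech.
ivectors = {
    "english": rng.normal(size=(200, 100)),  # 200 utterances x 100-dim i-vectors
    "german":  rng.normal(size=(200, 100)),
    "french":  rng.normal(size=(200, 100)),
    "spanish": rng.normal(size=(200, 100)),
}

languages = sorted(ivectors)
centroids = np.stack([ivectors[lang].mean(axis=0) for lang in languages])

# Pairwise cosine distances between language centroids define a crude similarity space.
distances = pdist(centroids, metric="cosine")

# Hierarchical clustering turns the distances into a tree, i.e. an emergent typology.
tree = linkage(distances, method="average")
dendrogram(tree, labels=languages, no_plot=True)
```

Whether such a data-driven tree lines up with established typological families is precisely the empirical question raised above.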
ZeroSpeech 2021 is a challenge aimed at Spoken Language Modelling from raw speech. The task consists of learning language models directly from raw audio in an unknown language, without any annotation or text.
For more info, check out the website (the challenge is still open for new submissions!).
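To make the task concrete, here is a deliberately simplified sketch of a typical pipeline for this setting: frame-level features from a self-supervised acoustic model are quantized into discrete pseudo-phone units, and a language model is then trained on the unit sequences. The feature arrays are random stand-ins for real speech features, and the bigram counts are a toy substitute for the neural language models used in actual submissions.

```python
# Simplified sketch of a ZeroSpeech-style pipeline: quantize self-supervised speech
# features into discrete units, then model the unit sequences with a language model.
from collections import Counter
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Stand-in for frame-level features produced by a self-supervised acoustic model
# (e.g. one feature vector per 10 ms frame, for 50 utterances of varying length).
utterances = [rng.normal(size=(rng.integers(80, 200), 39)) for _ in range(50)]

# 1. Quantize frames into K discrete "pseudo-phone" units with k-means.
kmeans = KMeans(n_clusters=50, n_init=10, random_state=0)
kmeans.fit(np.concatenate(utterances))
unit_sequences = [kmeans.predict(utt) for utt in utterances]

# 2. Train a toy language model on the unit sequences (bigram counts here;
#    real submissions use neural language models over the units).
bigrams = Counter()
for units in unit_sequences:
    bigrams.update(zip(units[:-1], units[1:]))

# The resulting model can then score unseen unit sequences, with no text or
# annotation involved at any point.
```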