NAVER LABS Europe Submission to the Instruction-following Track. [PDF]
From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM. [PDF]
mHuBERT-147: A Compact Multilingual HuBERT Model. [PDF]
Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts. [PDF]
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech. [PDF]
NAVER LABS Europe’s Multilingual Speech Translation Systems for the IWSLT 2023 Low-Resource Track. [PDF]
A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems. [PDF]
Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings. [PDF]
Speech Resources in the Tamasheq Language. [PDF]
Findings of the IWSLT 2022 Evaluation Campaign. [PDF]
ON-TRAC Consortium Systems for the IWSLT 2022 Dialect and Low-resource Speech Translation Tasks. [PDF]
Promises and Limitations of Self-supervised Learning for Automatic Speech Processing. [PDF]
LeBenchmark, un référentiel d’évaluation pour le français oral. [PDF] (French only)
Modèles neuronaux pré-appris par auto-supervision sur des enregistrements de parole en français. [PDF] (French only)
Task Agnostic and Task Specific Self-Supervised Learning from Speech with LeBenchmark. [PDF]
LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech. [PDF]
Investigating Alignment Interpretability for low-resource NMT. [PDF]
Investigating Language Impact in Bilingual Approaches for Computational Language Documentation. [PDF]
MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible. [PDF]
ON-TRAC Consortium End-to-End Speech Translation Systems for the IWSLT 2019 Shared Task. [PDF]
How Does Language Influence Documentation Workflow? Unsupervised Word Discovery Using Translations in Multiple Languages. [PDF]
Empirical Evaluation of Sequence-to-Sequence Models for Word Discovery in Low-resource Settings. [PDF]
Unsupervised Word Segmentation From Speech With Attention. [PDF]
A small Griko-Italian speech translation corpus. [PDF]
A very low resource language speech corpus for computational language documentation experiments. [PDF]
Unwritten Languages Demand Attention Too! Word Discovery with Encoder-Decoder Models. [PDF]
Unsupervised Word Discovery Using Encoder-Decoder Models. [PDF]
Size does not matter. Frequency does. A study of features for measuring lexical complexity. [PDF]
Uma análise do perfil de entropia das estruturas sintáticas do português. [PDF]
Models and Resources for Attention-based Unsupervised Word Segmentation. [PDF]
Unsupervised Word Discovery Using Attentional Encoder-Decoder Models. [PDF]
Reviewing:
PC: LREC 2020, ACL 2020, SLTU-CCURL 2020, EMNLP 2020, EACL 2021, EMNLP 2021, ACL 2022, LREC 2022, SIGUL 2022, NAACL 2022, EACL 2022, GITT 2023, ILLC-NLP 2024, INTERSPEECH 2024, ICASSP 2024, EMT 2025 Thesis award, INTERSPEECH 2025, ARR May 2025
External Reviewer: SBAC-PAD 2018
Communications/Social Media/Website:
Website Chair: CoNLL 2019
Social Media and Communications Chair: PROPOR 2018
Internal Communications Chair: ACL 2022
Publicity and Social Media Chair: EACL 2026
Organization:
Local Organization Committee: LTT 2018, TALN 2022, RÉCITAL 2022
Task Organizer: IWSLT 2022 (low-resource track)
Co-chair: RÉCITAL 2022, SASB 2023, SASB 2024