Publications
For an updated list of my publications, please see my Google Scholar.
L. Della Libera, F. Paissan, C. Subakan, M. Ravanelli, "FocalCodec: Low-bitrate speech coding via focal modulation networks", In Proceedings of NeurIPS, 2025, [pdf], [code], [pretrained model], [demo].
P. Mousavi, G. Maimon, A. Moumen, D. Petermann, J. Shi, H. Wu, H. Yang, A. Kuznetsova, A. Ploujnikov(*), R. Marxer, B. Ramabhadran, B. Elizalde, L. Lugosch, J. Li, C. Subakan, P. Woodland, M. Kim, H-Y. Lee, S. Watanabe, Y. Adi, M. Ravanelli, "Discrete Audio Tokens: More Than a Survey!", Transactions on Machine Learning Research (TMLR), 2025, [pdf].
L. Della Libera, J. Andreoli, D. Dalle Pezze, M. Ravanelli, G. Antonio Susto, "Bayesian Deep Learning for Remaining Useful Life Estimation via Stein Variational Gradient Descent", IEEE Transactions on Automation Science and Engineering, 2025, [pdf].
D. Borra, E. Magosso, M. Ravanelli, "A protocol for trustworthy EEG decoding with neural networks", Neural Networks, Vol. 182, 2025, [pdf].
S. Zaiem, Y. Kemiche, T. Parcollet, S. Essid, M. Ravanelli, "Speech self-supervised representations benchmarking: A case for larger probing heads", Computer Speech & Language, Volume 89, January 2025, [pdf].
E. Mancini, F. Paissan, M. Ravanelli, C. Subakan, "LMAC-TD: Producing Time Domain Explanations for Audio Classifiers", In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), [pdf].
Y. Wang, P. Mousavi, A. Ploujnikov, M. Ravanelli, "What Are They Doing? Joint Audio-Speech Co-Reasoning", In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025, [pdf].
P. Plantinga, B. Cordelle, D. Louër, M. Ravanelli, D. Klein, "Does Language Matter for Early Detection of Parkinson's Disease from Speech?", In Proceedings of the IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP), 2025, [pdf].
F. Öncel, E. Penaloza, H. Wu, S. Gupta, M. Ravanelli, L. Charlin, "Audio Prototypical Network for Controllable Music Recommendation", In Proceedings of the IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP), 2025, [pdf].
P. Mousavi, S. Gupta, C. Subakan, M. Ravanelli, "LiSTEN: Learning soft token embeddings for neural audio LLMs", In Proceedings of Interspeech, 2025, [pdf].
Y. Wang, A. Alhmoud, S. Alsahly, M. Alqurishi, M. Ravanelli, "Calm-Whisper: Reduce Whisper Hallucination On Non-Speech By Calming Crazy Heads Down", In Proceedings of Interspeech, 2025, [pdf].
E. Mancini, F. Paissan, P. Torroni, M. Ravanelli, C. Subakan, "Investigating the Effectiveness of Explainability Methods in Parkinson's Detection from Speech", ICASSP 2025 (SPADE Workshop). [pdf].
M. Ravanelli, T. Parcollet, A. Moumen, S. de Langen, C. Subakan, P. Plantinga, Y. Wang, P. Mousavi, L. Della Libera, A. Ploujnikov, F. Paissan, D. Borra, S. Zaiem, Z. Zhao, S. Zhang, G. Karakasidis, S.-L. Yeh, P. Champion, A. Rouhe, R. Braun, F. Mai, J. Zuluaga-Gomez, S. M. Mousavi, A. Nautsch, H. Nguyen, X. Liu, S. Sagar, J. Duret, S. Mdhaffar, G. Laperrière, M. Rouvier, R. De Mori, Y. Estève, "Open-Source Conversational AI with SpeechBrain 1.0", Journal of Machine Learning Research, vol. 25, no. 333, pp. 1-11, 2024. [pdf] [code]
G. A. D’Inverno, S. Brugiapaglia, M. Ravanelli, "Generalization Limits of Graph Neural Networks in Identity Effects Learning", Neural Networks, 2025. [pdf].