E. CASANOVA; A. CANDIDO JR; C. SHULBY; F. S. OLIVEIRA; J. P. TEIXEIRA; M. A. PONTI; S. M. ALUÍSIO. TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese. Language Resources and Evaluation, 2022. (https://github.com/Edresson/TTS-Portuguese-Corpus)
E. CASANOVA; C. SHULBY; E. GÖLGE; N. M. MÜLLER; F. S. OLIVEIRA; A. CANDIDO JR; A. S. SOARES; S. M. ALUÍSIO; M. A. PONTI. SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model. In: INTERSPEECH, 2021, Brno. Interspeech 2021, ISCA, 2021. p. 3645-3649.
E. CASANOVA; A. CANDIDO JR; C. SHULBY; F. S. OLIVEIRA; L. R. S. GRIS; H. P. SILVA; S. M. ALUÍSIO; M. A. PONTI. Speech2Phone: A new and efficient method for training speaker recognition models. In: 10th Brazilian Conference on Intelligent Systems (BRACIS), 2021.
Lucas Rafael Stefanel Gris, Edresson Casanova, Frederico Santos de Oliveira, Anderson da Silva Soares, Arnaldo Candido Junior. Desenvolvimento de um modelo de reconhecimento de voz para o Português Brasileiro com Poucos Dados Utilizando o Wav2vec 2.0. In: Anais do XV Brazilian e-Science Workshop. SBC, 2021. p. 129-136.
GONZAGA, V. M.; MURRUGARRA-LLERENA, N.; MARCACINI, R. M. Multimodal intent classification with incomplete modalities using text embedding propagation. In: Brazilian Symposium on Multimedia and Web (WebMedia), 2021 (Best Short Paper). https://doi.org/10.1145/3470482.3479636.
Gôlo, Marcos PS, Rafael G. Rossi, and Ricardo M. Marcacini. "Triple-VAE: A Triple Variational Autoencoder to Represent Events in One-Class Event Detection." In Anais do XVIII Encontro Nacional de Inteligência Artificial e Computacional, pp. 643-654. SBC, 2021. (Best Paper ENIAC 2021 - Main track) https://doi.org/10.5753/eniac.2021.18291
Souza, Mariana C. de; Bruno M. Nogueira; Rafael G. Rossi; Ricardo M. Marcacini, and Solange O. Rezende. A Heterogeneous Network-Based Positive and Unlabeled Learning Approach to Detect Fake News. In Brazilian Conference on Intelligent Systems, pp. 3-18. Springer, Cham, 2021. https://dx.doi.org/10.1007/978-3-030-91699-2_1
Souza, Mariana C, Bruno Magalhães Nogueira, Rafael Geraldeli Rossi, Ricardo Marcondes Marcacini, Brucce Neves Dos Santos, and Solange Oliveira Rezende. A network-based positive and unlabeled learning approach for fake news detection. Machine Learning (2021): 1-44. https://dx.doi.org/10.1007/s10994-021-06111-6
Mattos, Joao Pedro Rodrigues; and Ricardo M. Marcacini. Semi-Supervised Graph Attention Networks for Event Representation Learning. In 2021 IEEE International Conference on Data Mining (ICDM), pp. 1234-1239. IEEE, 2021. https://doi.org/10.1109/ICDM51629.2021.00150
Carmo, Paulo, and Ricardo Marcacini. Embedding propagation over heterogeneous event networks for link prediction. In 2021 IEEE International Conference on Big Data (Big Data), pp. 4812-4821. IEEE, 2021. https://doi.org/10.1109/BigData52589.2021.9671645
Lucas Rafael Stefanel Gris, Edresson Casanova, Frederico Oliveira, Anderson da Silva Soares and Arnaldo Candido Junior. Brazilian Portuguese Speech Recognition Using Wav2vec 2.0. In Computational Processing of the Portuguese Language - 15th International Conference, PROPOR 2022, Fortaleza, Brazil, March 21-23, 2022, Proceedings. Lecture Notes in Computer Science 13208, Springer 2022, ISBN 978-3-030-98304-8. https://dblp.org/rec/conf/propor/GrisCOSJ22
Edresson Casanova, Julian Weber, Christopher Shulby, Arnaldo Candido Junior, Eren Gölge, Moacir Antonelli Ponti. YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone. Pre-print version. Accepted for publication in The 39th International Conference on Machine Learning (ICML 2022).
Marcelo Matheus Gauy & Marcelo Finger. "Pretrained audio neural networks for Speech emotion recognition in Portuguese". In the Proceedings of the First Workshop on Automatic Speech Recognition for Spontaneous and Prepared Speech & Speech Emotion Recognition in Portuguese, co-located with PROPOR 2022. March 21st, 2022 (Online). Vol. 1, pp. 15-24, 2022. Link: https://sites.google.com/view/ser2022/
Caroline Alves, Bruno Carlotto, Bruno Dias, Anátale Garcia, Bruno Gianesi, Renan Izaias, Maria Luiza Morais, Paula Oliveira, Vinícius G. Santos, Rafael Sicoli, Flaviane R. Fernandes Svartman, Sandra Aluísio & Sidney Leal. "Transfer Learning and Data Augmentation Techniques applied to Speech Emotion Recognition in SE&R 2022". In the Proceedings of the First Workshop on Automatic Speech Recognition for Spontaneous and Prepared Speech & Speech Emotion Recognition in Portuguese, co-located with PROPOR 2022. March 21st, 2022 (Online). Vol. 1, pp. 25-36, 2022. Link: https://sites.google.com/view/ser2022/
Alexander Scaranti, Douglas Silva, Fernando Meloni & Alessandra Alaniz. "Speech Emotion Recognition in Portuguese for SofiaFala: SER SofiaFala". In the Proceedings of the First Workshop on Automatic Speech Recognition for Spontaneous and Prepared Speech & Speech Emotion Recognition in Portuguese, co-located with PROPOR 2022. March 21st, 2022 (Online). Vol. 1, pp. 37-41, 2022. Link: https://sites.google.com/view/ser2022/
Arnaldo Candido Junior, Edresson Casanova & Ricardo Marcacini. "Overview of the Automatic Speech Recognition for Spontaneous and Prepared Speech & Speech Emotion Recognition in Portuguese (S&ER) Shared-tasks at PROPOR 2022". In the Proceedings of the First Workshop on Automatic Speech Recognition for Spontaneous and Prepared Speech & Speech Emotion Recognition in Portuguese, co-located with PROPOR 2022. March 21st, 2022 (Online). Vol. 1, pp. 1-8, 2022. Link: https://sites.google.com/view/ser2022/
Santos, V.G., Alves, C.A., Carlotto, B.B., Papa Dias, B.A., Stefanel Gris, L.R., Lima Izaias, R.d., Azevedo de Morais, M.L., Marin de Oliveira, P., Sicoli, R., Svartman, F.R.F., Leite, M.Q., Aluísio, S.M. (2022) CORAA NURC-SP Minimal Corpus: a manually annotated corpus of Brazilian Portuguese spontaneous speech. Proc. IberSPEECH 2022, 161-165, doi: 10.21437/IberSPEECH.2022-33
Candido Junior, A., Casanova, E., Soares, A. et al. CORAA ASR: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese. Lang Resources & Evaluation (2022). https://doi.org/10.1007/s10579-022-09621-4
GRIS, L. R. S.; CANDIDO JUNIOR, A.; SANTOS, V. G.; DIAS, B. A. P.; LEITE, M. Q.; SVARTMAN, F. R. F.; ALUISIO, SANDRA. Bringing NURC/SP to Digital Life: the Role of Open-source Automatic Speech Recognition Models. In: XIX ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 2022, Campinas/SP. Anais do XIX Encontro Nacional de Inteligência Artificial e Computacional. Porto Alegre, Brasil: SBC, 2022. p. 330-341. https://sol.sbc.org.br/index.php/eniac/article/view/22793/22616
Casanova, E., Shulby, C., Korolev, A., Junior, A.C., Soares, A.d.S., Aluísio, S., Ponti, M.A. (2023) ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion. Proc. INTERSPEECH 2023, 1244-1248, doi: 10.21437/Interspeech.2023-496
Mendes da Silva, A.C., Silva, D.F. and Marcacini, R.M., 2022, December. Heterogeneous Graph Neural Network for Music Emotion Recognition. In 23rd International Society for Music Information Retrieval Conference (ISMIR 2022). https://archives.ismir.net/ismir2022/paper/000080.pdf
Moraes, Leonardo, Ricardo Marcondes Marcacini, and Rudinei Goularte. "Video Summarization using Text Subjectivity Classification." In Proceedings of the Brazilian Symposium on Multimedia and the Web, pp. 133-141. 2022. https://doi.org/10.1145/3539637.3556998
Toledo, G.L. and Marcacini, R.M., 2022. Transfer Learning with Joint Fine-Tuning for Multimodal Sentiment Analysis. In LatinX Workshop at The Thirty-ninth International Conference on Machine Learning (LatinX @ICML 2022). Extended Abstract Paper. https://www.youtube.com/watch?v=6iVm8jl27xI
Rodrigues A. C., Marcacini R. M. Sentence Similarity Recognition in Portuguese from Multiple Embedding Models. In 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA) 2022 Dec 12 (pp. 154-159). IEEE. https://doi.org/10.1109/ICMLA55696.2022.00029
Edresson Casanova, Vinicius G. Santos, Flaviane R. Fernandes Svartman, Marli Quadros Leite, Arnaldo Candido Jr., Ricardo M. Marcacini, Solange O. Rezende & Sandra Maria Aluisio. Recursos para o processamento de fala. In: Caseli, H.M.; Nunes, M.G.V. (eds.). Processamento de Linguagem Natural: Conceitos, Técnicas e Aplicações em Português. BPLN, 2023. Available at: https://brasileiraspln.com/livro-pln/1a-edicao/parte2/cap3/cap3.html
Leal, S.E., Duran, M.S., Scarton, C.E., Aluisio, S.M. NILC-Metrix: assessing the complexity of written and spoken language in Brazilian Portuguese. Lang Resources & Evaluation (2023). https://doi.org/10.1007/s10579-023-09693-w
Frederico S. Oliveira, Edresson Casanova, Arnaldo Candido Junior, Anderson S. Soares & Arlindo R. Galvão Filho. CML-TTS: A Multilingual Dataset for Speech Synthesis in Low-Resource Languages. In: Text, Speech, and Dialogue (TSD 2023), 2023, Plzeň, Czechia. Text, Speech, and Dialogue. Cham: Springer Nature Switzerland, 2023. p. 188-199.
Frederico S. Oliveira, Edresson Casanova, Arnaldo Cândido Júnior, Lucas R. S. Gris, Anderson S. Soares, Arlindo R. Galvão Filho. Evaluation of Speech Representations for MOS Prediction. In: Text, Speech, and Dialogue (TSD 2023), 2023, Plzeň, Czechia. Text, Speech, and Dialogue. Cham: Springer Nature Switzerland, 2023. p. 270-282.
TOMITA, Victor Akihito Kamada; DA SILVA, Angelo Cesar Mendes; MARCACINI, Ricardo Marcondes. Cluster Fusion Training: Exploring Cluster Analysis to Enhance Cross-Domain Sentiment Classification. In: Anais do XX Encontro Nacional de Inteligência Artificial e Computacional. SBC, 2023. p. 330-344. Note: Third place (best paper session).
MORAES, Marcelo Isaias; MARCACINI, Ricardo Marcondes. On the Use of Aggregation Functions for Semi-Supervised Network Embedding. In: 2023 International Joint Conference on Neural Networks (IJCNN). IEEE, 2023. p. 1-8.
Edresson Casanova, Sandra Aluísio, and Moacir Antonelli Ponti. 2024. TTS applied to the generation of datasets for automatic speech recognition. In Proceedings of the 16th International Conference on Computational Processing of Portuguese (Propor 2024), pages 633–638, Santiago de Compostela, Galicia/Spain. Association for Computational Linguistics (LINK: https://aclanthology.org/2024.propor-1.73).
Giovana Meloni Craveiro, Vinicius Gonçalves Santos, Gabriel Jose Pellisser Dalalana, Flaviane R. Fernandes Svartman, and Sandra Maria Aluísio. 2024. Simple and Fast Automatic Prosodic Segmentation of Brazilian Portuguese Spontaneous Speech. In Proceedings of the 16th International Conference on Computational Processing of Portuguese (PROPOR 2024), pages 32–44, Santiago de Compostela, Galicia/Spain. Association for Computational Linguistics (LINK: https://aclanthology.org/2024.propor-1.4/).
Ana Carolina Rodrigues, Alessandra A. Macedo, Arnaldo Candido Jr, Flaviane R. F. Svartman, Giovana M. Craveiro, Marli Quadros Leite, Sandra M. Aluísio, Vinícius G. Santos, and Vinícius M. Garcia. 2024. Portal NURC-SP: Design, Development, and Speech Processing Corpora Resources to Support the Public Dissemination of Portuguese Spoken Language. In Proceedings of the 16th International Conference on Computational Processing of Portuguese (PROPOR 2024), pages 187–195, Santiago de Compostela, Galicia/Spain. Association for Computational Linguistics (LINK: https://aclanthology.org/2024.propor-1.19).
Ariadne Matos, Gustavo Araújo, Arnaldo Candido Junior, and Moacir Ponti. 2024. Accent Classification is Challenging but Pre-training Helps: a case study with novel Brazilian Portuguese datasets. In Proceedings of the 16th International Conference on Computational Processing of Portuguese (Propor 2024), pages 364–373, Santiago de Compostela, Galicia/Spain. Association for Computational Linguistics (LINK: https://aclanthology.org/2024.propor-1.37).
Craveiro, G.M., Galdino, J.C. (2025). Diversity in Data for Speech Processing in Brazilian Portuguese. In: Paes, A., Verri, F.A.N. (eds) Intelligent Systems. BRACIS 2024. Lecture Notes in Computer Science (LNCS, volume 15415). Springer, Cham. https://doi.org/10.1007/978-3-031-79038-6_9
Lima, R., Leal, S.E., Junior, A.C., Aluísio, S.M. (2025). A Large Dataset of Spontaneous Speech with the Accent Spoken in São Paulo for Automatic Speech Recognition Evaluation. In: Paes, A., Verri, F.A.N. (eds) Intelligent Systems. BRACIS 2024. Lecture Notes in Computer Science (LNCS, volume 15412). Springer, Cham. https://doi.org/10.1007/978-3-031-79029-4_3. Pre-print version
ARAÚJO, Gustavo E. et al. EyetrackingMOS: Proposta de um método de avaliação online para modelos de síntese de fala. In: SIMPÓSIO BRASILEIRO DE TECNOLOGIA DA INFORMAÇÃO E DA LINGUAGEM HUMANA (STIL), 15., 2024, Belém/PA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 87-96. Link: https://sol.sbc.org.br/index.php/stil/article/view/31120.
GALDINO, Julio Cesar; ARAÚJO, Gustavo Evangelista; CANDIDO JUNIOR, Arnaldo; OLIVEIRA JR., Miguel; PONTI, Moacir Antonelli; ALUÍSIO, Sandra Maria. Acoustic Analysis of Prosodic Features in Natural versus Synthesized Speech Samples from YourTTS and SYNTACC Models. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 21., 2024, Belém/PA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 304-315. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2024.245092
Sidney Evaldo Leal, Arnaldo Candido Junior, Ricardo Marcacini, Edresson Casanova, Odilon Gonçalves, Anderson Silva Soares, Rodrigo Freitas Lima, Lucas Rafael Stefanel Gris, and Sandra Aluísio. 2025. MuPe Life Stories Dataset: Spontaneous Speech in Brazilian Portuguese with a Case Study Evaluation on ASR Bias against Speakers Groups and Topic Modeling. In Proceedings of the 31st International Conference on Computational Linguistics, pages 6076–6087, Abu Dhabi, UAE. Association for Computational Linguistics (Coling 2025).
Lucas Gris, Ricardo Marcacini, Arnaldo Candido Junior, Edresson Casanova, Anderson Soares, Sandra Maria Aluísio. Evaluating OpenAI's Whisper ASR for Punctuation Prediction and Topic Modeling of life histories of the Museum of the Person. In: Prosodic Interfaces: Interdisciplinary Perspectives on Sound Patterns and Human Interaction, De Gruyter, p. 247-279, 2025. https://doi.org/10.1515/9783111060309-009
Oliveira, Jr., Miguel. Prosodic Interfaces: Interdisciplinary Perspectives on Sound Patterns and Human Interaction, De Gruyter, 330 p., 2025.
Julio Cesar Galdino, Ariadne Nascimento Matos, Flaviane Romani Fernandes Svartman. The evaluation of prosody in speech synthesis: a systematic review. Journal of the Brazilian Computer Society, 2025, 31:1, doi: 10.5753/jbcs.2025.XXXX. (Accepted for publication in May 2025).
Galdino, J., Leal, S., de Souza, L., Lima, R., Moreira, A., Candido Jr., A., Oliveira Jr., M., Casanova, E., Aluísio, S. (2025). The Impact of Prosodic Segmentation on Speech Synthesis of Spontaneous Speech. To appear in BRACIS 2025.
Giovana Meloni Craveiro, Caroline Adriane Alves, Flaviane Svartman, Sandra M. Aluísio. Machine Learning Classifiers with Acoustic Features for Prosodic Segmentation in Brazilian Portuguese: A Comprehensive Evaluation. To appear in STIL 2025.