TaRSila

Automatic Speech Recognition and Speech Synthesis at the Center for AI

Tarefa de Anotação para o Reconhecimento e Síntese de fala da Língua Portuguesa (Annotation Task for Portuguese Speech Recognition and Synthesis)

The TaRSila project aims to expand speech datasets for Brazilian Portuguese, with the goal of achieving state-of-the-art results on the following tasks:

(a) automatic speech recognition (ASR), which transcribes speech into text;

(b) multi-speaker speech synthesis (TTS), which generates speech in the voices of several different speakers;

(c) speaker identification/verification, which selects a speaker from a set of predefined members, either in a closed-set scenario (the speakers were seen during the training of the models) or in an open-set scenario (verification occurs with speakers not seen during the training of the models); and

(d) voice cloning, which uses a dataset of only a few seconds or minutes of a target voice to train a synthesis model that can read any text in that voice.
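The distinction in item (c) between closed-set and open-set scenarios can be illustrated with a minimal sketch. It assumes that each utterance has already been converted into a fixed-length speaker embedding by some model; the toy vectors, the function names, and the 0.8 threshold are hypothetical and are not taken from the TaRSila models.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_closed_set(probe, enrolled):
    """Closed-set identification: always return the closest enrolled speaker."""
    return max(enrolled, key=lambda name: cosine(probe, enrolled[name]))


def verify_open_set(probe, reference, threshold=0.8):
    """Open-set verification: accept only if the two embeddings are similar enough.

    The threshold is a hypothetical value; in practice it would be tuned on held-out data.
    """
    return cosine(probe, reference) >= threshold


# Toy embeddings standing in for the output of a speaker encoder (assumption).
enrolled = {"ana": [1.0, 0.0, 0.0], "bruno": [0.0, 1.0, 0.0]}
probe = [0.9, 0.1, 0.0]

best = identify_closed_set(probe, enrolled)
same_speaker = verify_open_set(probe, enrolled["ana"])
unknown_speaker = verify_open_set([0.0, 0.0, 1.0], enrolled["ana"])
```

In the closed-set case the answer is always one of the enrolled names, whereas the open-set check can reject a voice that matches no reference.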

In TaRSila, we manually validated speech datasets from academic projects such as: (i) NURC-Recife (OLIVEIRA JR, 2016); (ii) SP 2010 (MENDES, 2013); (iii) ALIP (GONÇALVES, 2019); and (iv) C-ORAL Brasil (RASO & MELLO, 2012).

A collection of 365 hours of Museu da Pessoa (MuPe) life stories was processed to become part of our large corpus CORAA (CORpus de Áudios Anotados), and the NURC-SP Audio Corpus was also processed for the purpose of training ASR models. See CORAA Versions for details of all the datasets created.

Regarding tools, we aim to investigate recent deep learning methods for training robust ASR and TTS models for Portuguese.

The project also foresees applications in semantic search from speech transcriptions, as well as sentiment analysis and automatic organization of speech datasets into topics.

This project is part of the Natural Language Processing initiative (NLP2) of the Center for Artificial Intelligence (C4AI) of the University of São Paulo, sponsored by IBM and FAPESP (grant #2019/07665-4). The center is part of the FAPESP Engineering Research Centers Program and is committed to state-of-the-art research in Artificial Intelligence, exploring both foundational issues and applied research. See also the NLP2 web portal.

Related Publications

Published & Accepted for Publication


Extended Abstracts, TCCs & Technical Reports:




Submitted:

CORAA (CORpus de Áudios Anotados)

A large multi-purpose corpus of Brazilian Portuguese audio files aligned with transcriptions and manually validated. It is intended for training ASR and TTS models, as well as for sentiment analysis based on acoustic features.

MuPe Life Stories

One of the first applications of the TaRSila project will be the automatic transcription of life stories using Automatic Speech Recognition (ASR). This is the result of a partnership between TaRSila researchers, CEIA/UFG, and MuPe, a non-governmental organization. A large number of the original MuPe stories were captured in video and audio, so TaRSila ASR is planned to be used to generate transcriptions, simplifying the process of searching this large story database.

C4AI Collaborators 

References

OLIVEIRA JR., M. (2016). NURC Digital: Um protocolo para a digitalização, anotação, arquivamento e disseminação do material do Projeto da Norma Urbana Linguística Culta (NURC). CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos, 3(2), 149–174. Retrieved from https://revistas.uam.es/chimera/article/view/6519.

MENDES, R. B. (2013). Projeto SP2010: Amostra da fala paulistana. Available at <http://projetosp2010.fflch.usp.br>. Accessed 06 June 2021.

GONÇALVES, S. C. L. (2019). Projeto ALIP (Amostra Linguística do Interior Paulista) e banco de dados Iboruna: 10 anos de contribuição com a descrição do português brasileiro. Estudos Linguísticos (São Paulo. 1978), v. 48, p. 276-297.

RASO, T.; MELLO, H. (2012). The C-ORAL-BRASIL I: Reference Corpus for Informal Spoken Brazilian Portuguese. Lecture Notes in Artificial Intelligence, v. 7243, p. 362-368.

The TaRSila logo was designed by Paula Marin de Oliveira and Bruno Baldissera Carlotto.

The CORAA logo was designed by Paulo Matheus Silva Oliveira.