Call For Papers

[SE&R 2022 @ PROPOR 2022 - deadline extension]


*The submission deadline for SE&R 2022 has been extended to January 30th.*

Do you work with Speech Processing in Portuguese?

We invite everyone to explore two important tasks for this language: (1) Automatic Speech Recognition (ASR) for Spontaneous and Prepared Speech, and (2) Speech Emotion Recognition (SER).

The SE&R 2022 Workshop is co-located with the 15th edition of the International Conference on the Computational Processing of Portuguese (PROPOR 2022) and introduces two versions of a new dataset called CORAA (Corpus of Annotated Audios), built in the TaRSila project, an effort of the Center for Artificial Intelligence (C4AI).

https://sites.google.com/view/ser2022/home


The CORAA ASR dataset version 1.1 is composed of 290.79 hours of transcribed audios from spontaneous and prepared speech. It includes European Portuguese (2.68 hours) and Brazilian Portuguese (the remaining hours). The spontaneous speech part of this dataset contains four main Brazilian accents (São Paulo State Cities, Minas Gerais, Recife, São Paulo Capital) and the prepared speech part includes speakers from many different regions of Brazil and Portugal.

In the SE&R 2022 ASR TASK, participants train their own models on the resources made available specifically for the challenge (closed data) and may also use other open resources. Any publicly available additional data or pre-trained models are permitted. In the submission, participants will be asked to indicate whether their models use open data or closed data.


We provide a strong baseline, consisting of a pre-trained version of the Wav2Vec 2.0 model, and strongly recommend that participants use this model for transfer learning.

Each participant can submit up to four models and can assign these models to four subtasks to guide the evaluation:

- Mixed (all datasets)

- Prepared Speech PT_BR (TEDx Portuguese)

- Prepared Speech PT_PT (TEDx Portuguese)

- Spontaneous Speech (ALIP, C-Oral Brasil, SP2010, NURC-RE*)

* (also contains prepared speech audios/transcriptions)

The results will be released per subtask, ranked by CER (Character Error Rate); WER (Word Error Rate) will also be reported. In particular, results for models participating in the Spontaneous Speech subtask will be reported by accent.
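For participants who want to check their own outputs before submitting, the two metrics above can be sketched as follows. This is only an illustrative sketch, not the official scoring script: it assumes CER and WER are the Levenshtein edit distance divided by the reference length (in characters and in whitespace-separated words, respectively), and it applies no text normalization. The function names are hypothetical, and the official evaluation may normalize case or punctuation differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (minimum number of insertions, deletions and substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits / reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits / reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    # One wrong character in a 7-character, 2-word reference.
    print(cer("bom dia", "bon dia"))
    print(wer("bom dia", "bon dia"))
```

A single wrong character counts as one error out of seven for CER but makes one of the two words wrong for WER, which is why the two metrics can rank systems differently.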

The CORAA SER dataset version 1.0 is composed of approximately 50 minutes of audio segments labeled in three classes: neutral, non-neutral female, and non-neutral male. While the neutral class represents audio segments with no well-defined emotional state, the non-neutral classes represent audio segments associated with one of the primary emotional states in the speaker's speech. This dataset was built from the C-ORAL-BRASIL I corpus. The available corpus consists of audio segments representing Brazilian Portuguese informal spontaneous speech. The non-neutral classes were labeled considering paralinguistic elements (laughing, crying, etc.). Participants can use pre-trained models and external data, as long as the original C-ORAL-BRASIL corpus (or variants) is not used for model training.

In the SE&R 2022 SER TASK, participants must train their own models using acoustic audio features. A training set is released. We provide two baseline models. The first baseline uses a set of prosodic audio features for emotion classification. In the second baseline, we use the Wav2Vec model to extract features (i.e. audio embeddings) from the audio segments. These features can be used for training a speech emotion recognition classifier.

Each participant can submit up to three models. The Macro F1 Score measure will be used to evaluate the models.
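As a reminder of how the macro-averaged F1 score behaves, here is a minimal sketch. It is not the official scoring script: the label strings and the function name are hypothetical, and a class with no true or predicted examples is given an F1 of 0.0, which is one common convention.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); 0.0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)


if __name__ == "__main__":
    # Hypothetical label names for the three classes.
    labels = ["neutral", "non-neutral-female", "non-neutral-male"]
    y_true = ["neutral", "neutral", "non-neutral-female", "non-neutral-male"]
    y_pred = ["neutral", "non-neutral-female", "non-neutral-female", "non-neutral-male"]
    print(macro_f1(y_true, y_pred, labels))
```

Because every class contributes equally to the average regardless of how many segments it has, a model that ignores a small class is penalized as heavily as one that ignores a large class.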

Here are the important dates participants should keep in mind:

* October 30, 2021: release of shared-task data (training and dev sets); registration starts

* December 15, 2021 to January 30, 2022 (extended from January 24, 2022): evaluation period; release of test data to registered participants

* January 30, 2022 (extended from January 24, 2022): paper submission deadline

* February 1, 2022: paper review starts

* February 21, 2022: notification of acceptance or rejection

* March 7, 2022: camera-ready deadline

* March 21, 2022: public release of the full datasets (ASR & SER) at the start of PROPOR 2022

* March 21, 2022: SE&R Workshop

To get started:

- register on the site of the SE&R 2022 Workshop:

https://sites.google.com/view/ser2022/shared-tasks

- download the competition data and code on the same site:

https://sites.google.com/view/ser2022/shared-tasks

Best regards,

SE&R 2022 organizers:

Alessandra Alaniz Macedo, FFCLRP/USP, Brazil (Website Chair)

Arnaldo Candido Jr., UTFPR, Brazil (Program Chair & Conference Chair)

Edresson Casanova, ICMC/USP, Brazil (Evaluation Chair)

Flaviane Romani Fernandes Svartman, FFLCH/USP, Brazil (Program Chair)

José Augusto Baranauskas, FFCLRP/USP, Brazil (Publication Chair)

Marcelo Finger, IME/USP, Brazil (Conference Chair)

Ricardo M. Marcacini, ICMC/USP, Brazil (Evaluation Chair, Publication Chair & Conference Chair)

Sandra Maria Aluísio, ICMC/USP, Brazil (Website Chair)

Solange O. Rezende, ICMC/USP, Brazil (Program Chair)