GenSEC Challenge at IEEE SLT 2024

Text-based Generative Speech Error Correction with LLMs


What is Generative Speech Error Correction (GenSEC)?

GenSEC Task 1 Description

LLM for Post-ASR Correction

This task focuses on mapping from n-best Hypotheses to the ground-truth speech Transcription (H2T). The training set includes n-best hypotheses and acoustic model (AM) scores from different pre-trained end-to-end ASR models. Participants may use embeddings from the first-pass acoustic or speech model to make the second-pass model multi-modal, either for hypothesis reranking or for direct mapping to the ground truth. The challenge aims to connect the speech community with second-pass rewriting based on large language models (LLMs).
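To make the two second-pass options concrete, here is a minimal sketch in Python. It is not the challenge baseline. The function names, the `lm_weight` parameter, the toy language-model scorer, and the prompt wording are all hypothetical and chosen only for illustration. Reranking picks one of the existing hypotheses by combining its AM score with a language-model score, while direct mapping hands the whole n-best list to an LLM as a prompt.

```python
import math
from typing import Callable, List, Tuple

# Each hypothesis is (text, am_score). Higher score = better (log-domain).
NBest = List[Tuple[str, float]]


def rerank_nbest(nbest: NBest,
                 lm_score: Callable[[str], float],
                 lm_weight: float = 1.0) -> str:
    """Return the hypothesis with the highest AM + weighted LM score."""
    best_text, best_score = "", -math.inf
    for text, am_score in nbest:
        combined = am_score + lm_weight * lm_score(text)
        if combined > best_score:
            best_text, best_score = text, combined
    return best_text


def build_h2t_prompt(nbest: NBest) -> str:
    """Format an n-best list as a text prompt for direct H2T mapping."""
    lines = ["Below are ASR hypotheses for one utterance.",
             "Write the most likely correct transcription.", ""]
    for rank, (text, _) in enumerate(nbest, start=1):
        lines.append(f"{rank}. {text}")
    return "\n".join(lines)


# Toy stand-in for a real language model (assumption for this sketch):
# subtract 1.0 for every word outside a tiny vocabulary.
TOY_VOCAB = {"i", "scream", "for", "ice", "cream"}


def toy_lm_score(text: str) -> float:
    return -float(sum(word not in TOY_VOCAB for word in text.split()))


example_nbest: NBest = [
    ("eye scream for ice cream", -1.0),
    ("i scream for ice cream", -1.2),
    ("i scream four ice cream", -1.5),
]
```

With `lm_weight=0.0` the reranker simply returns the hypothesis with the best AM score. A positive weight lets the language-model score override it. In practice `toy_lm_score` would be replaced by a real LLM log-likelihood.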


GenSEC Task 2 Description

Post-ASR Speaker Tagging Correction

Task 2 Baseline:
Baseline, Rules and Submission Guideline:


Access Task-2 Dataset on HuggingFace:

The Task 2 track provides only a development set and an evaluation set.


The result files `err_dev.hyp.seglst.json` and `err_eval.hyp.seglst.json` are automatically evaluated and added to the leaderboard.
Use your organization name and system names. You can submit multiple trials.
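As a rough illustration of what a result file could look like, the Python sketch below serializes speaker-tagged segments to JSON. It assumes the commonly used SegLST layout, a JSON list of segment objects. The field names (`session_id`, `speaker`, `words`, `start_time`, `end_time`), the helper name `to_seglst_json`, and the sample values are assumptions for this example, so check the baseline and submission guideline for the exact schema.

```python
import json
from typing import Dict, List, Union

Segment = Dict[str, Union[str, float]]

# Assumed SegLST fields; verify against the official baseline.
REQUIRED_FIELDS = {"session_id", "speaker", "words", "start_time", "end_time"}


def to_seglst_json(segments: List[Segment]) -> str:
    """Validate segments and serialize them as a JSON list."""
    for index, segment in enumerate(segments):
        missing = REQUIRED_FIELDS - set(segment)
        if missing:
            raise ValueError(f"segment {index} is missing {sorted(missing)}")
    return json.dumps(segments, indent=2, ensure_ascii=False)


# Hypothetical corrected output for one two-speaker session.
example_segments: List[Segment] = [
    {"session_id": "session_demo", "speaker": "speaker1",
     "words": "how are you today", "start_time": 0.0, "end_time": 1.8},
    {"session_id": "session_demo", "speaker": "speaker2",
     "words": "i am fine thanks", "start_time": 1.9, "end_time": 3.4},
]

seglst_text = to_seglst_json(example_segments)
```

The resulting string can then be written to `err_dev.hyp.seglst.json` or `err_eval.hyp.seglst.json` with an ordinary file write.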

Technical Papers

Please submit a challenge submission paper through [CMT system]. Papers must be between 2 and 6 pages long.
For templates, detailed requirements, please visit

June 20, 2024: Paper submission deadline
June 27, 2024: Paper update deadline

The transcripts of multiple multi-speaker datasets are anonymized and altered to construct the dev and eval sets of Task 2.

GenSEC Task 3 Description 

You can use additional datasets to train your model, but they must NOT include IEMOCAP (except for the portion we provide to you), because part of IEMOCAP serves as our evaluation data. If you use additional datasets, you must clearly list them in the paper.

System Paper Deadlines for the official IEEE SLT proceedings

Paper Submission (System and Method)

June 20, 2024
(CMT Link)

Paper Update (PDF revision)

June 27, 2024

Paper Notification (Potential Revision for Evaluation)

August 30, 2024

Organizing Chair and Committee 

Technical Committee  

Related References

Speech Emotion Recognition with ASR Transcripts: A Comprehensive Study on Word Error Rate and Fusion Techniques

Task 1. ASR-LM Correction

Multi-task LM for post-ASR and post-Translation correction

Task 2. Speaker Tagging Correction

Post-ASR Speaker Tagging Correction


Post-ASR LLM-Based Speech Emotion Recognition

Come Join Us at SLT 2024