Multi-speaker, Multi-lingual Indic TTS

with VOICE CLONING

LIMMITS'24

The LIMMITS ICASSP Challenge for 2025 is up at https://sites.google.com/view/limmits24/home

Challenge overview

Registration form - https://forms.gle/ch6H2vRjFaKr84V3A

Dataset - https://ee.iisc.ac.in/limmitsdataset/

Challenge submission details - https://sites.google.com/view/limmits24/challenge/challenge-submission

Target speaker reference files - https://sites.google.com/view/limmits24/dataset/few-shot-data

About LIMMITS 24 Challenge

As part of the challenge, 80 hours of TTS data are released in each of Bengali, Chhattisgarhi, English and Kannada. This is in addition to the Telugu, Hindi and Marathi data released in LIMMITS 23. Each language has a male and a female speaker, resulting in TTS corpora covering 7 languages and 14 speakers. The TTS corpora in these languages are being built as part of the SYSPIN project at SPIRE lab, Indian Institute of Science (IISc) Bangalore, India.

In this challenge, participants have the opportunity to perform TTS voice cloning with a multilingual base model of 14 speakers. We further extend this scenario by allowing training with additional multi-speaker corpora such as VCTK and LibriTTS. Finally, we also present a scenario for zero-shot voice cloning. Towards these, we share 560 hours of studio-quality TTS data in 7 Indian languages, including the low-resource language Chhattisgarhi. The evaluation will be performed on monolingual as well as cross-lingual synthesis, across data from all 7 languages, using subjective tests of naturalness and speaker similarity.
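
The corpus figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and the hours attributed to the LIMMITS 23 languages are inferred by subtraction from the 560-hour total rather than stated on this page.

```python
# Figures taken from the challenge description; all names are hypothetical.
NEW_LANGUAGES = ["Bengali", "Chhattisgarhi", "English", "Kannada"]
LIMMITS23_LANGUAGES = ["Telugu", "Hindi", "Marathi"]
HOURS_PER_NEW_LANGUAGE = 80   # released per language in LIMMITS 24
SPEAKERS_PER_LANGUAGE = 2     # one male and one female speaker
TOTAL_HOURS_SHARED = 560      # total studio-quality data shared


def corpus_summary():
    """Tally languages, speakers and hours from the stated figures."""
    languages = NEW_LANGUAGES + LIMMITS23_LANGUAGES
    new_hours = HOURS_PER_NEW_LANGUAGE * len(NEW_LANGUAGES)
    return {
        "languages": len(languages),
        "speakers": len(languages) * SPEAKERS_PER_LANGUAGE,
        "new_hours": new_hours,
        # Inferred by subtraction; not stated directly in the description.
        "limmits23_hours": TOTAL_HOURS_SHARED - new_hours,
    }


if __name__ == "__main__":
    for key, value in corpus_summary().items():
        print(f"{key}: {value}")
```

The tally agrees with the 7 languages and 14 speakers stated above, and the remainder attributed to the three LIMMITS 23 languages averages 80 hours per language.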

About SYSPIN

SYnthesizing SPeech in INdian languages (SYSPIN) is an initiative to develop large open-source text-to-speech (TTS) corpora and models for TTS systems in nine Indian languages, in the domains of agriculture and finance. The nine Indian languages considered for this project are Hindi, Bengali, Marathi, Telugu, Bhojpuri, Kannada, Magadhi, Chhattisgarhi, and Maithili.

A majority of the population in the country is still unable to use many technological services due to language and literacy constraints. SYSPIN helps lower these barriers to voice-based technologies and creates a potential market for tech innovators and social entrepreneurs.

The output of this project will allow local innovation in emerging markets to develop products and services serving illiterate Indians and rural poor populations in their own medium of engagement with technology. The TTS corpus will be a unique resource for developing assistive technologies for people with speech and visual disabilities. The proposed 720 hours of open-source TTS data will open up opportunities for academic and industrial research.

More about SYSPIN: https://syspin.iisc.ac.in/

Challenge Timeline

Registration opens - September 16, 2023

Dataset shared - September 16, 2023

Baseline shared - October 6, 2023

Challenge submission opens - December 1, 2023

Challenge submission closes - December 8, 2023 (11:59PM AOE) [originally December 4, 2023]

Results announced - December 20, 2023 [Will be communicated by mail before Dec 31]

Paper submission deadline - January 9, 2024 [originally January 2, 2024]
