The 2017 IWSLT Evaluation focuses on three official tasks:
multilingual text translation of TED talks (including zero-shot translation)
text translation of dialogues
automatic speech recognition and speech translation of lectures
An unofficial task on the translation of TED talks from one language to another is also proposed.
Training, development and testing data for each task are freely available to the participants.
Potential participants in the evaluation are invited to check out our Call for Participation, fill in the Registration form, and join our e-mail list.
Permissible Training Data
MT Systems and Language Models for ASR:
Training of MT systems and language models for ASR is constrained to data supplied by the organizers or listed as permissible.
Participants may use any other linguistic resource, provided that it does not include or exploit these TED talks. Any use of data beyond that explicitly listed by the organizers must be clearly stated in the system paper and communicated at submission time; such runs will be marked as "Unconstrained Training".
ASR Acoustic Modeling:
No training data are distributed for ASR acoustic modeling. Participants are allowed to use any publicly available data except for these TED talks.