Speech Translation Task

The Speech Translation Task addresses the problem of translating audio input into a different language. While traditional approaches combine automatic speech recognition and machine translation, there has recently been increasing interest in end-to-end models. In this evaluation, we provide an environment to develop and compare both techniques. This year we will concentrate on the translation of TED talks and university lectures from English to German. There are several changes in this year's task:

  • End-to-End Evaluation: Participants will only be provided with the audio in English and should submit the translation in German. For participants who want to concentrate on one component, we provide a baseline model they can use. Participants who want to receive individual scores for the ASR component can optionally also submit the ASR transcript in CTM format.

  • End-to-End Models: We will mark submissions that use an end-to-end architecture in order to compare their performance against traditional pipeline models.

  • Data: For all components of the model, participants may only use data listed on this page or on the following page.

  • Baseline Model: We provide a baseline model for speech translation using a pipeline architecture as a Docker container.

  • Long Evaluation Period: To allow participants flexible planning of their experiments, the test data will be available from the beginning of June and has to be submitted by September, ??.

Language directions:

  • English->German

ASR Task

Input Language

    • German

    • English

Development Data

English:

    • Preliminary development data is the 2015 TED test set: tst2015.en.tgz

    • More realistic lecture data will be made available soon

German:

Evaluation Data

Two evaluation sets will be made available: the test2017 set and the test2016 set, which serves as a progressive test set. Submitting results on both sets is mandatory!

    • Will be made available in time

Submission Guidelines

ASR Run Submission Format:

  • Each participant has to submit at least one run for each of the tasks s/he registered for.

  • Multiple run submissions are allowed, but participants must explicitly indicate one PRIMARY run for each track. All other run submissions are treated as CONTRASTIVE runs. If none of the runs is marked as PRIMARY, the latest submission (according to the file time-stamp) for the respective track will be used as the PRIMARY run.

  • Runs have to be submitted as a gzipped TAR archive (see the format below) and sent as an email attachment to cettolo@fbk.eu, jan.niehues@kit.edu and sebastian.stueker@kit.edu.

  • Submissions have to be made in CTM format; see the CTM description in the NIST SCTK documentation for details. The confidence values are optional. The channel number has to be '1'. Scoring will be case-insensitive. Submissions have to be in UTF-8.
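The expected shape of a CTM submission can be sketched as follows. The talk id, timings, and tokens below are invented examples, and the helper function is only illustrative; the authoritative field definition is in the NIST SCTK documentation.

```python
# Minimal sketch of writing an ASR hypothesis in CTM format.
# Fields per line: file id, channel, begin time, duration, word
# (an optional confidence value may follow).

def to_ctm_lines(file_id, words):
    """words: list of (begin_seconds, duration_seconds, token) tuples."""
    lines = []
    for begin, duration, token in words:
        # The channel number has to be '1' per the guidelines above.
        lines.append(f"{file_id} 1 {begin:.2f} {duration:.2f} {token}")
    return lines

hyp = [(0.00, 0.35, "hello"), (0.35, 0.50, "world")]
for line in to_ctm_lines("talk0001", hyp):
    print(line)
```

Each line carries one hypothesized word with its time span, so the scorer can align words across systems.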

Output Conventions

  • The text will be scored case-insensitively, but may be submitted case-sensitive

  • Numbers, dates, etc. need to be transcribed in words as they are spoken, not in digits

  • Common acronyms, such as NATO or EU, are written as one word, without any special markers between the letters. This applies regardless of whether they are spoken as one word or spelled out as a letter sequence

  • All other spelled-out letter sequences are written as individual letters separated by spaces

  • Standard abbreviations, such as "etc." and "Mr.", are accepted as specified by the GLM file in the scoring package

  • For words pronounced in their contracted form, the orthography of the contracted form may be used. These cases will be normalized by the GLM file to their canonical form.
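As an illustration of the case-insensitive scoring convention above: the official evaluation uses the NIST tools together with the GLM normalization file, but the case-folding principle can be sketched with a plain word-error-rate computation. This sketch is not the official scorer.

```python
# Illustrative sketch only: word error rate on lowercased tokens,
# showing why casing in the submission does not affect the score.
# (The official scoring additionally applies GLM normalization.)

def wer(reference, hypothesis):
    """Word error rate via edit distance on lowercased tokens."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Standard dynamic-programming edit distance.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("Hello World", "hello world"))  # case differences do not count as errors
```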

TAR archive file structure:

<UserID>/<Set>.<Task>.<UserID>.primary.ctm

<UserID>/<Set>.<Task>.<UserID>.contrastive1.ctm

<UserID>/<Set>.<Task>.<UserID>.contrastive2.ctm

...

where:

<UserID> = user ID of participant used to download data files

<Set> = tst2017 | tst2015

<Task> = ASR_ENG | ASR_GER

Examples:

fbk/tst2017.ASR_ENG.fbk.primary.ctm

fbk/tst2017.ASR_ENG.fbk.contrastive1.ctm
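A possible way to package runs into the required gzipped TAR archive, using the standard-library tarfile module. The placeholder CTM files and their contents are invented so the example is self-contained.

```python
# Sketch of packaging ASR runs into the gzipped TAR archive described above.
import tarfile

user_id, data_set, task = "fbk", "tst2017", "ASR_ENG"
runs = {"primary": "primary_run.ctm", "contrastive1": "contrastive_run.ctm"}

# Write placeholder CTM files so this example runs on its own.
for path in runs.values():
    with open(path, "w", encoding="utf-8") as f:
        f.write("talk0001 1 0.00 0.35 hello\n")

archive = f"{user_id}.tgz"
with tarfile.open(archive, "w:gz") as tar:
    for run_type, path in runs.items():
        # e.g. fbk/tst2017.ASR_ENG.fbk.primary.ctm
        arcname = f"{user_id}/{data_set}.{task}.{user_id}.{run_type}.ctm"
        tar.add(path, arcname=arcname)

with tarfile.open(archive, "r:gz") as tar:
    print(sorted(tar.getnames()))
```

The archive places every run under a single directory named after the participant's user ID, matching the file structure above.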

SLT Task

Language directions:

  • English->German

  • German->English

Development Data (for reference translation see the MT track):

Training Data:

  • All training data that is allowed for the Multi-lingual TED MT tasks is also allowed for this task. See here for a list of all allowed data

Evaluation Data:

Submission Guidelines

SLT Run Submission Format:

  • Multiple run submissions are allowed, but participants must explicitly indicate one PRIMARY run for each track. All other run submissions are treated as CONTRASTIVE runs. If none of the runs is marked as PRIMARY, the latest submission (according to the file time-stamp) for the respective track will be used as the PRIMARY run.

  • Runs have to be submitted as a gzipped TAR archive (see the format below) and sent as an email attachment to cettolo@fbk.eu, jan.niehues@kit.edu and sebastian.stueker@kit.edu.

  • Each run has to be stored in SGML format or as a plain text file with one sentence per line

  • Scoring will be case-sensitive and will include punctuation. Submissions have to be in UTF-8.

TAR archive file structure:

<UserID>/<Set>.<Task>.<UserID>.primary.xml

<UserID>/<Set>.<Task>.<UserID>.contrastive1.xml

<UserID>/<Set>.<Task>.<UserID>.contrastive2.xml

...

where:

<UserID> = user ID of participant used to download data files

<Set> = IWSLT16.TED.tst2016 | IWSLT16.TEDX.tst2016

<Task> = SLT_<fromLID>-<toLID>

<fromLID>, <toLID> = language identifiers (LIDs) as given by ISO 639-1 codes; see, for example, the WIT3 webpage.

Examples:

kit/IWSLT16.TED.tst2016.SLT_en-de.kit.primary.xml

kit/IWSLT16.TEDX.tst2016.SLT_de-en.kit.primary.xml

Re-submitting your runs is allowed as long as the mails arrive BEFORE the submission deadline. If multiple TAR archives are submitted by the same participant, only runs of the most recent submission mail will be used for the IWSLT 2017 evaluation and previous mails will be ignored.