Speech Translation

Task Description

The Speech Translation Task addresses the translation of English audio into German text. Traditionally, a pipeline approach using an automatic speech recognition system and a machine translation system was used. Recently, also end-to-end neural models were used to solve this tasks. In contrast to last, this years edition will have several changes:

End-to-End Evaluation: This year not the single models of the traditional pipeline will be evaluated, but only the end-to-end performance. Every participant has to generate German translation based on the English audio. Thereby, they might use a traditional baseline of different components as well as an end-to-end model. For participants, who want to focus on one component of the pipeline, we provide baseline components for the other parts. If participants wish, also the English transcript in CTM format will be evaluated.
Baseline model: We provide a baseline implementation of the traditional pipeline as a Docker container.
End-to-End Models: This evaluation should be used to compare end-to-end speech translation models and the translation pipeline approach. Therefore, end-to-end models will be evaluated as a special evaluation condition. Furthermore, we provide an aligned TED corpus of English audio and German text.
Flexible evaluation schedule: The test data will be available for several months, enabling a flexible evaluation schedule for each participant.

Evaluation Conditions

End-to-End Condition: Only submissions using a single model for the whole task are allowed.
Baseline Condition: Submission using separate components for ASR and MT are allowed.

Allowed Training Data

Provided data
- TED corpus
- Speech-Translation TED corpus
  - In addition, for this corpus we provided 40-dimension Filterbank features from the audio, extracted by XNMT (download)
Additional data
- TED LIUM corpus (except the talks in this list, including the selected monolingual data on the website)
- Data provided by WMT 2018
- OpenSubtitles2018 (Link updated to the 2018 version!!!)

Development and Evaluation Data

The development and evaluation data is provided as an archive with the follwing files ($set e.g. IWSLT.TED.dev2010):
- $set.en-de.en.xml: Reference transcript (will not be provided for evaluation data)
- $set.en-de.en.xml: Reference translation (will not be provided for evaluation data)
- CTM_LIST: Ordered file list containing the ASR Output CTM Files (will not be provided for evaluation data) (Generated by ASR systems that use more data)
- FILE_ORDER: Ordered file list containing the wav files
- $set.yaml: This file containts the time steps for sentence-like segments. It is generated by the LIUM Speaker Diarization tool.
- $set.h5: This file contains the 40-dimensional Filterbank features for each sentence-like segment of the test data created by XNMT.
- The last two files are created by the following command:
  - python -m xnmt.xnmt_run_experiments /opt/SLT.KIT/scripts/xnmt/config.las-pyramidal-preproc.yaml
Development data:
- (Please note that system generated the provided ASR scripts use more training data than allowed for this years evaluations)
- dev2010
- tst2010
- tst2013
- tst2014
- tst2015
Evaluation data:
- tst2018

Submission Guidelines

Multiple run submissions are allowed, but participants must explicitly indicate one PRIMARY run for each track. All other run submissions are treated as CONTRASTIVE runs. In the case that none of the runs is marked as PRIMARY, the latest submission (according to the file time-stamp) for the respective track will be used as the PRIMARY run.
Submissions have to be submitted as a gzipped TAR archive (format see below) and sent as an email attachment to jan.niehues@kit.edu and sebastian.stueker@kit.edu.
Each run has to be stored in SGML format or plain text file with one sentence per line
Scoring will be case-sensitive and including the punctuation. Submissions have to be in UTF-8.

TAR archive file structure:

< UserID >/< Set >.< Task >.< UserID >.primary.xml

/< Set >.< Task >.< UserID >.contrastive1.xml

/< Set >.< Task >.< UserID >.contrastive2.xml

/...

where:

< UserID > = user ID of participant used to download data files

< Set > = IWSLT18.SLT.tst2018

<fromLID>, <toLID> = Language identifiers (LIDs) as given by ISO 639-1 codes; see for example the WIT3 webpage

Examples:

kit/IWSLT18.SLT.tst2018.en-de.kit.primary.xml

/IWSLT18.TED.tst2018.en-de.kit.contrastive0.xml

Re-submitting your runs is allowed as long as the mails arrive BEFORE the submission deadline. If multiple TAR archives are submitted by the same participant, only runs of the most recent submission mail will be used for the IWSLT 2018 evaluation and previous mails will be ignored.

Schedule

Data available: June, 2018

Test data available: July, 2018

Translation submission deadline: August, 31st, 2018

Task coordinators:

Jan Niehues

Sebastian Stüker

Google Sites

Report abuse