Dialogues Task

The Dialogues task addresses the text translation of dialogues, from Japanese to English.

The dialogues took place between an elderly person and a listener, for information navigation and attentive listening, and were transcribed manually.

The dialogue corpus was developed at Nara Institute of Science and Technology (NAIST) and distributed publicly for research purposes, and English translations of the dialogues are provided by the National Institute of Information and Communications Technology (NICT) for this IWSLT evaluation campaign.

In-domain datasets will be provided only for development and test.

One important challenge in this task is taking dialogue context into account: utterances in spoken dialogues often contain pronouns and empty categories that can only be understood from the surrounding context.

Submission Guidelines for the Dialogues Task

Each participant has to submit at least one run for each translation task they registered for.

Detokenized, case-sensitive automatic translations with punctuation have to be wrapped in NIST XML formatted files. The NIST XML format is described in this paper (Section 8 and Appendix A); XML templates will be made available by the evaluation period.

XML files with runs have to be submitted as a gzipped TAR archive (in the format specified below) and e-mailed to sudohATisDOTnaistDOTjp.

TAR archive file structure:

<UserID>/<Set>.<Task>.<UserID>.primary.xml

/<Set>.<Task>.<UserID>.contrastive1.xml

/<Set>.<Task>.<UserID>.contrastive2.xml

/...

where:

<UserID> = user ID (short name) of participant provided in the Registration Form

<Set> = IWSLT17.tst2017

<Task> = dialogues_ja-en

The PRIMARY run will be used for the official scoring; nevertheless, CONTRASTIVE runs will be evaluated as well.

Example:

naist/IWSLT17.tst2017.dialogues_ja-en.naist.primary.xml

/IWSLT17.tst2017.dialogues_ja-en.naist.contrastive1.xml
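
The archive layout above could be produced with a short script along the following lines. This is only a minimal sketch, not an official submission tool: the function names run_filename and build_archive are hypothetical, the XML content is assumed to have been prepared already from the NIST XML templates, and the check that a primary run is present is an assumption based on the primary run being the one used for official scoring.

```python
import io
import tarfile


def run_filename(user_id, run_label, set_name="IWSLT17.tst2017", task="dialogues_ja-en"):
    """Return the file name <Set>.<Task>.<UserID>.<run>.xml for one run."""
    return f"{set_name}.{task}.{user_id}.{run_label}.xml"


def build_archive(user_id, runs, archive_path):
    """Write a gzipped TAR archive holding one XML file per run under <UserID>/.

    `runs` maps a run label ("primary", "contrastive1", ...) to the bytes of
    an already prepared NIST XML file.
    """
    # Assumption: a primary run is expected, since it is used for official scoring.
    if "primary" not in runs:
        raise ValueError("a primary run is expected in every submission")
    with tarfile.open(archive_path, "w:gz") as tar:
        for label, xml_bytes in runs.items():
            # Each member is stored as <UserID>/<Set>.<Task>.<UserID>.<run>.xml
            info = tarfile.TarInfo(name=f"{user_id}/{run_filename(user_id, label)}")
            info.size = len(xml_bytes)
            tar.addfile(info, io.BytesIO(xml_bytes))
    return archive_path
```

For instance, build_archive("naist", {"primary": primary_xml, "contrastive1": contrastive_xml}, "naist.tar.gz") would create an archive with the two members shown in the example above, where primary_xml and contrastive_xml are placeholders for the bytes of the participant's own XML files.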

Re-submitting runs is allowed as long as the e-mails arrive BEFORE the submission deadline. If multiple TAR archives are submitted by the same participant, only the runs in the most recent submission e-mail will be used for the IWSLT 2017 evaluation, and previous e-mails will be ignored.