Challenge Test Sets for MT Evaluation

Monday 19th August 2019

With the rise of neural machine translation, more and more challenge test sets have been developed to better understand the particular strengths and weaknesses of MT systems.

The traditional "natural" test sets have an uneven distribution of specific linguistic phenomena and are therefore not suitable for gaining insight into particular phenomena. Using specialised test sets for evaluation in recent years has enabled a better understanding of certain aspects, and since 2018 this has been part of the translation shared task at the WMT conference ("Additional Test Suites in News Translation Task"). Nevertheless, a number of phenomena and language pairs remain untested, and creating a new test set for a desired application is far from trivial and still faces a number of practical challenges.

The goal of this tutorial is to provide an in-depth overview of the development of such test sets, including practical aspects and challenges. The first part will cover the motivation for and a brief history of challenge test sets. The second part will give an overview of the distinct test sets developed in recent years and the phenomena they cover. The third part will deal with practical challenges that are likely to arise when developing a new challenge test set.

Outline:

What are challenge test sets?
Overview of various challenge test sets and the phenomena they cover
Practical aspects and challenges

Here is the description of our tutorial (citation available soon in the ACL Anthology).

Challenge_Test_Sets.pdf

Here are our slides.

Tutorial_Challenge_Test_Sets_MTSummit2019.pdf

Here is a list of all the available challenge test sets (CTS) we have been able to find so far (with links). If you know of any others not listed here, let us know and we will add them!

mt-list.pdf

The Presenters

Dr. Maja Popović

Maja Popović is a post-doctoral researcher at the ADAPT Centre at Dublin City University. She graduated from the Faculty of Electrical Engineering, University of Belgrade, Serbia, and continued her studies at RWTH Aachen University, Germany, where she obtained her PhD with the thesis "Machine Translation: Statistical Approach with Additional Linguistic Knowledge". After that, she continued her research at the German Research Center for Artificial Intelligence (DFKI), at the Humboldt University of Berlin, and currently at the ADAPT Centre at Dublin City University. Her research interests include machine translation, automatic and human evaluation in NLP, as well as combining linguistic knowledge and data-driven methods. She has published over 60 papers in various conferences, workshops and journals, including a book chapter about machine translation evaluation. She has served as a regular programme committee member of ACL/EACL/NAACL, EMNLP, COLING, LREC and other conferences and workshops for more than 10 years. She has been a co-organiser of the Workshop on The Qualities of Literary Machine Translation at MT Summit 2019 and the Workshop on Quality Assessment for Text Simplification at LREC 2016, tutorial chair at ACL 2017 and area chair at EAMT and COLING 2018. She is a guest co-editor of the special issue of Machine Translation Journal on Human Factors in Neural Machine Translation published in 2019 by Springer. She has given several invited talks and lectures about evaluation for machine translation and other NLP tasks.

Dr. Sheila Castilho

Sheila Castilho graduated in Linguistics and Education from the UNIOESTE University in Brazil. She holds a joint Master's degree in Natural Language Processing from the University of Wolverhampton, UK, and the University of Algarve, Portugal. She completed her PhD dissertation at Dublin City University in 2016. Currently, she is a post-doctoral researcher at the ADAPT Centre. She is a programme committee member of a number of translation, machine translation and NLP conferences and has acted as a reviewer for high-profile journals. She has authored several journal articles and book chapters on translation technology, post-editing of machine translation, user evaluation of machine translation, and translators’ perception of machine translation. She is a co-editor of the book 'Translation Quality Assessment: From Principles to Practice', published in 2018 by Springer, and a guest co-editor of the special issue of Machine Translation Journal on Human Factors in Neural Machine Translation published in 2019 by Springer. Her research interests include machine translation, post-editing, machine and human translation evaluation, usability, and translation technologies.
