Monday 19th August 2019
With the rise of neural machine translation, a growing number of challenge test sets have been developed to better understand the particular strengths and weaknesses of MT systems.
Traditional "natural" test sets have an uneven distribution of specific (linguistic) phenomena and are therefore not suitable for gaining insight into particular phenomena. The use of specialized test sets for evaluation in recent years has enabled a better understanding of certain aspects, and since 2018 it has been part of the translation shared task at the WMT conference ("Additional Test Suites in News Translation Task"). Nevertheless, a number of phenomena and language pairs remain untested, and creating a new test set for a desired application is far from trivial and still involves a number of practical challenges.
The goal of this tutorial is to provide an in-depth overview of the development of such test sets, including practical aspects and challenges. The first part will cover the motivation for and a brief history of challenge test sets. The second part will give an overview of the distinct test sets developed in recent years and the phenomena they cover. The third part will deal with the practical challenges that are likely to arise when developing a new challenge test set.
Outline:
Here is the description of our tutorial (citation available soon in the ACL Anthology).
Here are our slides.
Here is a list of all the available challenge test sets (CTS) we have been able to find so far (with links). If you know of any others not listed here, let us know and we will add them!