Marco Turchi: Quality estimation in support of automatic post-editing

Abstract:

The findings of the recent editions of the automatic post-editing (APE) shared tasks at WMT, together with recent publications on APE, have shown that APE can be a valuable way to improve the performance of a black-box machine translation (MT) system. However, the continuous advances in translation quality brought by more powerful neural models pose new challenges for APE systems. In particular, high-quality translations require minimal and appropriate edits and, more generally, an APE system capable of avoiding unnecessary changes that damage the original MT output.

In this presentation, I will talk about FBK's research into different strategies for combining quality estimation and automatic post-editing to improve the output of machine translation systems. The joint contribution of the two technologies is analyzed in different settings, showing that APE can really benefit from the use of quality estimation, but that there is still work to do in both research areas to maximize this cooperation.

Bio:

Marco Turchi is the head of the machine translation unit at Fondazione Bruno Kessler (FBK). He received his Ph.D. degree in Computer Science from the University of Siena, Italy, in 2006. Before joining FBK in 2012, he worked at the European Commission Joint Research Centre, Italy, at the University of Bristol, at the Xerox Research Centre Europe, and at Yahoo Research Lab. His research activities focus on various aspects of sequence-to-sequence modelling applied to machine translation, speech translation and automatic post-editing. He has co-authored more than 100 scientific publications and has served as a reviewer for international journals, conferences, and workshops. He is a co-organizer of the Conference on Machine Translation and of the automatic post-editing evaluation campaigns. He has been involved in several EU projects such as SMART, Matecat, and QT21. He is the recipient of the Amazon AWS ML Research Awards on the topic of end-to-end spoken language translation in rich data conditions.