Dimitar Shterionov: Neural Quality Estimation as a Bridge for Human-Computer Translation Symbiosis

Abstract:

Quality estimation (QE) of machine translation (MT), the task of predicting the quality of an MT output without human references, is particularly suitable for dynamic translation workflows, where translations need to be assessed continuously and no reference is available. In typical use cases, QE can indicate whether to accept a translation as is or to post-edit it. In such workflows, what matters is not only the accuracy of QE but also its efficiency and the confidence in its predictions. This talk will cover different QE approaches and assess their applicability in an industry-established workflow. It will present their differences, benefits and drawbacks, as well as the implementation and/or integration effort they require.

QE can be seen as a similarity measure between source and translated texts. Neural networks have been used effectively to convert source and translation into multi-dimensional representations that facilitate such measurement. Recent advances in pretrained word-embedding models have aided various NLP tasks, including QE. We will discuss models such as word2vec, BERT, ELMo and XLNet, and how they can be used to boost the performance of QE.
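The similarity view of QE described above can be illustrated with a minimal sketch. It assumes that source and translation tokens have already been mapped into a shared vector space; the three-dimensional vectors below are invented for illustration and are not the output of word2vec, BERT, ELMo or XLNet. The helper names `mean_pool` and `cosine_similarity` are hypothetical, and cosine similarity over mean-pooled vectors is only one simple way to compare the two representations, not necessarily the approach covered in the talk.

```python
import math


def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one sentence vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy token embeddings (assumption: made-up values in a shared
# cross-lingual space, standing in for a real pretrained model).
source_tokens = [[0.9, 0.1, 0.0], [0.7, 0.3, 0.1]]
good_mt_tokens = [[0.8, 0.2, 0.1], [0.6, 0.3, 0.1]]
bad_mt_tokens = [[0.0, 0.1, 0.9], [0.1, 0.0, 0.8]]

source_vector = mean_pool(source_tokens)
good_score = cosine_similarity(source_vector, mean_pool(good_mt_tokens))
bad_score = cosine_similarity(source_vector, mean_pool(bad_mt_tokens))

print(f"close translation:   {good_score:.3f}")
print(f"distant translation: {bad_score:.3f}")
```

In a workflow like the one in the abstract, such a score could be compared against a threshold to decide between accepting and post-editing a translation; the threshold itself would have to be tuned on data.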

Bio:

Dimitar Shterionov is a computer scientist with expertise in the design and development of languages and systems for the Artificial Intelligence domain, with a focus on Machine Translation (MT) and Natural Language Processing (NLP). Dimitar obtained a PhD in computer science engineering from KU Leuven in 2015 on the topic of Probabilistic Logic and Learning. He then moved to industry, joining the KantanMT team to work on their cloud-based customisable statistical machine translation system. In 2016, he was appointed head of research at KantanMT. His team developed the first cloud-based, customisable and publicly available neural machine translation solution, released in early 2017.

In November 2017, Dimitar joined the ADAPT team of Prof Andy Way as a post-doctoral researcher to work on industry-oriented MT and NLP projects.

In his academic career, Dimitar has published over 25 papers in national and international conference proceedings and journals, and has completed 3 large-scale and 5 smaller projects.

In the last 2 years, Dimitar’s research at ADAPT at Dublin City University has focused on automatic post-editing, quality estimation, MT for low-resource languages, lexical richness of MT and cross-lingual information retrieval. He also actively participates in cross-field collaborations (e.g. in the medical, arts and legal domains).