Domain Adaptation for Social Localisation-based SMT: A Case Study Using Trommons Platform

Jinhua Du, Andy Way, Asanka Wasala and Reinhard Schaler

Abstract:

Social localisation is a form of community action that matches communities with the content they need and supports their localisation efforts. The goal of social localisation-based statistical machine translation (SL-SMT) is to support and connect global communities that exchange any type of digital content across languages and cultures. Trommons is an open platform maintained by The Rosetta Foundation that connects non-profit translation projects and organisations with the skills and interests of volunteer translators, who can translate, post-edit or proofread different types of documents there. Using Trommons as the experimental platform, this paper focuses on domain adaptation techniques that augment SL-SMT to assist translators and post-editors; specifically, the Cross Entropy Difference (CED) algorithm is used to adapt Europarl data to the social localisation data. Experimental results on the English–Spanish language pair show that domain adaptation significantly improves translation performance, by 6.82 absolute BLEU points and 5.99 absolute TER points over the baseline.
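
To make the data selection idea concrete, the following is a minimal sketch of cross-entropy difference scoring, not the implementation used in this paper. It assumes the commonly used formulation in which each general-domain sentence s is scored as H_in(s) - H_gen(s), the difference between its per-word cross-entropy under an in-domain language model and under a general-domain language model, with lower scores indicating sentences closer to the in-domain data. The add-one smoothed unigram models, the function names (train_unigram, cross_entropy, ced_rank) and the toy sentences are all illustrative assumptions; a real setup would use higher-order n-gram models trained on the social localisation data and on Europarl.

```python
import math
from collections import Counter


def train_unigram(sentences, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary.

    A toy stand-in (assumption) for a real n-gram language model.
    """
    counts = Counter(tok for s in sentences for tok in s.split())
    total = sum(counts[w] for w in vocab) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def cross_entropy(sentence, model):
    """Per-word cross-entropy (in bits) of a sentence under a unigram model."""
    tokens = sentence.split()
    if not tokens:
        raise ValueError("cannot score an empty sentence")
    log_prob = sum(math.log2(model.get(t, model["<unk>"])) for t in tokens)
    return -log_prob / len(tokens)


def ced_rank(in_domain, general):
    """Rank general-domain sentences by H_in(s) - H_gen(s), lowest first."""
    # Shared vocabulary so that both models assign comparable probabilities.
    vocab = {t for s in in_domain + general for t in s.split()} | {"<unk>"}
    lm_in = train_unigram(in_domain, vocab)
    lm_gen = train_unigram(general, vocab)
    scored = [
        (cross_entropy(s, lm_in) - cross_entropy(s, lm_gen), s)
        for s in general
        if s.split()  # skip empty lines
    ]
    return sorted(scored)


# Hypothetical toy data: a small in-domain sample and a general-domain pool.
in_domain = [
    "please translate this document",
    "volunteer translators translate the document",
]
general = [
    "the parliament adopted the resolution",
    "please translate the resolution",
    "the commission adopted the report",
]

ranked = ced_rank(in_domain, general)
for score, sentence in ranked:
    print(f"{score:+.3f}  {sentence}")
```

In this toy run, the general-domain sentence that shares vocabulary with the in-domain sample ranks first. In practice, the top-ranked portion of the general-domain corpus (selected by a score threshold or a fixed size tuned on held-out data) would be added to the training data for the adapted system.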