University of Zurich
Contextual knowledge is of central importance for machine translation: it resolves ambiguities and ensures consistency in reference, formality, and similar phenomena. In this talk, I discuss the capability of neural machine translation (NMT) to take contextual information into account. I first present results on word sense disambiguation, showing both that linguistic knowledge can be integrated to address this problem and that models learning from raw text data continue to make progress. I then turn to the challenge of taking into account wider context beyond the current sentence. While even simple model architectures are promising here, the relative lack of document-level parallel data and the insensitivity of standard metrics to wider context hamper the development of systems that go beyond the current sentence. I discuss evaluation with contrastive test sets, which yields targeted results for contextual phenomena, and the use of automatic post-editing to increase translation consistency with only monolingual data.
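The contrastive evaluation mentioned above can be sketched as follows: the model scores a correct translation against a minimally perturbed ("contrastive") variant, and accuracy is the fraction of pairs where the correct one receives the higher score. This is a minimal illustration, not a system from the talk; `toy_score` is a hypothetical stand-in for a real NMT model's log-probability, and the example pair (an English-to-German pronoun swap) is invented for demonstration.

```python
def contrastive_accuracy(examples, score):
    """Fraction of examples where the model assigns a higher score
    to the correct translation than to the contrastive variant."""
    correct = sum(1 for src, good, bad in examples
                  if score(src, good) > score(src, bad))
    return correct / len(examples)

def toy_score(source, target):
    # Hypothetical stand-in for a model log-probability: rewards the
    # feminine pronoun "Sie" when the antecedent is "cat" ("Katze",
    # feminine in German). A real system would score with its decoder.
    return 0.0 if "cat" in source and "Sie" in target else -1.0

examples = [
    # (source with context, correct target, contrastive target)
    ("The cat slept. It was tired.",
     "Die Katze schlief. Sie war müde.",   # correct coreference
     "Die Katze schlief. Er war müde."),   # contrastive pronoun swap
]
print(contrastive_accuracy(examples, toy_score))  # → 1.0
```

Because each contrastive pair isolates a single contextual phenomenon (here, pronoun coreference across a sentence boundary), this kind of targeted accuracy reveals context sensitivity that corpus-level metrics such as BLEU tend to wash out.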