Understanding Context Usage in Machine Translation

Can machine translation models use context in a human-plausible way when translating multi-sentence texts? Interpret the inner workings of these systems to find out.


Abstract


Establishing whether language models use context information in a reasonable way during generation is fundamental to ensuring their safe adoption in real-world settings. Recent work has shown that inspecting the internals of machine translation (MT) models can help trace a connection between specific parts of the input context and model predictions. In this project, you will extend previous analyses to identify (un)reasonable cases of context usage in MT models across various languages and see how these cases relate to translation mistakes identified by human annotators.

Description


Additional context is often necessary to resolve ambiguities in translation, yet MT models are typically trained to translate one sentence at a time. For example, when translating from English into French, the word "your" can be rendered differently depending on factors such as the age of the interlocutor:


The child said to the grandfather: thanks for your help => L'enfant a dit au grand-père: merci pour votre aide [Formal: often used for older people]

The grandfather said to the child: thanks for your help => Le grand-père a dit à l'enfant: merci pour ton aide [Informal: often used for younger people]
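
To see the problem concretely, the snippet below translates the same sentence with and without its preceding context. It is a minimal sketch using the Hugging Face transformers pipeline; the OPUS-MT checkpoint is an illustrative choice, and whether the pronoun actually changes depends on the model:

```python
from transformers import pipeline

# Illustrative sentence-level English-to-French model; any MT model from
# the Hugging Face Hub could be substituted here.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")

current = "thanks for your help"
contexts = [
    "The child said to the grandfather:",
    "The grandfather said to the child:",
]

# A sentence-level system only ever sees `current`, so "your" cannot be
# disambiguated; prepending the context at least gives the model a chance
# (whether it actually picks "votre" vs. "ton" depends on the model).
print(translator(current)[0]["translation_text"])
for context in contexts:
    print(translator(f"{context} {current}")[0]["translation_text"])
```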


But how can we know whether, in the examples above, the translation model is actually using the context words 'child' and 'grandfather' during translation? The PECoRe method (Sarti et al., 2024) is based on the intuition that we can examine what happens "under the hood" of the translation model to detect when the context plays an influential role in generation. In short, the method is composed of two steps:

1. Context-sensitive Token Identification (CTI): the model's predictions for the generated translation are contrasted with and without the input context, and generated tokens whose predictive distribution shifts noticeably (e.g., in terms of KL divergence) are flagged as context-sensitive.

2. Contextual Cues Imputation (CCI): each context-sensitive token is traced back to the specific context tokens that influenced its prediction, using contrastive feature attribution.
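
As a minimal illustration of the first step, the sketch below scores a fixed translation with and without the preceding context and uses KL divergence between the two predictive distributions as a context-sensitivity signal. The model, example sentences, and metric choice are illustrative assumptions rather than the paper's exact setup:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Illustrative sentence-level model; PECoRe is normally applied to
# context-aware MT systems.
model_name = "Helsinki-NLP/opus-mt-en-fr"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).eval()

def target_distributions(source: str, target: str) -> torch.Tensor:
    """Next-token distributions at each position of a forced target string."""
    enc = tokenizer(source, return_tensors="pt")
    labels = tokenizer(text_target=target, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(**enc, labels=labels).logits
    return logits.softmax(-1).squeeze(0)

context = "The child said to the grandfather:"
current = "thanks for your help"
translation = "merci pour votre aide"  # contextual translation to score

# Step 1 (CTI): contrast predictions with and without the context and flag
# target tokens whose distribution shifts the most (here via KL divergence).
p_ctx = target_distributions(f"{context} {current}", translation)
p_noctx = target_distributions(current, translation)
kl = (p_ctx * (p_ctx.clamp_min(1e-9).log() - p_noctx.clamp_min(1e-9).log())).sum(-1)

tokens = tokenizer.convert_ids_to_tokens(tokenizer(text_target=translation).input_ids)
for token, score in zip(tokens, kl.tolist()):
    print(f"{token:>12}  KL={score:.3f}")
```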
The paper above explored a narrow set of applications for PECoRe, focusing on contextual phenomena such as gender agreement and lexical choice in English-to-French translation. In this project, your core task will be to extend this analysis to other translation directions and linguistic phenomena using multilingual MT systems. More specifically, you will translate a small set of sentences augmented with preceding context in a language of your choice and use PECoRe to investigate whether the model's use of context matches your intuition.
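
For the attribution side of such an analysis, the Inseq toolkit (Sarti et al., 2023) provides contrastive attribution utilities in the style of Yin and Neubig (2022) that can highlight which input tokens drive the choice between two alternative translations. The sketch below contrasts the formal "votre" against the informal "ton"; the model and the "saliency" attribution method are illustrative choices:

```python
import inseq

# Illustrative model and gradient-based attribution method.
model = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "saliency")

# Contrast the formal "votre" against the informal "ton": attribution scores
# then reflect which input tokens (ideally "grandfather") drive the formal
# choice. If the two targets tokenize to different lengths, an explicit
# alignment between them may be required.
out = model.attribute(
    "The child said to the grandfather: thanks for your help",
    "L'enfant a dit au grand-père: merci pour votre aide",
    attributed_fn="contrast_prob_diff",
    contrast_targets="L'enfant a dit au grand-père: merci pour ton aide",
)
out.show()  # renders a token-level attribution heatmap
```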


Ideas for research directions:


Materials

References


Sarti, Gabriele et al. (2024) Quantifying the Plausibility of Context Reliance in Neural Machine Translation. ICLR 2024.

Yin, Kayo and Neubig, Graham (2022) Interpreting Language Models with Contrastive Explanations. EMNLP 2022.

Fernandes, Patrick et al. (2023) When Does Translation Require Context? A Data-driven, Multilingual Exploration. ACL 2023.

Fernandes, Patrick et al. (2021) Measuring and Increasing Context Usage in Context-Aware Machine Translation. ACL 2021.

Sarti, Gabriele et al. (2023) Inseq: An Interpretability Toolkit for Sequence Generation Models. ACL 2023.

Sarti, Gabriele et al. (2022) DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages. EMNLP 2022.

NLLB Team (2022) No Language Left Behind: Scaling Human-Centered Machine Translation. arXiv preprint.

Mohammed, Wafaa and Niculae, Vlad (2024) On Measuring Context Utilization in Document-Level MT Systems. arXiv preprint.