DeBiasByUs

OPEN DISCUSSION

Before and during the workshop, we will be using the DeBiasByUs platform to gather real-world examples of gender bias in MT output. These examples will serve as a starting point for the discussion at the end of the workshop, with the goal of generating new insights and perspectives that can inform future work in this area.

THE DeBiasByUs PLATFORM

DeBiasByUs is a community-driven platform that serves the dual purpose of raising public awareness of gender bias in MT and of creating a database of real-world examples of gender bias in commercial MT systems. After approval, user submissions are added to a public database that is freely available to download. Gathering and examining these examples can lead to a better understanding of the various ways in which gender bias can manifest itself in translation and eventually help us explore ways to mitigate these biases.

The 'learn' section presents more information on gender bias in MT, summarizing new research, approaches and findings in the field.

PRACTICAL INSTRUCTIONS

On the 'share' page, users can submit a case of gender bias in MT that they have found online by typing (or pasting) the source sentence and the biased target sentence and selecting the source and target languages. Optionally, users can type their preferred unbiased translation, leave a comment and choose the type of error that occurred from a list provided (e.g., stereotyping, incorrect pronoun). Users can also select the commercial MT system they were using (e.g., Google Translate).
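For readers who want to prepare examples before submitting them, the fields of the 'share' form can be sketched as a simple record. This is only an illustrative sketch: the class name, field names and the `missing_required` helper are hypothetical and do not reflect the platform's actual data model or the format of its downloadable database.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Submission:
    """Hypothetical record mirroring the fields of the 'share' form."""

    # Required fields
    source_sentence: str
    biased_translation: str
    source_language: str
    target_language: str
    # Optional fields
    preferred_translation: Optional[str] = None
    comment: Optional[str] = None
    error_type: Optional[str] = None  # e.g. "stereotyping", "incorrect pronoun"
    mt_system: Optional[str] = None   # e.g. "Google Translate"

    def missing_required(self) -> List[str]:
        """Return the names of required fields that are empty or whitespace."""
        required = {
            "source_sentence": self.source_sentence,
            "biased_translation": self.biased_translation,
            "source_language": self.source_language,
            "target_language": self.target_language,
        }
        return [name for name, value in required.items() if not value.strip()]


# One of the examples listed on this page, with only the required fields filled in.
example = Submission(
    source_sentence="The famous lawyer was happy.",
    biased_translation="Le célèbre avocat était heureux.",
    source_language="English",
    target_language="French",
)
print(example.missing_required())
```

Keeping the optional fields as `None` by default reflects that only the two sentences and the two languages are needed for a submission.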

WORKSHOP CONTRIBUTION

The concept of DeBiasByUs aligns with GITT's aim of encouraging collaborative research into solutions that address gender bias and promote gender inclusivity in MT. The DeBiasByUs tool will provide a foundation for further discussion during GITT.

We ask all workshop attendees to add examples to the DeBiasByUs platform prior to the conference, following the practical instructions outlined above. For a more detailed overview, the platform will be introduced during the opening remarks of the workshop. Attendees will then be encouraged to add further examples during the day whenever the keynotes and presentations inspire them to do so. Data collected by DeBiasByUs, including examples submitted before or during the workshop and any findings drawn from them, will serve as a starting point for the open discussion at the end of the workshop.

Building on this collaborative discussion, and on the presentations held during the day, we aim to identify new and interesting directions for further research on gender inclusivity in MT.

EXAMPLES COLLECTED

Original English source sentence: The famous lawyer was happy.

Biased French translation: Le célèbre avocat était heureux.

User comment: The English 'lawyer' is ambiguous, but the adjective 'famous' is probably more frequently found in relation to men in the training data, so MT translated the lawyer as masculine in French.

Original English source sentence: I am a Professor of Translation Studies and the Director of the Centre for Translation Studies.

Biased Spanish translation: Soy profesor de Estudios de Traducción y Director del Centro de Estudios de Traducción.

User comment: Both professor and director are seen as masculine (even though the sentence was a part of a bionote of a female professor).

Original English source sentence: The pretty model was nervous about his next shoot.

Biased Spanish translation: La guapa modelo estaba nerviosa por su próximo rodaje.

User comment: While the word ‘his’ is explicit in the source and there is only one possible referent, the profession ‘model’ is stereotypically feminine. Combined with the word ‘pretty’ (which is often assigned more to women than to men in training data), it leads to a biased translation of the model as a woman.

The image below represents an example being added to the DeBiasByUs platform. The first sentence is a gender-inclusive French source sentence, the second sentence is the Dutch biased target translation (output from a commercial MT system), and the third sentence is the user's unbiased suggestion.
