Translating the Zooniverse: support for non-English languages on the world's largest platform for crowdsourced research

Samantha Blickhan, Adler Planetarium

Delilah Clement, Adler Planetarium

L. Clifton Johnson, Adler Planetarium

James E. O'Donnell, University of Oxford

Overview

Zooniverse is the world’s largest platform for online, crowdsourced research. We provide a space for research teams to build and run projects that invite volunteers to help process data to aid in their research efforts. The general ethos of Zooniverse is that you don’t have to be a subject matter expert to take part—anyone can contribute to real research. Since the platform launched in 2009, over 2.6 million registered volunteers from 200+ countries have collectively produced hundreds of millions of classifications across 300+ projects. 

According to a 2021 participant survey (Jackson et al. 2022), most Zooniverse volunteers are based in either the United States (40%) or the United Kingdom (25%). The majority of survey respondents identified as residents of the Global North; among Global South countries, India had the highest representation (2%). The remaining 25% of survey respondents were located in over 200 countries.

The majority representation of volunteers from the US and UK is in line with the fact that most Zooniverse research teams are based in the US and UK, as is the Zooniverse team (our core development and leadership teams are based at the Adler Planetarium in Chicago, Oxford University, and the University of Minnesota-Twin Cities). The location demographics in the 2021 survey results are consistent with the last major volunteer survey, carried out in 2014 (Simpson et al. 2014).

Teams create Zooniverse projects via the Project Builder, a browser-based tool that lets anyone create a crowdsourcing project, free of charge. The ability to translate a project is currently available via the Project Builder user interface (UI). The majority of projects using the translations feature add only one non-English translation, though some project teams choose to translate their project into many languages in order to provide support for multilingual volunteer communities, and/or reflect multilingual datasets. The size of the volunteer community, as well as available staff time and resources, can also influence this choice.

Lack of translation in digital spaces is often a reflection of team demographics, as well as what Nilsson-Fernàndez and Dombrowski (2022) refer to as the "monolingual-Anglophone obliviousness with regard to language." The motivation to translate a crowdsourcing project can be a reflection of the dataset and/or the geographic location of the project team. Translation can be led by the research team leading the project, or it can be a community-led initiative by current project volunteers or potential volunteers who are unable to participate due to language barriers. 

Horvath (2021) suggests that "raising awareness of the peculiarities and challenges that those dealing with non-English texts and non-Latin scripts in a digital context regularly face is key to the development of this area." Our motivation for sharing this work is not only an effort to raise awareness of the availability of this feature, but also to solicit feedback from the broader Digital Humanities (DH) community and hold ourselves accountable to increasing support for translations in the coming years.

History

Project translations on Zooniverse have been available since 2013, when the Zooniverse team initially created support for versioned translations management in response to community-led translation initiatives, in which volunteers would reach out directly to research teams and offer to translate projects. The community effort was led by long-time Zooniverse volunteers including Zuzana Ročkaiová (Zooniverse username @yshish), and typically took place on private project message boards and via DM and email. 

Before the launch of the Project Builder in 2015, all Zooniverse projects were custom built, with ~7 new projects launching per year, on average. The Project Builder made crowdsourcing on Zooniverse a much more directly accessible option for teams who lacked the resources for custom development efforts. As a result, project launches dramatically increased, with dozens of projects launching per year. In 2023 alone, 23 projects have launched as of this poster's publication. 

Even when only a handful of projects launched per year, it was difficult for a team of our size to manage custom front-end translations without a user interface for research team members to input translations. The launch of the Project Builder meant that some research teams never had any contact with our development team (as opposed to all projects collaborating with us directly); because of this, we now needed to include a means to translate projects created via the Project Builder.

Translating Project Builder projects

Zooniverse translations have been available in the Project Builder since 2017. At present, teams who wish to use our Translations feature must email contact@zooniverse.org to request that this feature be enabled for their project. Once Translations are enabled for a project, teams can add a Translator via the 'Collaborators' tab of the Project Builder. The Translator can then use the Translations UI to input translations for English-language content. A detailed walkthrough of this process is available below.

Translation Workflow

The Zooniverse platform has two types of translations: 1) static translation dictionaries for platform-level text; and 2) project-specific translations. Translated projects must have an English version in order to go through the Project Review process and be promoted at zooniverse.org/projects. When a translated Zooniverse webpage is being viewed in a language other than English, the currently viewed language is stored as a subpath in the URL, which makes it easy for volunteers to share translated project pages. For example, a French URL for the Planet Hunters: TESS project looks like this: https://www.zooniverse.org/projects/fr/nora-dot-eisner/planet-hunters-tess.
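
As a rough illustration of the URL pattern described above, the following Python sketch builds a project URL with an optional language subpath and reads the language back out of a URL. The function names, and the idea of checking the first path segment against a set of known language codes, are assumptions made for this example; they are not a description of the Zooniverse routing code.

```python
from urllib.parse import urlsplit

BASE = "https://www.zooniverse.org/projects"


def project_url(owner, slug, lang=None):
    """Build a project URL, placing the language code right after /projects/."""
    parts = [BASE]
    # Assumption: English pages use the unprefixed URL, as in the examples on this poster.
    if lang and lang != "en":
        parts.append(lang)
    parts += [owner, slug]
    return "/".join(parts)


def language_of(url, known_languages):
    """Return the language subpath of a project URL, or 'en' if none is present."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) >= 2 and segments[0] == "projects" and segments[1] in known_languages:
        return segments[1]
    return "en"
```

Because the language lives in the path rather than in a cookie or account setting, a link copied from the address bar opens in the same language for whoever receives it.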

Static translations

Static translations refer to platform-level text; i.e. the text shared across all projects such as menu labels, headers, etc. This content is controlled by the Zooniverse team. 

To support platform-level translation dictionaries, we use a translations management service called Lokalise, which granted us a free account because we are an open-source platform with a team based at non-profit institutions. Lokalise does not manage live translations, but instead is used to make pull requests to GitHub for approval by the Zooniverse frontend dev team. 

Volunteer translators can be given a Lokalise account to assist with inputting new static translations. Detailed instructions are available via a wiki. After a translator has submitted new translations, the change is reviewed by a member of the Zooniverse frontend dev team, who can then share a preview of the Zooniverse website with the translator to review before it goes live. Static translations for 19 languages are currently in progress via Lokalise. 

The main challenge of providing platform-level translations is that the process is labor intensive. Translation only needs to happen once for the language to be added to the platform, but maintaining translations means that any time we want to make changes to the platform-level English text, we also have to update all of the static translations. 
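
To make the maintenance burden concrete, the sketch below compares an English dictionary with one translated dictionary and reports which entries need attention. The dictionary keys and strings are hypothetical, and this is not how Lokalise or the Zooniverse front end performs the check; it only illustrates the kind of bookkeeping involved.

```python
def missing_keys(english, translated):
    """Keys that exist in English but are absent or empty in the translation."""
    return sorted(key for key in english if not translated.get(key))


def orphaned_keys(english, translated):
    """Keys left in the translation after the English entry was removed."""
    return sorted(key for key in translated if key not in english)


# Hypothetical platform-level dictionaries.
english = {"nav.projects": "Projects", "nav.about": "About", "nav.talk": "Talk"}
french = {"nav.projects": "Projets", "nav.about": "", "nav.old": "Ancien"}

todo = missing_keys(english, french)      # entries still to translate
stale = orphaned_keys(english, french)    # entries that can be dropped
```

A check like this only catches added or removed strings. Detecting English text that was reworded requires tracking the source text itself, and the work is repeated for every supported language.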

Project translations

Project translations refer to the content created and input by external research teams, e.g. Project Builder users. This content is controlled by the project team. Teams must first input English text into their project via the Project Builder. Then they will need to add the non-English language(s) into which they want their project translated. 

Translators can access the translations user interface via the Translations tab of the Project Builder. Once there, they can view a list of translations, preview the translated version of the project, edit translations, and publish translations specific to their Zooniverse project.

Translators can also access the translations UI by visiting translations.zooniverse.org and logging in with their Zooniverse credentials. They will first need to choose from a list of translated projects for which their account is a project owner or collaborator. 

Once they've selected a project, they can select a language from a list of languages that have been added to the project. They will see the English version of a section of text on the screen, and can simultaneously input translations for that section. If the English text is updated after the translation has already been entered, that section will become highlighted in red, to reflect that the translation is out of date. 
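
One way to detect an out-of-date translation is to record a fingerprint of the English text at the moment it was translated, then compare it with the current English text. The sketch below uses a hash for this purpose; it is an assumed technique chosen for illustration, with hypothetical names throughout, and the Zooniverse implementation may instead compare stored versions.

```python
import hashlib
from dataclasses import dataclass


def fingerprint(text):
    """Stable fingerprint of a source string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Translation:
    text: str
    source_fingerprint: str  # fingerprint of the English text that was translated


def translate(english_text, translated_text):
    """Store a translation together with a fingerprint of its English source."""
    return Translation(translated_text, fingerprint(english_text))


def is_out_of_date(current_english, translation):
    """True when the English text has changed since the translation was entered."""
    return fingerprint(current_english) != translation.source_fingerprint
```

A user interface could then highlight every section for which `is_out_of_date` returns true.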

The screenshot on the left shows the translation interface for one of the workflow tutorials in the project Scribes of the Cairo Geniza, including one out-of-date section.


Translated Projects in Numbers

Approximately 25% of Zooniverse projects have been translated into other languages. 

The chart on the left displays a list of the most common non-English language translations added to projects. The five most common are French (50), Spanish (30), German (15), Portuguese (14), and Dutch (11). 

The large number of projects available in French is largely thanks to one particularly keen Zooniverse volunteer, @veragon, who reached out to project teams to ask if they would like their project translated into French. As of November 2020, @veragon had collaborated with 20 project teams to provide French translations.

Challenges

Discoverability

Zooniverse does not currently offer the ability to search or filter projects by language. We are aware that this is a major barrier to access, and it is high on our list of features to implement once we have the resources to do so. In August 2020, volunteer @Melina_t began compiling a list of translated Zooniverse projects in a Talk comment thread. The list was limited to active projects only (i.e. projects currently accepting classifications, not projects that were paused or completed). The last update for this list was in August 2021.

Working to foreground the availability of these tools in support of multilingual participatory research is key to their continued use.

Arabic version of Scribes of the Cairo Geniza (2017) https://www.zooniverse.org/projects/ar/judaicadh/scribes-of-the-cairo-geniza/classify?workflow=4712

English version of Scribes of the Cairo Geniza (2017) https://www.zooniverse.org/projects/judaicadh/scribes-of-the-cairo-geniza/classify?workflow=4712

RTL languages

Because Zooniverse projects are designed in English, translating them into languages that are read right to left requires more effort, as the whole project page must be flipped horizontally. While some Zooniverse pages are able to do this, most are not yet responsive. We've had to think carefully about other design choices in order to provide RTL language support for past projects (Blickhan et al. 2021). We use W3C CSS standards for typography and layout, including RTL languages, but there is still work to be done in order to create RTL language support for the entire platform.
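
As a minimal sketch of the first step in RTL support, the code below decides the text direction for a language code and produces the HTML `lang` and `dir` attributes for a page's root element. The language set is illustrative and not exhaustive, and the function names are hypothetical; this is not the Zooniverse front-end code.

```python
# Illustrative, non-exhaustive set of primary language codes written right to left.
RTL_LANGUAGES = {"ar", "fa", "he", "ur"}


def text_direction(lang):
    """Return 'rtl' or 'ltr' for a language code such as 'ar' or 'he-IL'."""
    primary = lang.lower().replace("_", "-").split("-")[0]
    return "rtl" if primary in RTL_LANGUAGES else "ltr"


def html_attributes(lang):
    """Attributes to place on the root element so the browser lays out text correctly."""
    return f'lang="{lang}" dir="{text_direction(lang)}"'
```

Setting the direction is only the starting point. Any layout rule that hard-codes left or right still has to be revisited, which is why full RTL support takes more than a single switch.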

Acknowledging translators

The Zooniverse translations infrastructure was born out of volunteer efforts to translate projects, and so the Translations UI was created with this method in mind, i.e. platform volunteers serving as project translators. Volunteers can be added to project teams as Translators via the Project Builder interface, using their Zooniverse username. Any user added as a Translator is automatically added to the project's Team page (.../project-name/about/team), but any acknowledgement of effort beyond that is up to individual project teams. Even in the Zooniverse, we can do a better job of this, by naming volunteers even in technical documentation (via GitHub docs, etc.) or gray literature. Accreditation for informal and volunteer labor is becoming more standard with time, but we are still grappling with invisible labor, especially in the Digital Humanities (Graban et al. 2019), where informal labor often results in informal acknowledgement, blanket statements thanking large groups of people, or no acknowledgement at all.

Multilingual community support

Increased support for translations can lead to an increase in multilingual project community spaces, such as Zooniverse project message boards. As with all online communities, these multilingual spaces require moderation. Research teams who want to translate their projects into non-English languages should therefore also identify speakers of these languages to act as project moderators, to provide support and enforce community standards.

Automated translation

Models like OpenAI's ChatGPT are increasingly popular and are already being discussed in terms of their potential use for crowdsourcing tasks like transcription and translation. As with any automated process, concerns exist around quality of results, as well as the potential for algorithmic bias (Noble 2018). Early studies like Hendy et al. (2023) suggest that GPT translation results are higher quality for "high-resource" languages, while the models "still struggle with underrepresented languages, which makes it a critical research question to explore how to improve the translation quality for these languages." This presents an additional challenge, in its potential to perpetuate existing language biases based on availability of training data.

Supplementary video