Transcription Task

Research & development: custom tools

The concept for the Transcription Task was based on one of the research questions from the Transforming Libraries and Archives through Crowdsourcing project:

"Does the current Zooniverse methodology of multiple independent transcribers and aggregation render better results than allowing volunteers to see previous transcriptions by others or indeed collaborate to create a single transcription? How does each methodology impact the quality of data, as well as depth of analysis and participation?"

The method of independent transcription followed by aggregation of results is rooted in the platform's origins in STEM research projects, where siloed participation is necessary to avoid biasing results. While bias is a real concern in the humanities as well, this research question was largely grounded in the reality that transcription of historic texts is often subjective and heavily dependent on context, suggesting that collaboration could produce higher-quality results.

To answer this question, we designed an A/B experiment on the crowdsourcing project Anti-Slavery Manuscripts at the Boston Public Library. This allowed us to directly compare the data output from independent vs. collaborative methods. The results of the study were published in December 2019 as Blickhan et al., "Individual vs. Collaborative Methods of Crowdsourced Transcription," and ultimately found that the collaborative approach produced higher-quality data than the independent approach. Based on these findings, we decided to add the collaborative method to the Project Builder toolkit for widespread use.

Development of the custom tools began in May 2017 and was completed in November 2018. Maintenance for the custom tools ended on August 12, 2020, when the Anti-Slavery Manuscripts project was completed.

The Project Builder version of the tools was beta tested from March to April 2020. The first live project to use the Transcription Task launched in May 2021, and the second in June 2021. As of writing, an additional ten projects using the Transcription Task are in development.

Generalized version: a walkthrough

All image examples are from the Boston Public Library Anti-Slavery Collection.

The first volunteer to see a document annotates & transcribes the text.

Subsequent volunteers are able to engage with previous transcriptions. Annotations made by other volunteers are pink. Annotations which are currently being interacted with are green. Annotations which have been completed in the current session are blue.
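The display states above can be summarized in a small sketch. This is purely illustrative: the state names and string values are assumptions for this example, not the actual names used in the Zooniverse front-end code.

```python
from enum import Enum

class AnnotationState(Enum):
    """Hypothetical mapping of annotation states to their display colors."""
    OTHER_VOLUNTEER = "pink"    # made by other volunteers in earlier sessions
    ACTIVE = "green"            # currently being interacted with
    COMPLETED_SESSION = "blue"  # completed in the current session
```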

When 3 different volunteers have submitted the same string of text for an annotation (i.e. 'agreed' on a transcription), the line is considered complete and is greyed out. This indicates to volunteers that no more transcriptions are needed for that line, and that they should focus their efforts elsewhere.
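The line-completion rule above can be sketched as follows. This is a minimal illustration of the logic described, not the actual Zooniverse implementation; the function name and threshold handling are assumptions.

```python
from collections import Counter

RETIREMENT_THRESHOLD = 3  # 3 matching submissions, per the walkthrough above

def is_line_complete(transcriptions, threshold=RETIREMENT_THRESHOLD):
    """Return True once any identical transcription string has been
    submitted `threshold` times, i.e. the volunteers have 'agreed'."""
    if not transcriptions:
        return False
    most_common_count = Counter(transcriptions).most_common(1)[0][1]
    return most_common_count >= threshold
```

Note that agreement here is an exact string match, so e.g. `["Dear Sir", "Dear Sir", "Dear sir"]` would not yet complete the line.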

In each session, volunteers are asked whether all the annotations on an image have turned grey. Once 3 volunteers have responded 'Yes' to this question, the image is removed from the project.
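The image-retirement rule above can be sketched in the same way. Again, the function and variable names are illustrative assumptions, not the platform's actual code.

```python
YES_VOTES_NEEDED = 3  # 3 'Yes' responses, per the walkthrough above

def should_retire_image(responses, needed=YES_VOTES_NEEDED):
    """responses: per-session answers ('Yes'/'No') to the question
    'have all annotations on this image turned grey?'. The image is
    removed from the project once enough volunteers answer 'Yes'."""
    yes_count = sum(1 for r in responses if r == "Yes")
    return yes_count >= needed
```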

Try it

To explore the Transcription Task on a test project, click here.

To read more about setting up a project with the Transcription Task, click here.

