transcribe. Together.

Institute overview

Welcome to the Building Capable Communities for Crowdsourced Transcription Institute. We are a National Endowment for the Humanities Office of Digital Humanities-funded institute that seeks to build capacity in diverse cultural organizations and research institutions for using crowdsourcing to effectively transcribe large collections of handwritten documents. In the autumn of 2020 we are recruiting a cohort of 15 project leaders to join us for 18 months of project and community building. Participants will have an existing large collection of digitized documents and ambitious goals for the impact that its completed transcription would bring. The cohort and project directors will work together to build infrastructure for transcription and develop the volunteer community that will help achieve each participant's project goals. Together we will build a community of leaders in online crowdsourcing.

Email: Twitter: @atdhcrowdcohort

Call for participants

Building Capable Communities for Crowdsourced Transcription

We are delighted to announce that we are seeking 15 participants for an NEH-funded institute, “Building Capable Communities for Crowdsourced Transcription,” which will run from Spring 2021 through Fall 2022. The institute will be primarily virtual.

We welcome applicants from a variety of backgrounds, including GLAM professionals, academics, contingent workers, graduate students, and unaffiliated researchers. Cohort members must be US-based. We will begin considering applications on December 9.

Institute aims

Crowdsourcing has the potential to vastly diversify the audience that can engage with rare documents, democratizing participation in the work of the humanities and letting a broad public engage with rare or fragile materials that would otherwise be confined to libraries, archives or other institutions. But for many types of documents (including manuscripts, complex tables, fragments, and indices) the practice of transcribing isn’t straightforward: volunteers will need to deal with multidirectional text, damaged documents, and markup, among other issues that will affect not only the transcription output but also the volunteer experience.

This institute aims to assist researchers who wish to use crowdsourced text transcription in their research methods. It will provide resources to support participants in developing crowdsourced transcription projects, and will serve as a dedicated space for communities of crowdsourcing practitioners to share resources, ask questions, and have discussions. We will provide training in specific tools and project design, and build a network of institutions and scholars working on crowdsourced transcription.

Participants will learn the common challenges in crowdsourced transcription that generalize beyond the research question or specific aims of any given project. By using a cohort model, participants will benefit from discussion, assistance, and interaction with other project creators working towards a similar goal (crowd-transcribed text). Thus, participants will be well-placed to develop other projects in the future, and to assist peers in their professional and local networks in using the web and software tools they have learned.


  • Over 18 months, we will train cohort members in using the Zooniverse Project Builder to build crowdsourcing projects with text transcription as their main goal.

  • The institute will consist of 2 workshops and 6 virtual meetings, held at approximately 2-month intervals, beginning in spring of 2021 and concluding in fall of 2022.

    • The content of workshops and virtual meetings will include the following topics:

      • How to use the Zooniverse Project Builder

      • Building, testing, experimenting, and iterating

      • Retirement rates (i.e. what is the “completeness” metric for your task?)

      • Communication and sustainability

      • The ethics of working alongside volunteers

      • Project management

      • Data analysis

      • Project archiving and publishing results

  • The 6 virtual meetings will be held via Zoom. Ideally, the 2 workshops will be held in person, at the University of Minnesota-Twin Cities campus. Room and board for the in-person workshops will be covered by the institute. We anticipate the closing workshop (fall 2022) being in person. An earlier in-person workshop will be held midway through the institute, but only if it is safe to do so.

  • The cohort will be made up of individuals, but additional team members are welcome to join virtual meetings. Note: while cohort members must be US-based, team members are not required to be.

  • Meetings will be held during Central Standard Time business hours.


Each cohort member (and their respective team) will create and run a crowdsourced transcription project over the 18-month duration of the institute (15 projects total).


Successful institute applicants will demonstrate that they have:

  • A large digitized collection (or collections) that needs transcribing and which is not under copyright (or for which they have redistribution rights)

  • A vision for the research, pedagogical or public presentation they will make with the transcriptions (this may include the creation of searchable digital text for inclusion in a database or CMS, training data for machine learning models, critical editions of text, etc.)

Only one applicant per project proposal will be allowed to join the cohort. Additional project team members will be encouraged to participate in the virtual meetings.


Who We Are

Evan Roberts

Assistant Professor of Sociology and Population Studies

University of Minnesota

Samantha Blickhan

Digital Humanities Lead for Zooniverse

The Adler Planetarium

Benjamin Wiggins

Director of the Digital, Arts, Sciences, & Humanities Program and Affiliate Assistant Professor of History

University of Minnesota