Tasks

Two tasks are proposed, both related to the basic workflow of a transcription process: extracting text from scanned documents (OCR) and curating the extracted text to fix any errors found. However, instead of dedicating a separate task to each step, we encourage participants to tackle the following tasks:

Task 1. Error Correction

Task 2. End-to-end Extraction

Task 1. Error Correction

In this task, participants are provided with the output of an OCR system and are asked to generate clean, corrected versions of the extracted texts.
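To make the input/output shape of this task concrete, here is a minimal sketch of a toy correction step: noisy OCR text goes in, corrected text comes out. It is only an illustration and not the organizers' baseline or data format. The lexicon, the function names `correct_token` and `correct_text`, and the similarity cutoff are all hypothetical choices; a realistic system would rely on a much larger vocabulary or a language model.

```python
import difflib

# Hypothetical toy lexicon (an assumption for this sketch, not task data).
LEXICON = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"]


def correct_token(token, lexicon=LEXICON, cutoff=0.6):
    """Return the closest lexicon word, or the token unchanged if none is close enough."""
    lowered = token.lower()
    if lowered in lexicon:
        return token
    # get_close_matches ranks candidates by difflib's similarity ratio.
    matches = difflib.get_close_matches(lowered, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_text(ocr_text):
    """Correct a whitespace-tokenized OCR string token by token."""
    return " ".join(correct_token(tok) for tok in ocr_text.split())


if __name__ == "__main__":
    print(correct_text("the qu1ck brovvn fox"))
```

Tokens with no sufficiently similar lexicon entry are left untouched, so this sketch never invents words; it also ignores punctuation and casing, which a real submission would need to handle.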

Task 2. End-to-end Extraction

Motivated by recent advances in multimodal systems, this task explores end-to-end approaches that take scanned pages as input and produce curated texts as output.
