Authors: Nega Agmas Asfaw and Birhanu Hailu Belay

       Dataset Details

This dataset was created and made freely available online for researchers working on Amharic Braille document recognition. We collected Braille documents from the AAU-Kennedy library, Addis Ababa, Ethiopia. The documents include both typewritten and manually produced one-sided documents; most are typewritten and noisy. To vary the label sequences, the documents cover the history of Ethiopia, law, agriculture, and health. They come from real-life degraded documents and were scanned at low resolution (200 dpi) using a flatbed scanner.

The dataset contains 2100 Amharic Braille document line-images with their corresponding Ground-Truth (GT) labels: 2000 line-images for training and 100 for testing. Line-images are stored in grayscale in .png format and are named with numbers starting from 0001, each matching the line number of its transcription in the label text file. The labels include 248 characters. For example, the first line of label.txt is the transcription of image 0001.png (the first image) in the line images folder, and so on.
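
The naming convention above can be used to pair each line-image with its transcription. The following is a minimal Python sketch, not part of the dataset itself. The function name load_pairs, the folder name line_images, the UTF-8 encoding of label.txt, and the four-digit zero padding for every image number are assumptions made for illustration; adjust them to the actual layout of your copy.

```python
from pathlib import Path


def load_pairs(image_dir, label_file):
    """Pair each line-image with its transcription by line number.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    # Assumption: the label file is UTF-8 text with one transcription per line.
    lines = Path(label_file).read_text(encoding="utf-8").splitlines()
    pairs = []
    for number, text in enumerate(lines, start=1):
        # Line N of the label file is assumed to describe image N,
        # named with four zero-padded digits (0001.png, 0002.png, ...).
        image_path = Path(image_dir) / f"{number:04d}.png"
        if not image_path.exists():
            raise FileNotFoundError(
                f"No image for label line {number}: {image_path}"
            )
        pairs.append((image_path, text))
    return pairs


if __name__ == "__main__":
    # Assumed paths; replace with the real folder and label file locations.
    for image_path, text in load_pairs("line_images", "label.txt")[:3]:
        print(image_path.name, text)
```

Raising an error on a missing image, rather than skipping it, keeps images and labels from silently drifting out of alignment.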