Sample images of the Warped Document Image Dataset (WDID)

Some sample images from our dataset are shown here. Although the dataset was created to support the development of efficient de-warping algorithms for alphasyllabary scripts (such as Bangla and Devanagari), it can also be used to develop more effective and robust techniques for tasks such as binarisation, marginal noise reduction, and optical character recognition. The images were captured under both ideal and non-ideal lighting conditions. We have also included text in a variety of font types and font sizes. The images are taken from books, magazines, printed documents, advertisement pages, newspapers, etc. The images in the dataset also vary in the type of warping, e.g. single-fold and multi-fold.

RGB Images

Binarised Images

Only Text Part of the Images

Please feel free to contact me to get the WDID.
