CS 639: Final Project - Optical Character Recognition (OCR)

Optical Character Recognition (OCR)

What is OCR and How it Works:

OCR stands for Optical Character Recognition. It is a technology that converts scanned images of text into machine-readable text. Text in images is often difficult to read, especially for people with low vision, and our implementation of OCR is intended to help with that problem.

OCR analyzes the visual characteristics of the text, such as shapes and patterns. These characteristics are compared against a database of known characters to decode what the text says. If no match is found in the database, OCR may fall back on context clues: it analyzes the words around the unrecognized character and makes an educated guess. Once the characters are detected, OCR converts them into a machine-readable format.
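
The matching and context steps described above can be illustrated with a toy sketch. It is not how Tesseract or any production OCR engine works internally. The 3x3 bitmaps, the `TEMPLATES` table, and the helper names `hamming`, `classify`, and `guess_from_context` are all hypothetical, chosen only to make the idea concrete.

```python
# Toy "database of known characters": 3x3 binary bitmaps (hypothetical data).
TEMPLATES = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}


def hamming(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def classify(glyph, templates=TEMPLATES):
    """Return (character, distance) for the template closest to the glyph."""
    return min(
        ((ch, hamming(glyph, tpl)) for ch, tpl in templates.items()),
        key=lambda pair: pair[1],
    )


def guess_from_context(word, vocabulary):
    """Fill unknown '?' characters using a word list as context.

    Returns the vocabulary word only when exactly one word fits the
    known letters; otherwise the input is returned unchanged.
    """
    matches = [
        v for v in vocabulary
        if len(v) == len(word)
        and all(c == "?" or c == d for c, d in zip(word, v))
    ]
    return matches[0] if len(matches) == 1 else word


# A noisy "L" with one missing pixel still lands on the "L" template.
noisy_l = ("100", "100", "110")
print(classify(noisy_l))
print(guess_from_context("TI?E", ["TILE", "TALL"]))
```

Nearest-template matching keeps the example short. The context step deliberately refuses to guess when more than one word fits, since a wrong guess is harder to spot than an unresolved character.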

Implementation in VRT:

In Virtual-Reality-Toolkit (VRT), we implement OCR using OpenCV and Tesseract. We use OpenCV to convert the image to grayscale and apply Gaussian filtering to reduce noise. We then use Tesseract to extract the text from the images and write it to files.
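
A minimal sketch of such a pipeline is shown here, assuming the `opencv-python` and `pytesseract` packages and a local Tesseract install. It is not the actual VRT code. The function names, the 5x5 blur kernel, the `ocr_output` directory, and the `.txt` naming rule are assumptions made for illustration.

```python
from pathlib import Path


def text_output_path(image_path, out_dir):
    """Hypothetical naming rule: scans/page1.png -> <out_dir>/page1.txt."""
    return Path(out_dir) / (Path(image_path).stem + ".txt")


def ocr_image_to_file(image_path, out_dir="ocr_output"):
    """Grayscale, blur, run Tesseract, and write the text to a file."""
    # Imported here so the path helper above works even when the OCR
    # dependencies are not installed.
    import cv2          # pip install opencv-python
    import pytesseract  # pip install pytesseract (needs the Tesseract binary)

    image = cv2.imread(str(image_path))
    if image is None:  # cv2.imread returns None instead of raising
        raise FileNotFoundError(f"could not read image: {image_path}")

    # Convert to grayscale, then smooth with a Gaussian filter to reduce noise.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # The 5x5 kernel is an assumed value; sigma 0 lets OpenCV derive it.
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)

    # Tesseract extracts the text from the preprocessed image.
    text = pytesseract.image_to_string(blurred)

    out_path = text_output_path(image_path, out_dir)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(text, encoding="utf-8")
    return out_path
```

Calling `ocr_image_to_file("scans/page1.png")` would, under these assumptions, write the recognized text to `ocr_output/page1.txt`. The kernel size is worth tuning per image set, since too much blur can erase thin strokes.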

Sample Output:
