Text Recognition

Character Spotting

Participants: Nishatul Majid (Fort Lewis College), Elisa Barney Smith (Luleå Tekniska Universitet)

Rather than segmenting a word into characters and then recognizing each character, we look for every character in the word and identify the resulting string from the detections. This works well for overlapping characters and does not require the characters to be arranged in one dimension. The method was developed for Bangla and has also been tested successfully on Korean and English. Work is in progress to re-implement the algorithm in Python and to switch the object-spotting engine to YOLO. Code, YOLO weights, and data will be shared publicly.
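As a rough illustration of turning character detections into a string, the sketch below takes a list of labeled bounding boxes, drops low-confidence and heavily overlapping duplicates, and reads the survivors left to right. This is not the project's implementation: the `Detection` class, the `detections_to_string` function, the thresholds, and the left-to-right ordering are all assumptions made for the example. The ordering in particular is a simplification that suits a Latin-like script and would not handle a two-dimensional arrangement of characters.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One spotted character (hypothetical structure for this sketch)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    score: float


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    iy = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    inter = ix * iy
    area_a = (a.x_max - a.x_min) * (a.y_max - a.y_min)
    area_b = (b.x_max - b.x_min) * (b.y_max - b.y_min)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detections_to_string(detections, min_score=0.5, iou_threshold=0.5):
    """Assemble a string from detections (assumed rule, not the authors')."""
    kept = []
    # Visit detections from most to least confident.
    for det in sorted(detections, key=lambda d: d.score, reverse=True):
        if det.score < min_score:
            continue
        # Keep a box only if it does not heavily overlap a stronger one.
        if all(iou(det, k) < iou_threshold for k in kept):
            kept.append(det)
    # Assumed reading order: left to right by box center.
    kept.sort(key=lambda d: (d.x_min + d.x_max) / 2)
    return "".join(d.label for d in kept)


example = [
    Detection("c", 0, 0, 10, 10, 0.9),
    Detection("o", 1, 0, 10, 10, 0.6),   # near-duplicate of "c", weaker
    Detection("a", 8, 0, 18, 10, 0.8),   # overlaps "c" only slightly
    Detection("t", 17, 0, 25, 10, 0.7),
    Detection("x", 30, 0, 40, 10, 0.2),  # below the score threshold
]
word = detections_to_string(example)
```

Because slightly overlapping boxes with different labels are both kept, neighboring characters that touch or overlap survive, while a weaker box covering almost the same region as a stronger one is discarded.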

Publications:

Online Handwriting Recognition

Participants: Sukhdeep Singh, Elisa H. Barney Smith

Online text recognition considers the time sequence of the stroke points as well as their positions. Tablets and other touch devices often oversample to ensure that they capture adequate stroke information, so not all of these points are necessary. We developed an approach that selects a subset of the points, which simplifies training while maintaining high recognition accuracy.
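To make the idea of point selection concrete, here is a minimal sketch that thins an oversampled stroke. The selection rule used in the project is not described here, so this example substitutes a simple minimum-distance filter as a stand-in. The `subsample_stroke` function, the `(x, y, t)` point layout, and the `min_dist` parameter are assumptions for illustration only.

```python
import math


def subsample_stroke(points, min_dist):
    """Thin a stroke given as (x, y, t) tuples in time order.

    Stand-in rule (an assumption, not the project's method): keep a point
    only if it lies at least min_dist from the last kept point. The first
    and last points are always kept so the stroke endpoints survive.
    """
    if not points:
        return []
    kept = [points[0]]
    for p in points[1:-1]:
        # Compare positions only; the timestamp is carried along unchanged.
        if math.dist(p[:2], kept[-1][:2]) >= min_dist:
            kept.append(p)
    if len(points) > 1:
        kept.append(points[-1])
    return kept


stroke = [
    (0.0, 0.0, 0),
    (0.5, 0.0, 1),
    (1.0, 0.0, 2),
    (1.2, 0.0, 3),
    (2.0, 0.0, 4),
    (2.1, 0.0, 5),
]
reduced = subsample_stroke(stroke, min_dist=1.0)
```

The retained points keep their original timestamps and order, so the time sequence that online recognition relies on is preserved in the reduced stroke.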

Publications:

Other Text Recognition Projects

There are many ways to improve text recognition in document analysis. I have been fortunate to collaborate with several people on assorted projects over the years. Here are some of the results of those collaborations.

Publications: