This section walks through the step-by-step process our team used to place bounding boxes around individual characters in images. As stated previously in the report, our segmentation module uses the Maximally Stable Extremal Regions (MSER) algorithm, already implemented in Matlab as the detectMSERFeatures function. The figure below on the left shows the regions identified as maximally stable extremal regions, while the figure on the right shows the result of an additional filtering step that removes non-text regions based on their geometric properties:
(Left) MSER regions detected, (Right) MSER regions after removing extraneous non-text regions
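The geometric filtering step can be illustrated with a small sketch, written here in Python for readability rather than in our Matlab module. The specific thresholds and region fields below are illustrative assumptions, not the exact values we used:

```python
# Sketch of filtering candidate text regions by geometric properties.
# Thresholds are assumptions chosen for illustration only.

def is_text_like(region,
                 max_aspect_ratio=3.0,
                 min_extent=0.2,
                 max_eccentricity=0.995):
    """Keep regions whose shape statistics are plausible for a character."""
    w, h = region["width"], region["height"]
    aspect = max(w, h) / min(w, h)
    extent = region["area"] / (w * h)   # filled-pixel fraction of the box
    if aspect > max_aspect_ratio:       # long thin regions: ruled lines, borders
        return False
    if extent < min_extent:             # mostly empty boxes: noise, frames
        return False
    if region["eccentricity"] > max_eccentricity:
        return False
    return True

regions = [
    {"width": 12, "height": 18, "area": 110, "eccentricity": 0.80},   # letter-like
    {"width": 90, "height": 3,  "area": 260, "eccentricity": 0.999},  # a ruled line
]
kept = [r for r in regions if is_text_like(r)]
```

In this example the letter-like region passes all three tests, while the long thin region is rejected on aspect ratio, mirroring how the non-text regions in the right-hand figure were discarded.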
The mathematics behind the algorithm can be seen in the following snippet:
The function returned the BoundingBox property of each region, specifically the four coordinates of each bounding box, stored in separate one-dimensional xmin, xmax, ymin, and ymax vectors. The problem with these vectors was that they contained multiple duplicates of the same bounding boxes and were not sorted in any particular order. Our module therefore removed the duplicates, sorted the boxes by ascending xmin, and compiled the coordinates into a four-column two-dimensional array whose row count equals the number of bounding boxes in the image (see the segmentation module for our code). The next issue was removing bounding boxes that surrounded only parts of characters rather than whole characters. To handle this, we checked the two-dimensional array for any bounding box whose coordinates lay entirely inside another bounding box and removed those boxes. The two figures below show the result before removing the unwanted bounding boxes (left) and after removing them (right):
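The clean-up described above can be sketched as follows, again in Python for illustration (the actual Matlab code lives in our segmentation module). Boxes are represented here as (xmin, ymin, xmax, ymax) tuples:

```python
# Sketch of the bounding-box clean-up: deduplicate, sort by xmin,
# then drop boxes that lie entirely inside another box.

def clean_boxes(boxes):
    # 1. Drop exact duplicates, then sort by ascending xmin.
    unique = sorted(set(boxes), key=lambda b: b[0])
    # 2. Drop any box fully contained in another box, since it
    #    surrounds only part of a character.
    kept = []
    for i, (x0, y0, x1, y1) in enumerate(unique):
        contained = any(
            j != i and ox0 <= x0 and oy0 <= y0 and x1 <= ox1 and y1 <= oy1
            for j, (ox0, oy0, ox1, oy1) in enumerate(unique)
        )
        if not contained:
            kept.append((x0, y0, x1, y1))
    return kept

boxes = [
    (10, 5, 30, 40),   # box around a full character
    (10, 5, 30, 40),   # exact duplicate
    (12, 8, 20, 25),   # box around part of the same character
    (45, 5, 70, 40),   # next character
]
cleaned = clean_boxes(boxes)
```

Here the duplicate and the partial box are both removed, leaving one sorted box per character.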
(Left) Unwanted bounding boxes are present, (Right) Bounding boxes only around actual characters
We started by testing our segmentation algorithm on handwritten letters (A, B, C, D, E) and got the following results before removing duplicates:
(Top left) Our original image, (top right) applying our pre-processing methods, (bottom left) segmentation with unwanted bounding boxes
After applying the algorithm that removes unwanted bounding boxes, we obtained the following:
The bounding boxes after removing duplicates
As can be seen above, our duplicate-removal algorithm also removed the last bounding box, the one around ‘E’. From these images and results, it is clear that our segmentation module could be improved so that it reliably places bounding boxes around handwritten characters. Such an improvement would also have let us scan in our handwritten capital-letter dataset rather than cropping out each character individually and saving it as a separate image. As stated in our conclusion, the module has practical applications as well, including word and sentence recognition.