Simple Morphological Operations

Mathematical morphology, generally speaking, lets you search for patterns of bits in images.  It's a little bit like regular expressions for pixels.
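
To make the "regular expressions for pixels" analogy concrete, here is a minimal sketch of erosion written as a pattern search. It is not OCRopus code: the Image type, the Offsets alias, and the erode function are hypothetical names for this illustration, and pixels outside the image are assumed to be background. An output pixel is set only where every offset of the pattern lands on a foreground pixel.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical minimal image type for this sketch; it is not OCRopus's bytearray.
struct Image {
    int w, h;
    std::vector<unsigned char> px;  // row-major; 0 = background, 255 = foreground
    Image(int w_, int h_) : w(w_), h(h_), px(w_ * h_) {}
    // Reads outside the image count as background (an assumption of this sketch).
    unsigned char get(int x, int y) const {
        if (x < 0 || y < 0 || x >= w || y >= h) return 0;
        return px[y * w + x];
    }
    void set(int x, int y, unsigned char v) {
        assert(x >= 0 && y >= 0 && x < w && y < h);
        px[y * w + x] = v;
    }
};

// The pattern (structuring element) as a list of (dx, dy) offsets.
using Offsets = std::vector<std::pair<int, int>>;

// Erosion as pattern search: keep a pixel only if the whole pattern,
// placed at that pixel, is covered by foreground pixels of the input.
Image erode(const Image &in, const Offsets &se) {
    Image out(in.w, in.h);
    for (int y = 0; y < in.h; y++) {
        for (int x = 0; x < in.w; x++) {
            bool fits = true;
            for (const auto &d : se) {
                if (in.get(x + d.first, y + d.second) == 0) {
                    fits = false;
                    break;
                }
            }
            out.set(x, y, fits ? 255 : 0);
        }
    }
    return out;
}
```

With the pattern {(0,0), (1,0), (0,1)}, for example, the result marks exactly those pixels that have a foreground neighbor at x+1 and another at y+1.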

OCRopus has several different implementations of mathematical morphology:
  • binary morphology on arrays using pixel operations (imglib/imgmorph.cc)
  • binary morphology on arrays using distance transforms (imglib/imgmorph.cc)
  • grayscale morphology on arrays (imglib/imggraymorph.cc)
  • binary morphology on packed bitmaps (imgbits/imgblit.cc)
  • binary morphology on runlength encoded images (imgbits/imgrle.cc)
We try to keep these implementations reasonably consistent, but they differ in places.  Each makes different tradeoffs in functionality, performance, conversion costs, and the additional operations it offers.

If performance is not critical, binary morphology on arrays is probably the best starting point; those functions operate on simple arrays, just like any other image processing operation.

Available Operations

Here is a list of the basic operations for binary morphology available in imglib; these are callable from C++ or Lua:
  • void make_binary(bytearray &image);
    • convert the grayscale image to binary, in place
  • void binary_invert(bytearray &image);
    • invert the binary image
  • void binary_autoinvert(bytearray &image);
    • invert the image if the majority of pixels are 255
  • void binary_and(bytearray &image,bytearray &image2,int dx=0,int dy=0);
  • void binary_or(bytearray &image,bytearray &image2,int dx=0,int dy=0);
    • compute the Boolean AND or OR of the two images, in place in the first image, optionally using a version of the second image shifted by (dx,dy)
  • void binary_erode_circle(bytearray &image,int r);
  • void binary_dilate_circle(bytearray &image,int r);
  • void binary_open_circle(bytearray &image,int r);
  • void binary_close_circle(bytearray &image,int r);
    • morphological operations with a circular structuring element
  • void binary_erode_rect(bytearray &image,int rw,int rh);
  • void binary_dilate_rect(bytearray &image,int rw,int rh);
  • void binary_open_rect(bytearray &image,int rw,int rh);
  • void binary_close_rect(bytearray &image,int rw,int rh);
    • morphological operations with a rectangular structuring element
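
The relationship between the four rectangular operations can be sketched as follows. This is an illustration under stated assumptions, not the imglib implementation: the Image type and the erode_rect, dilate_rect, open_rect and close_rect functions below are hypothetical stand-ins, the rectangle is anchored at its corner, and pixels outside the image are treated as background. The point is that opening is erosion followed by dilation, and closing is dilation followed by erosion.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal image type for this sketch; it is not OCRopus's bytearray.
struct Image {
    int w, h;
    std::vector<unsigned char> px;  // row-major; 0 = background, 255 = foreground
    Image(int w_, int h_) : w(w_), h(h_), px(w_ * h_) {}
    // Reads outside the image count as background (an assumption of this sketch).
    unsigned char get(int x, int y) const {
        if (x < 0 || y < 0 || x >= w || y >= h) return 0;
        return px[y * w + x];
    }
    void set(int x, int y, unsigned char v) {
        assert(x >= 0 && y >= 0 && x < w && y < h);
        px[y * w + x] = v;
    }
};

// Erosion: a pixel survives only if the rw x rh rectangle starting at it
// lies entirely on foreground pixels.
void erode_rect(Image &image, int rw, int rh) {
    Image out(image.w, image.h);
    for (int y = 0; y < image.h; y++)
        for (int x = 0; x < image.w; x++) {
            bool all_set = true;
            for (int j = 0; j < rh && all_set; j++)
                for (int i = 0; i < rw && all_set; i++)
                    if (image.get(x + i, y + j) == 0) all_set = false;
            out.set(x, y, all_set ? 255 : 0);
        }
    image = out;
}

// Dilation with the reflected rectangle: a pixel is set if any foreground
// pixel lies within rw columns to its left and rh rows above it.
void dilate_rect(Image &image, int rw, int rh) {
    Image out(image.w, image.h);
    for (int y = 0; y < image.h; y++)
        for (int x = 0; x < image.w; x++) {
            bool any_set = false;
            for (int j = 0; j < rh && !any_set; j++)
                for (int i = 0; i < rw && !any_set; i++)
                    if (image.get(x - i, y - j) != 0) any_set = true;
            out.set(x, y, any_set ? 255 : 0);
        }
    image = out;
}

// Opening removes features the rectangle does not fit into.
void open_rect(Image &image, int rw, int rh) {
    erode_rect(image, rw, rh);
    dilate_rect(image, rw, rh);
}

// Closing fills gaps the rectangle does not fit into.
void close_rect(Image &image, int rw, int rh) {
    dilate_rect(image, rw, rh);
    erode_rect(image, rw, rh);
}
```

In this sketch, opening a 4x4 block plus an isolated pixel with a 3x3 rectangle leaves the block unchanged and removes the pixel; closing two pixels separated by a one-pixel gap with a 3x1 rectangle fills the gap.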

Simple Example


Here is a simple script that "opens" an image; that is, it removes small features from an image, including isolated points and thin lines:

image = bytearray()
read_image_binary(image,arg[1])
binary_open_circle(image,3)
write_png(arg[2],image)

Matra Clipping

Here's a simple binary morphology script that implements "matra clipping"; that is, it cuts apart the connecting line that links Devanagari or Bengali characters, just after each vertical line.  The idea is somewhat similar to projection-based approaches, except that this code doesn't use any projection operations.

-- the parameters are resolution dependent
min_width = 10 -- minimum width of matra lines
min_height = 8 -- minimum height of vertical lines causing interruption
clip_offset = 3 -- how far to offset the clipping from the vertical line

-- read the input image and invert (black background)
image = bytearray()
read_image_binary(image,arg[1])
binary_invert(image)

-- find horizontal lines
matra = bytearray()
narray.copy(matra,image)
binary_open_rect(matra,min_width,1)

-- find vertical lines
vert = bytearray()
narray.copy(vert,image)
binary_open_rect(vert,1,min_height)

-- find places where horizontal and vertical lines intersect
binary_and(matra,vert,0,0)

-- shift the intersection points by clip_offset and remove
binary_dilate_rect(matra,2,2)
binary_invert(matra)
binary_and(image,matra,clip_offset,0)

-- write out the result
binary_invert(image)
write_png(arg[2],image)

Here is an example of how this actually works in practice:

[Images: input and output of the matra clipping script]

Because it uses only local information, it will clip some horizontal lines in addition to the matra (exercise: fix this).  On the other hand, the purely morphological code is more robust to other objects and noise being present in the document, since it relies only on local processing.  This distinction is probably academic, however, since there are better approaches to segmentation of Indic scripts available.
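
The "find horizontal lines" step of the matra clipping script works because opening with a min_width x 1 rectangle keeps, in each row, only the runs of foreground pixels that are at least min_width long. A minimal one-row sketch of that effect follows; open_row is a hypothetical helper written for this illustration (it is not part of OCRopus), and it represents a row as a string with '#' for foreground and '.' for background.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Keep only runs of '#' that are at least min_len long. This is the effect
// of a 1-D opening with a structuring element of length min_len on one row.
std::string open_row(const std::string &row, int min_len) {
    assert(min_len >= 1);
    std::string out(row.size(), '.');
    std::size_t i = 0;
    while (i < row.size()) {
        if (row[i] != '#') {
            i++;
            continue;
        }
        // Find the end of the current run; the run occupies [i, j).
        std::size_t j = i;
        while (j < row.size() && row[j] == '#') j++;
        // Copy the run to the output only if it is long enough.
        if (j - i >= static_cast<std::size_t>(min_len))
            for (std::size_t k = i; k < j; k++) out[k] = '#';
        i = j;
    }
    return out;
}
```

For example, open_row("##..#####.#", 3) drops the runs of length 2 and 1 and keeps the run of length 5. Applying the same idea column by column corresponds to the "find vertical lines" step with min_height.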