Methodology Discussion

Post date: Feb 18, 2012 4:37:51 AM

It seems like the simplest way would be to take a snapshot and compare it to a list of possible matches. If I were trying to match a cat in the real world, that would mean an unfathomable number of potential options. Since I am working with a MUCH simpler set of possibilities, that number is brought down to size quite a bit.
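As a rough sketch of the snapshot-vs-candidates idea, here's what that comparison could look like in Python. The grids, names, and the simple pixel-agreement score are all my own illustration, not a finished design:

```python
# Hypothetical sketch: match a captured binary image against a small
# set of candidate silhouettes by counting agreeing pixels and picking
# the best-scoring template.

def similarity(image, template):
    """Fraction of pixels that agree between two same-sized binary grids."""
    matches = sum(
        1
        for img_row, tpl_row in zip(image, template)
        for a, b in zip(img_row, tpl_row)
        if a == b
    )
    return matches / (len(image) * len(image[0]))

def best_match(image, templates):
    """Return the name of the template most similar to the image."""
    return max(templates, key=lambda name: similarity(image, templates[name]))

# Tiny 4x4 demo grids (1 = black, 0 = white), purely illustrative.
cat = [[0,1,1,0], [1,1,1,1], [1,1,1,1], [0,1,1,0]]
not_cat = [[1,0,0,1], [0,0,0,0], [0,0,0,0], [1,0,0,1]]
snapshot = [[0,1,1,0], [1,1,1,1], [1,1,0,1], [0,1,1,0]]  # cat, one pixel off

print(best_match(snapshot, {"cat": cat, "not_cat": not_cat}))  # → cat
```

The real images would of course be much bigger, but the shape of the problem is the same: score every candidate, keep the best.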

In order to start messing with the technology I'm going to begin by just comparing a given set of bitmaps to another. It strikes me that since each pixel is black or white, that sounds a whole lot like binary, and it's possible to create a grid similar to the way that the Carney algorithm compares 3x3 grids; I just want to compare larger ones. 8x8 seems a little too small but 16x16 might work.
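Since each pixel is black or white, a 16-pixel-wide row really can be stored as one 16-bit integer, which makes comparing two grids a bitwise operation. A hedged sketch of that idea (the grids here are made up for the demo):

```python
# Hypothetical sketch: store each 16-pixel row as a 16-bit integer so a
# whole 16x16 grid is just a list of 16 ints. Comparing two grids is
# then XOR plus a count of the differing bits (the Hamming distance).

def hamming_distance(grid_a, grid_b):
    """Number of differing pixels between two grids of row bitmasks."""
    return sum(bin(a ^ b).count("1") for a, b in zip(grid_a, grid_b))

# Two 16x16 grids that differ in exactly 3 pixels.
a = [0b1111000011110000] * 16
b = list(a)
b[0] ^= 0b0000000000000111  # flip 3 bits in the first row

print(hamming_distance(a, b))  # → 3
```

A lower distance means a closer match, so a threshold on this count is one cheap way to decide whether a sample region looks like the template.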

Playing around with some sample images in Gimp, it looks like 16x16 is still too small to adequately grab the target shape from the full 128x96 image. However, taking a cat profile and shrinking it to 60x60, it still seems to have high enough resolution to adequately determine that it is a cat, and now the nose, mouth and chin will fit within a 16x16 pixel grid. See attached.

To minimise the time taken to process and check for the chin, a few shortcuts can be taken. First off, we don't need to match the whole head, just the nose, mouth and chin section. Second, things are simplified because we will be dealing with some very controlled images: backlit black-and-white profiles, all facing the same direction, and we will have some measure of control over when the image is taken, so the cat will be in a good spot in the frame. That means there's no need to check, say, the upper right-hand corner of the image for the chin, so we don't need to scan all 60x60 positions as the center of our match.

Assuming that the construction of the tunnel forcing the cat past the camera is adequate, it should be unnecessary to scan anything in the top half of the image. Taking a 16x16 sample also means we won't need to start anywhere the sample might hang over the edge, say from horizontal points 8-52, which is 44 possible positions. That gives us 1320 possible center points. While our template matching process is going to be MUCH more intensive than the 3x3 bitmap grid, the samples in the Nootropic library are running those algorithms against the complete 128x96 image, giving them 12288 checks per image.
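The back-of-the-envelope arithmetic above can be checked in a couple of lines. The exact ranges below are my reading of the post's numbers (44 horizontal centers from the 8-52 range, and the 30 rows of the lower half of the 60x60 image), so treat them as an assumption:

```python
# Candidate center points under the post's constraints vs. a full scan.

horizontal = range(8, 52)   # 44 possible x positions (8-52, counted as 52 - 8)
vertical = range(30, 60)    # 30 possible y positions: lower half of 60 rows
candidates = len(horizontal) * len(vertical)

full_scan = 128 * 96        # every pixel of the raw camera frame

print(candidates)  # → 1320
print(full_scan)   # → 12288
```

So even with a far heavier per-position comparison, the constrained scan touches roughly a tenth as many positions as running over the whole frame.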

I'm sure the template match isn't the best way to do it, but it seems like the simplest, and a good way to get my feet wet with this new technology.