In this project we implement a highly parallel version of the Binary Robust Independent Elementary Features (BRIEF) feature descriptor using CUDA. The complete keypoint-matching pipeline runs in parallel on the GPU.
In the context of image processing, keypoints are pixel positions that are "significant" to the image. By significant we mean that the point is rich in unique information, so that if we translate the image, change its illumination, or rotate it, we can still identify the same point.
Corners, for instance, are a classic example of keypoints, because we can still identify them even if the brightness of the image changes or the corner moves to a different position in the image.
One typical rule of thumb to determine whether a point is a keypoint is to check whether there is "change" across all directions surrounding the pixel. The following image (taken from the 16720 CMU course slides) describes three cases typically considered in keypoint detectors. It is important to note that a point may be a keypoint without necessarily being a corner.
The image below illustrates the keypoints detected in the image of a helicopter:
Keypoints are used extensively in computer vision for different applications such as:
Object matching
Given the template of an object, determine whether it is present in an image and, if so, find its position.
Image homography estimation
Given two images, find out how to "go" from one image to the other, i.e. the translation, rotation, and scaling between the two. This can be used, for example, to stitch multiple images into a panoramic view.
Once the keypoints in an image have been identified, the next step consists of assigning a "descriptor" to each keypoint that enables us to match that same keypoint across different images.
There is a plethora of feature descriptors available, each providing different characteristics. Some of the most popular feature descriptors used in computer vision are:
The process of keypoint matching can be divided into the following stages. Our plan is to implement ALL of the stages of the pipeline in parallel using CUDA:
In this stage the input is an image and the output is a set of image coordinates containing the keypoints. There are many algorithms to find keypoints; we are using the Difference of Gaussians (DoG) detector, whose steps are described below:
1. Applying several (typically 8) Gaussian filters at different scales to the image.
2. (i) Stacking the N filtered images vertically
(ii) Subtracting each image from the "image above" it on the stack, resulting in a stack of N-1 difference images.
(iii) Marking every point that is a local maximum (across its neighbors in the same scale and in contiguous scales) as a candidate keypoint.
3. Eliminate edges: some of the candidates detected in the previous stage may correspond to edges rather than true keypoints. To eliminate them, we apply a test called the "curvature test": we compute the Hessian H of each candidate pixel and calculate the ratio R = Tr(H)^2 / Det(H). If R is above a given threshold, the point lies along an edge and we discard it.
The images below show the keypoints before and after applying the curvature test.
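To give an idea of how this stage could map to CUDA, below is a minimal sketch under our planned design. The kernel and helper names (dogKernel, passesCurvatureTest) and the memory layout are assumptions for illustration, not the final implementation: one thread per pixel computes one level of the DoG stack, and a device helper applies the curvature test using finite-difference estimates of the Hessian.

```cpp
// Illustrative sketch, not the final implementation.
// One thread per pixel builds one level of the DoG stack by subtracting
// two adjacent Gaussian-blurred versions of the image.
__global__ void dogKernel(const float* blurFine, const float* blurCoarse,
                          float* dog, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;
    int idx = y * width + x;
    dog[idx] = blurCoarse[idx] - blurFine[idx];
}

// Curvature ("edge") test: approximate the 2x2 Hessian of the DoG at a
// candidate with finite differences; edges have one large principal
// curvature, which makes Tr(H)^2 / Det(H) large, so we reject those points.
// Assumes the candidate is at least one pixel away from the image border.
__device__ bool passesCurvatureTest(const float* dog, int width,
                                    int x, int y, float rThresh) {
    float dxx = dog[y*width + x+1] - 2.0f*dog[y*width + x] + dog[y*width + x-1];
    float dyy = dog[(y+1)*width + x] - 2.0f*dog[y*width + x] + dog[(y-1)*width + x];
    float dxy = 0.25f * (dog[(y+1)*width + x+1] - dog[(y+1)*width + x-1]
                       - dog[(y-1)*width + x+1] + dog[(y-1)*width + x-1]);
    float tr  = dxx + dyy;
    float det = dxx * dyy - dxy * dxy;
    if (det <= 0.0f) return false;       // curvatures of opposite sign: reject
    return (tr * tr) / det < rThresh;    // keep only corner-like candidates
}
```

In practice the Gaussian filtering itself (stage 1) would also run on the GPU, for instance as a separable convolution kernel.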
After finding all of the keypoints in the image, we need to assign a "tag" that uniquely describes each keypoint. In this project we implement the BRIEF feature descriptor, which has the following characteristics:
Below we describe the steps used to create the descriptor for each keypoint using BRIEF:
1. (i) Select a patch of pixels around the keypoint (5x5 or 7x7 for example)
(ii) Number all of the pixels in the patch, except for the keypoint
2. Randomly select a set of pairs from the numbered pixels around the keypoint (typically 128 or 256 pairs). This preselected set is then used for EVERY keypoint in all images.
3. Let I(P) be the intensity of pixel 'P'. We then create a binary feature descriptor by applying the following test to each pair (P1, P2): the bit is 0 if I(P1) > I(P2), and 1 otherwise.
In other words, we check whether the intensity of the first member of the pair is greater than that of the second. If so, we set the bit for that pair to '0', and to '1' otherwise. This creates a bit vector describing the keypoint, with length equal to the number of preselected pairs.
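As a rough sketch of how the descriptor computation could be parallelized, the kernel below assigns one thread per keypoint and packs the pair tests into 32-bit words. The struct and kernel names (TestPair, briefKernel) and the layout of the preselected pairs are illustrative assumptions; the sketch also assumes keypoints lie far enough from the image border for every pair offset to be valid.

```cpp
// Illustrative sketch: one thread per keypoint builds its BRIEF descriptor.
struct TestPair { int dx1, dy1, dx2, dy2; };   // offsets of the two pixels

__global__ void briefKernel(const unsigned char* img, int width,
                            const int2* keypoints, int numKeypoints,
                            const TestPair* pairs, int numPairs,
                            unsigned int* descriptors) {
    int k = blockIdx.x * blockDim.x + threadIdx.x;
    if (k >= numKeypoints) return;
    int2 kp = keypoints[k];
    int words = numPairs / 32;                 // e.g. 256 pairs -> 8 words
    for (int w = 0; w < words; ++w) {
        unsigned int word = 0;
        for (int b = 0; b < 32; ++b) {
            TestPair p = pairs[w * 32 + b];
            unsigned char i1 = img[(kp.y + p.dy1) * width + (kp.x + p.dx1)];
            unsigned char i2 = img[(kp.y + p.dy2) * width + (kp.x + p.dx2)];
            if (i1 <= i2)                      // bit is 0 when I(P1) > I(P2)
                word |= (1u << b);
        }
        descriptors[k * words + w] = word;
    }
}
```

Because the same preselected pairs are used for every keypoint, they could also be placed in constant memory, which is well suited for data read uniformly by all threads.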
Now that each keypoint has a feature descriptor, we need to match keypoints between images. With BRIEF, this is usually done by computing the Hamming distance between the descriptor of a given keypoint in the first image and the descriptors of all keypoints in the second image, and selecting the one with the smallest distance.
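A brute-force version of this matching step maps naturally to CUDA: one thread per descriptor of the first image scans all descriptors of the second image, computing each Hamming distance with XOR and the __popc population-count intrinsic. The sketch below is illustrative only; the name matchKernel and the descriptor layout are our assumptions.

```cpp
// Illustrative sketch: for each descriptor of image 1, find the index of
// the descriptor of image 2 with the smallest Hamming distance.
__global__ void matchKernel(const unsigned int* desc1, int n1,
                            const unsigned int* desc2, int n2,
                            int wordsPerDesc, int* bestMatch) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n1) return;
    int best = -1;
    int bestDist = 1 << 30;                    // larger than any possible distance
    for (int j = 0; j < n2; ++j) {
        int dist = 0;
        for (int w = 0; w < wordsPerDesc; ++w)
            dist += __popc(desc1[i * wordsPerDesc + w] ^
                           desc2[j * wordsPerDesc + w]);
        if (dist < bestDist) { bestDist = dist; best = j; }
    }
    bestMatch[i] = best;
}
```

A common refinement is to also compare the best and second-best distances and reject matches where the two are too close, since such matches are ambiguous.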
We will implement the code in C++ with CUDA, offloading the heavy computations to the GPU. We chose C++ because it integrates easily with OpenCV, which is itself written in C++, and because C and C++ are considerably faster than most languages for tasks such as image processing while giving us finer control over memory.