For convenience, we have packaged D-Nets for OpenCV (version >=2.3) as a single file:
D-Nets is released under a BSD license, which is free for both academic and commercial use, and is also the license under which OpenCV is released.
The instructions below assume that your OpenCV 2.3.1 source code is located at /home/xy/opencv.
Please adapt the instructions accordingly if your source tree is elsewhere.
OpenCV can be downloaded from http://sourceforge.net/projects/opencvlibrary/files/opencv-unix/2.3.1/ .
How to produce competitive results
The quality of the matching results increases with the number of strips. While other matching approaches degrade as the number of keypoints grows, D-Nets typically improves, because the number of strips grows too -- and faster than the number of nodes. In the code released here, the relation is total and irreflexive, so the number of strips grows quadratically with the number of nodes.
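As a quick illustration (a Python sketch, not part of the released C++ code): under a total, irreflexive relation, every ordered pair of distinct nodes defines a directed strip, so n nodes yield n(n-1) strips.

```python
def num_strips(n_nodes: int) -> int:
    """Number of directed strips under a total, irreflexive relation:
    every ordered pair (a, b) of distinct nodes is a strip."""
    return n_nodes * (n_nodes - 1)

# Doubling the number of nodes roughly quadruples the number of strips:
print(num_strips(1000))  # 999000
print(num_strips(2000))  # 3998000
```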
Employing a high number of strips requires a large hash table to avoid overflowing buckets. As a rule of thumb, the number of hash-table buckets (= the number of possible d-tokens) should be of the same order of magnitude as the number of extracted strips. The number of sections into which a strip is divided (e.g., "nS=13") directly controls the number of possible d-tokens and hence the size of the hash table: with each section quantized to 2 bits, there are 4^nS possible d-tokens.
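The relation between nS, the quantization, and the hash-table size can be written down in a couple of lines (an illustrative Python sketch, not part of the released code):

```python
def num_dtokens(n_sections: int, bits_per_section: int = 2) -> int:
    """Each strip is split into n_sections sections; each section is
    quantized to bits_per_section bits, so the number of possible
    d-tokens (= hash-table buckets) is (2**bits)**n_sections."""
    return (2 ** bits_per_section) ** n_sections

# nS=13 with 2-bit quantization gives 4^13 possible d-tokens:
print(num_dtokens(13))  # 67108864
```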
For good results, D-Nets needs a lot of strips and hence large hash-tables. As a consequence, D-Nets is memory intensive.
To produce competitive results, you should run the code on a 64 bit platform with at least 16 GB (better 32 GB) of memory, or a sufficiently large swap space. Some example calls of d-nets are:
This call would use FAST keypoints as nodes (extracted with a threshold of 40), use 13 sections per strip, perform no geometric verification, and output the 40 best correspondences, visualizing them by drawing lines from each point in the left image to its corresponding point in the right image.
For geometric verification, a RANSAC step is run on the extracted correspondences, and only the correspondences that pass it are visualized; nodes that fail the verification step are still drawn, however. The flag vis=MESHES switches from the traditional line visualization to a new mesh visualization.
The reason for a new visualization is that, with the traditional way of visualizing correspondences through lines, it is often hard to follow the lines to their endpoints, because the lines occlude large parts of both images. Instead of connecting corresponding points by lines drawn across the images, which quickly becomes cluttered, correspondences are shown by topologically equivalent meshes in the left and right images. If the correspondences are incorrect, the mesh in the right image appears "scrambled". Here is an example of a correct result.
These meshes are produced by starting in the left image, constructing a mesh on the points that have a correspondence, and then transferring the mesh to the right image via the correspondences. The resulting meshes are topologically equivalent and convey the correspondences at a glance. Note that the geometric verification step is used only to better visualize the results; the precision-recall graphs in the paper were, of course, produced without any such geometric verification.
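The mesh transfer described above can be sketched in a few lines of Python (an illustrative toy with hand-made points, edges, and correspondences; the released code constructs the actual mesh on the detected keypoints):

```python
# Toy sketch: build a mesh (edge list over point indices) on the matched
# left points and transfer it to the right image via the correspondences.
left_pts  = [(0, 0), (10, 0), (0, 10), (10, 10)]
right_pts = [(5, 5), (15, 5), (5, 15), (15, 15)]

# Correspondences: index into left_pts -> index into right_pts.
corr = {0: 0, 1: 1, 2: 2, 3: 3}

# A (hand-made) mesh on the left points, as pairs of left indices:
left_mesh = [(0, 1), (1, 3), (3, 2), (2, 0), (0, 3)]

# Mapping every edge through the correspondences yields a topologically
# equivalent mesh on the right points; wrong correspondences would make
# this mesh look "scrambled" when drawn.
right_mesh = [(corr[a], corr[b]) for a, b in left_mesh]
print(right_mesh)
```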
The following call matches two images with a large scale change.
./d-nets bark/img1.ppm s=1 bark/img6.ppm s=1 kx=FAST FAST_thres=20 nS=13 gm=HOMOGRAPHY gv_dis=8 vis=MESHES
Here, FAST_thres has been lowered to 20 to obtain a sufficient number of nodes for matching. Here is the result:
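Assuming that gv_dis is a pixel threshold on the reprojection error under the homography estimated by the RANSAC step (an assumption on our part, not spelled out above), the verification of a single correspondence can be sketched in Python as:

```python
def apply_homography(H, p):
    """Map point p = (x, y) through the 3x3 homography H (nested lists)."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def verified(H, matches, gv_dis):
    """Keep matches whose right point lies within gv_dis pixels of the
    left point mapped through H."""
    keep = []
    for (pl, pr) in matches:
        qx, qy = apply_homography(H, pl)
        if ((qx - pr[0]) ** 2 + (qy - pr[1]) ** 2) ** 0.5 <= gv_dis:
            keep.append((pl, pr))
    return keep

# A pure translation by (3, 0) as the homography; the last match is
# 10 pixels off and is rejected with gv_dis=8:
H = [[1, 0, 3], [0, 1, 0], [0, 0, 1]]
matches = [((0, 0), (3, 0)), ((5, 5), (8, 5)), ((2, 2), (15, 2))]
print(len(verified(H, matches, gv_dis=8)))  # 2
```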
If you lack sufficient memory (32-bit platform)
If you are running the code on a 32-bit platform, or you do not have as much memory as suggested above, you cannot produce competitive results. However, you can still experiment with the code and obtain correct matching results. To restrict the size of the hash table and the number of strips, reduce the number of extracted nodes and set the flag "nS" to a lower value, e.g., nS=6, 7, 8, or 9. A simple way of reducing the number of nodes is to use reduced-resolution images. For instance,
./d-nets boat/img1.pgm s=0.5 boat/img2.pgm s=0.5 kx=FAST FAST_thres=80 nS=6 gm=HOMOGRAPHY gv_dis=4 vis=MESHES
produces quite good results, even though nS is only 6.
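As a quick sanity check of the memory savings: since the number of possible d-tokens is 4^nS, lowering nS from 13 to 6 shrinks the hash table by a factor of 4^7 = 16384 (an illustrative Python sketch, not part of the released code):

```python
def buckets(nS: int, bits_per_section: int = 2) -> int:
    """Number of hash-table buckets: one per possible d-token."""
    return (2 ** bits_per_section) ** nS

# Reduction factor when going from nS=13 down to nS=6:
print(buckets(13) // buckets(6))  # 16384
```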
Here are some example calls and results:
./d-nets building/img1.jpg w=640 building/img2.jpg w=640 kx=FAST FAST_thres=40 nS=11 gm=HOMOGRAPHY gv_dis=8 vis=MESHES
./d-nets building/img1.jpg w=640 building/img2.jpg w=640 kx=FAST FAST_thres=40 nS=11 gm=NONE nM=40 vis=MESHES
will build the visualization meshes only on the best 40 matches (best according to the quality measure in the paper).
However, the mesh visualization only looks clean if all 40 matches are correct, i.e., if the precision of D-Nets is 100%. As our paper shows, this holds for many image pairs, but typically not for all, in particular not for those involving a large scale change.
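Selecting the nM best matches can be sketched as follows (illustrative Python with made-up scalar scores; the actual quality measure is defined in the paper):

```python
# Hypothetical matches as (name, score) pairs; higher score = better.
matches = [("a", 0.9), ("b", 0.2), ("c", 0.7), ("d", 0.5)]
nM = 2

# Keep only the nM highest-scoring matches, as nM=40 does above:
best = sorted(matches, key=lambda m: m[1], reverse=True)[:nM]
print([name for name, score in best])  # ['a', 'c']
```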
The source code for D-Nets that was used to produce the results in our paper is part of a larger computer vision framework that has not been released publicly. For this release, we extracted the D-Nets portion and replaced our custom image processing operations (such as the pyramid generation) with OpenCV calls where possible. Because of these modifications, the code released here may produce slightly different results than those shown in the paper.
We decided to put all the code in a single file. While this makes it harder to get an overview of the code, it makes compiling and sharing the code very easy: the file can simply be copied into OpenCV's samples/cpp folder for compilation.
Please contact the authors if you have comments or suggestions regarding D-Nets.