Compiling and using D-Nets software in OpenCV

This page describes how to compile and use the D-Nets software.  Please cite the following paper if you use this code:

Felix von Hundelshausen and Rahul Sukthankar, "D-Nets: Beyond Patch-Based Image Descriptors". In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2012.

For convenience, we have packaged D-Nets for OpenCV (version >= 2.3) as a single file, d-nets.cpp.

D-Nets is released under a BSD license, which is free for both academic and commercial use, and is also the license under which OpenCV is released.

How to build and run D-Nets under OpenCV on Linux

The instructions below assume that your OpenCV 2.3.1 source code is located at /home/xy/opencv
Please adapt the instructions accordingly if your source tree is elsewhere.
OpenCV can be downloaded from the official OpenCV website.

  1. Copy d-nets.cpp to /home/xy/opencv/samples/cpp
  2. Go to your cmake build directory, e.g., /home/xy/build/opencv (assuming that you have compiled OpenCV yourself).
    If you have not compiled OpenCV yourself, do so as follows:
    mkdir -p /home/xy/build/opencv  (create a build directory)
    cd /home/xy/build/opencv       (go into the build directory)
    cmake /home/xy/opencv          (configure, pointing cmake at the source tree)
  3. Inside /home/xy/build/opencv edit the file "CMakeCache.txt" to ensure that the following flag is set:
  4. Execute the following commands within /home/xy/build/opencv
    cmake .      (the dot is important)
    make
    make install (if desired)
    These will compile and install OpenCV and the sample programs.
  5. Go to /home/xy/build/opencv/bin, where you will find binaries of all the sample programs including d-nets.
    The command to run d-nets is:
    ./d-nets [image1] s=0.5 [image2] s=0.5
    where [image1] and [image2] are paths to two images.  For example, when using images from the graffiti dataset available at:
    The command would be:
    ./d-nets boat/img1.pgm s=0.5 boat/img2.pgm s=0.5
To get a list of all command line options, run:
./d-nets -h

How to produce competitive results

The quality of the matching results increases with the number of strips. While other matching approaches get worse with a higher number of keypoints, D-Nets typically improves, because the number of strips increases too -- and faster than the number of nodes. In the code here, every ordered pair of distinct nodes is connected by a strip (the connection relation is total and irreflexive), so the number of strips grows quadratically with the number of nodes: n nodes yield n(n-1) directed strips.

Employing a high number of strips requires a large hash-table to avoid overflow. As a rule of thumb, the number of hash-table buckets (= the number of possible d-tokens) should be of the same order of magnitude as the number of extracted strips. The number of sections into which a strip is divided (e.g., "nS=13") directly controls the number of possible d-tokens and hence the size of the hash-table: with each section quantized to 2 bits, there are 4^nS possible d-tokens.
For good results, D-Nets needs a lot of strips and hence large hash-tables. As a consequence, D-Nets is memory intensive.

To produce competitive results, you should run the code on a 64 bit platform with at least 16 GB (better 32 GB) of memory, or a sufficiently large swap space. Some example calls of d-nets are:
  • ./d-nets boat/img1.pgm s=1 boat/img2.pgm s=1 kx=FAST FAST_thresh=40 nS=13 gm=NONE nM=40 vis=LINES
This call would use FAST keypoints as nodes, extracting them with a threshold of 40, use 13 segments per strip, perform no geometric verification, and output the 40 best correspondences, visualizing them by drawing lines from a point in the left image to its respective point in the right image.
  • ./d-nets boat/img1.pgm s=1 boat/img2.pgm s=1 kx=FAST FAST_thresh=40 nS=13 gm=HOMOGRAPHY gv_dis=8 vis=MESHES
would add a geometric verification step by "gm=HOMOGRAPHY" and visualize the results differently (vis=MESHES).
For geometric verification, a RANSAC step is run on the extracted correspondences, and the visualization shows only the correspondences that pass it (nodes that fail the verification step are still drawn, however). The flag vis=MESHES switches from the traditional line visualization to a new mesh visualization.
The reason for the new visualization is that with the traditional way of drawing a line between each pair of corresponding points, it is often hard to visually follow the lines to their endpoints, and the lines occlude large parts of the images.

Instead of connecting corresponding points by lines across the images, which quickly becomes cluttered, correspondences are shown by topologically equivalent meshes in the left and right images. If the correspondences are incorrect, the mesh in the right image will appear "scrambled". Here is an example of a correct result.

These meshes are produced by starting in the left image, constructing a mesh on the points that have a correspondence, and then transferring the mesh to the right image via the correspondences. The meshes are topologically equivalent and show the correspondences at a glance. Note that the geometric verification step is only used to better visualize the results; the precision-recall graphs in the paper were, of course, produced without such a geometric verification step.

The following call matches two images with a large scale change.

./d-nets bark/img1.ppm s=1 bark/img6.ppm s=1 kx=FAST FAST_thres=20 nS=13 gm=HOMOGRAPHY gv_dis=8 vis=MESHES

Here, FAST_thres has been lowered to 20 so that a sufficient number of nodes is available for matching. Here is the result:

If you lack sufficient memory (32-bit platform)

If you are running the code on a 32-bit platform, or you do not have as much memory as suggested above, you cannot produce competitive results. However, you can still experiment with the code and get correct matching results. To restrict the size of the hash-table and the number of strips, reduce the number of extracted nodes and set the flag "nS" to lower values (e.g., nS=6, 7, 8, or 9). A simple way of reducing the number of nodes is to use reduced-resolution images. For instance,

./d-nets boat/img1.pgm s=0.5 boat/img2.pgm s=0.5 kx=FAST FAST_thres=80 nS=6 gm=HOMOGRAPHY gv_dis=4 vis=MESHES

produces quite good results, even though nS is only 6.

Additional Examples

Here are some example calls and results:
  • using densely sampled points as nodes for D-Nets
        ./d-nets wall/img1.ppm s=1 wall/img6.ppm s=1 kx=DENSE_SAMPLING ds_spacing=10 nS=12 gm=HOMOGRAPHY gv_dis=8 vis=MESHES

  • D-Nets does a good job in the presence of repeating structures, because the strips can be quite long and help to encode more global information. The following call uses the images below and also shows how to downscale them to a fixed width.

    ./d-nets building/img1.jpg w=640 building/img2.jpg w=640 kx=FAST FAST_thres=40 nS=11 gm=HOMOGRAPHY gv_dis=8 vis=MESHES

  • If you do not include a geometric verification step, you can still visualize the results using the mesh visualization; however, it then makes sense to set the "nM" flag, which limits the number of matches (in order of their quality) that are used for mesh construction (here nM=40).
./d-nets building/img1.jpg w=640 building/img2.jpg w=640 kx=FAST FAST_thres=40 nS=11 gm=NONE nM=40 vis=MESHES

will build the visualization meshes only on the best 40 matches (best according to the quality measure in the paper).

However, the mesh visualization will only look nice if all 40 matches are correct, i.e., if the precision of D-Nets is 100%. As can be seen in our paper, this is the case for many matching scenarios, but typically not for all, in particular not those involving a large scale change.

Remarks on the code

The source code for D-Nets that was used to produce the results in our paper is part of a larger computer vision framework that has not been released publicly. However, we have extracted the D-Nets portion, replacing our custom image processing operations (such as the pyramid generation) with OpenCV calls where possible. Because of these modifications, the code released here may generate slightly different results than those shown in the paper.

We decided to put all the code in a single file. While this makes it harder to get an overview of the code, it makes compiling the code and sharing it with others very easy: the file can simply be copied into OpenCV's samples/cpp folder for compilation.

Please contact the authors if you have comments or suggestions regarding D-Nets.
