
HD-CNN Implementation


GitHub: https://github.com/stephenyan1984/caffe-public


Features
  • HD-CNN is implemented on top of Caffe.
  • HD-CNN supports 2-GPU training and testing. The two GPUs must have peer-to-peer (P2P) access to each other.

Building

Get the 'hdcnn' branch from the GitHub repository.
  • git clone https://github.com/stephenyan1984/caffe-public
  • cd caffe-public
  • git fetch origin
  • git checkout -t origin/hdcnn
Please follow the Caffe installation guide to build HD-CNN.
In a nutshell, after installing all the prerequisite packages, type 'make' in the project root folder to build HD-CNN.

Multi-GPU Training & Testing

To train with two peer-to-peer-capable GPUs, simply append the following lines to your solver prototxt file. An example solver prototxt file is shown below.
----------------
...
...
solver_mode: GPU
device_id: 0
device_id: 1
----------------
This makes the two GPUs with ids 0 and 1 train simultaneously.

To use two GPUs in the testing stage, add the option '--gpu=0,1' when launching the binary tool: 'build/tools/caffe test --gpu=0,1'.

Code Usage

Experiments on CIFAR100

Make sure to disable cuDNN by setting "USE_CUDNN := 0" in Makefile.config. The cuDNN convolution implementation has unknown bugs on the CIFAR100 dataset.
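For reference, the relevant toggle in Makefile.config (a standard Caffe build flag) looks like this:

```makefile
# Makefile.config -- build HD-CNN without cuDNN for the CIFAR100 runs,
# since the cuDNN convolution path misbehaves on this dataset.
USE_CUDNN := 0
```

Remember to rerun 'make' after changing this flag so the convolution layers are rebuilt.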
To reproduce the results on CIFAR100 dataset using CIFAR100-NIN as the building block net, follow the steps below.

Training:
  • Split the 50K-image training set into a 'train_train' training set (40K images) and a 'train_val' validation set (10K images). The images in the 'train_train' and 'train_val' sets are listed in 'train_train.txt' and 'train_val.txt' in the folder 'data/cifar100/'. Then prepare a leveldb database for each. For simplicity, I've prepared the leveldb files. Just run the script below to get them.
    • script: './examples/cifar100/get_cifar100_float_train-train-val-leveldb.sh'
  • Train a CNN using 'train_train' set as training data and 'train_val' set as testing data. 
    • command: ./examples/cifar100/train_cifar100_NIN_float_crop_v2_train_val.sh
  • From the net trained in the previous step, extract features from layer 'poolg' on the validation and test sets and save them into a leveldb file.
    • command: ./examples/cifar100/extract_features_val_set_cifar100_NIN_float_crop_v2.sh
    • command: ./examples/cifar100/extract_features_test_set_cifar100_NIN_float_crop_v2.sh
  • Open the ipython notebook below and execute the cells. It generates net definition files for pretraining the rear layers of the fine category components, as well as the net definition file of the final HD-CNN.
    • ipython notebook: './examples/cifar100/hdcnn/python/cifar100_NIN_raw.ipynb'
  • Pretrain the rear layers of the fine category components.
    • command: './examples/cifar100/finetune_NIN_float_crop_9clusters_cluster_confusion_mat_v0.0.sh'
  • Finetune the HD-CNN.
    • command: './examples/cifar100/finetune_hdcnn_from_NIN_float_crop_v2_9clusters_v0.0.sh'
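The notebook above derives the fine category components from how classes are confused on the validation set. As a rough numpy-only sketch of the underlying idea (spectral clustering of the symmetrized confusion matrix, here with a toy confusion matrix and a tiny inline k-means; all names are illustrative, not the notebook's API):

```python
import numpy as np

def cluster_fine_categories(confusion, n_clusters):
    """Group fine categories into coarse clusters via spectral clustering
    of the symmetrized confusion matrix (numpy-only sketch)."""
    # Categories that get confused with each other should share a coarse
    # cluster, so use the symmetrized confusion counts as an affinity.
    A = 0.5 * (confusion + confusion.T)
    np.fill_diagonal(A, 0.0)
    # Normalized graph Laplacian: L = I - D^-1/2 A D^-1/2
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(A.sum(axis=1), 1e-12))
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    # Embed each category with the eigenvectors of the smallest eigenvalues.
    _, vecs = np.linalg.eigh(L)          # eigh returns ascending eigenvalues
    emb = vecs[:, :n_clusters]
    # Tiny deterministic k-means on the embedding (farthest-point init).
    centers = emb[[0]]
    for _ in range(1, n_clusters):
        dist = ((emb[:, None] - centers[None]) ** 2).sum(-1).min(axis=1)
        centers = np.vstack([centers, emb[np.argmax(dist)]])
    for _ in range(100):
        labels = ((emb[:, None] - centers[None]) ** 2).sum(-1).argmin(axis=1)
        new_centers = np.vstack([emb[labels == k].mean(axis=0)
                                 for k in range(n_clusters)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy confusion matrix: classes {0,1,2} and {3,4,5} are mutually confused.
block = np.full((3, 3), 10.0)
toy_confusion = np.block([[block, np.full((3, 3), 1.0)],
                          [np.full((3, 3), 1.0), block]])
labels = cluster_fine_categories(toy_confusion, 2)
```

On CIFAR100 the pretraining script above uses 9 such clusters of the 100 fine classes.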
Testing:
  • Get the preprocessed CIFAR100 training and testing data.
    • command: cd ./data/cifar100; ./get_cifar100_float.sh
  • Create the training and testing LMDB databases
    • command: ./examples/cifar100/create_cifar100_float.sh
  • Use a pretrained HD-CNN to do single-view testing.
    • command: ./examples/cifar100/test_hdcnn_cifar100_NIN_float_crop_v2_9clusters_v0.0.sh
  • Use the ipython notebook below to perform single-view & 10-view testing
    • ipython notebook: ./examples/cifar100/hdcnn/python/cifar100_classification.ipynb
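The 10-view test in the notebook follows the standard multi-view protocol: four corner crops plus the center crop, each with its horizontal mirror, with the class probabilities averaged over all ten views. A minimal numpy sketch (the dummy net and the crop size 24 are illustrative stand-ins, not the notebook's actual settings):

```python
import numpy as np

def ten_view_crops(img, crop):
    """The 10 standard test views: 4 corner crops + the center crop,
    each paired with its horizontal mirror."""
    h, w = img.shape[:2]
    offsets = [(0, 0), (0, w - crop), (h - crop, 0), (h - crop, w - crop),
               ((h - crop) // 2, (w - crop) // 2)]
    views = []
    for y, x in offsets:
        v = img[y:y + crop, x:x + crop]
        views.append(v)
        views.append(v[:, ::-1])   # horizontal mirror
    return np.stack(views)

def ten_view_predict(predict_fn, img, crop):
    """Average the class probabilities over all 10 views."""
    probs = np.stack([predict_fn(v) for v in ten_view_crops(img, crop)])
    return probs.mean(axis=0)

# Toy demo: a dummy 'net' that returns a fixed 100-class distribution.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))                    # CIFAR100-sized input
dummy_net = lambda view: np.full(100, 1.0 / 100)  # stand-in forward pass
avg_probs = ten_view_predict(dummy_net, image, crop=24)
```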

Experiments on ImageNet (ImageNet-NIN building block net)

For experiments on ImageNet, I suggest rebuilding Caffe after changing "USE_CUDNN := 0" to "USE_CUDNN := 1" in Makefile.config. Using cuDNN reduces the memory footprint of the Convolution layer.

Training:
  • Split the 1.2M-image training set into a 'train_train' training set (1.1M images) and a 'train_val' validation set (100K images). The images in the 'train_train' and 'train_val' sets are listed in 'train_train.txt' and 'train_val.txt' in the folder 'data/ilsvrc12/'. An ImageNet-NIN net is trained on the 'train_train' set and then evaluated on the 'train_val' validation set. I extract the 1000-class probabilities on the validation set and save them into a numpy file. Run the script below to get it.
    • script: ./examples/imagenet/hdcnn/NIN/get_train_val_pred_prob.sh
  • Open the ipython notebook below and execute the cells. It generates net definition files for pretraining the rear layers of the fine category components, as well as the net definition file of the final HD-CNN.
    • ipython notebook: './examples/imagenet/hdcnn/python/imagenet_NIN.ipynb'
  • Pretrain the rear layers of the fine category components
    • command: ./examples/imagenet/get_nin_imagenet_pretrained_net.sh ; ./examples/imagenet/finetune_nin_imagenet_89clusters_cluster_confusion_mat_v2.0.sh
  • Optionally, finetune the HD-CNN. I usually don't finetune the HD-CNN here as it's too slow. The main purpose of running the finetuning script is to save a single checkpoint file for the HD-CNN at iteration 0 (i.e. before finetuning begins).
    • command: ./examples/imagenet/finetune_hdcnn_nin_imagenet_89clusters_v2.0.sh
  • Use the ipython notebook below to perform single-view & 10-view testing
    • ipython notebook: ./examples/imagenet/hdcnn/python/imagenet_NIN_classification.ipynb
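The HD-CNN assembled in the steps above forms its final prediction by probabilistic averaging, as described in the paper: each fine category component's class probabilities are weighted by the coarse category probabilities. A numpy sketch of that combination (function and variable names are illustrative):

```python
import numpy as np

def hdcnn_predict(coarse_probs, component_probs):
    """Final HD-CNN prediction by probabilistic averaging:
        p(y|x) = sum_k p(coarse k | x) * p(y | x, component k)
    coarse_probs:    shape (K,)   -- output of the coarse category net
    component_probs: shape (K, C) -- outputs of the K fine category components
    """
    return (coarse_probs[:, None] * component_probs).sum(axis=0)

# Toy demo with K=2 components over C=4 fine classes.
coarse = np.array([0.7, 0.3])
components = np.array([[0.5, 0.5, 0.0, 0.0],
                       [0.0, 0.0, 0.5, 0.5]])
final = hdcnn_predict(coarse, components)   # [0.35, 0.35, 0.15, 0.15]
```

Because each row of component_probs and the coarse weights are distributions, the combined vector is again a valid probability distribution.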

Testing:

To reproduce the results on ImageNet dataset using ImageNet-NIN as the building block net, follow the steps below.
  • Create the training and testing LMDB databases
    • command: ./examples/imagenet/create_imagenet_encoded.sh
  • Get the trained HD-CNN model file (.caffemodel)
    • command: cd models/nin_imagenet/89clusters/89clusters_v2.0; ./get_trained_model.sh
  • Use the trained HD-CNN to do single-view testing
    • command: ./examples/imagenet/test_hdcnn_nin_imagenet_89clusters_v2.0.sh

Experiments on ImageNet (ImageNet-VGG-16-layer building block net)

Testing:

To reproduce the results on the ImageNet dataset using ImageNet-VGG-16-layer as the building block net, follow the steps below.
  • Get the trained HD-CNN model files (.caffemodel)
    • command: cd models/VGG_ILSVRC_16_layers/211839e770f7b538e2d8/84clusters/84clusters_v2.0; ./get_trained_models.sh
  • Do single-scale single-view testing
    • command: ./examples/imagenet/test_hdcnn_VGG_16_layer_84clusters_v2.0.sh
    • note: the single scale can be changed by modifying the values of 'resize_short_side_min' and 'resize_short_side_max' on lines 40-41 of the net definition file 'hdcnn_train_val.prototxt'. Currently both are set to 384, which means the single scale is 384.
  • Do single-scale dense testing
    • command: ./examples/imagenet/test_hdcnn_VGG_16_layer_84clusters_v2.0_dense_testing.sh
    • note: the single scale can be changed by choosing a different net definition file. The available definition files are 'hdcnn_train_val_dense_testing_scale{256,384,512}.prototxt'.
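Assuming the resize settings above scale the shorter image side to the configured value while preserving the aspect ratio (a single fixed scale when min == max), the resulting input dimensions can be computed with a small helper (the function name is hypothetical, not part of the codebase):

```python
def short_side_resize_dims(height, width, short_side):
    """Output (height, width) after scaling the shorter image side to
    `short_side` while preserving the aspect ratio -- the behaviour the
    resize_short_side_min/max settings control when min == max."""
    if height <= width:
        return short_side, round(width * short_side / height)
    return round(height * short_side / width), short_side

# e.g. a 480x640 image at the default single scale of 384:
dims = short_side_resize_dims(480, 640, 384)   # -> (384, 512)
```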

References

HD-CNN is released under the BSD 2-Clause license, the same one used by the original Caffe. If you use the HD-CNN code, please kindly cite the paper below.

@inproceedings{yanhd,
  title={HD-CNN: Hierarchical Deep Convolutional Neural Network for Large Scale Visual Recognition},
  author={Yan, Zhicheng and Zhang, Hao and Piramuthu, Robinson and Jagadeesh, Vignesh and DeCoste, Dennis and Di, Wei and Yu, Yizhou},
  booktitle={ICCV'15: Proc. IEEE 15th International Conf. on Computer Vision},
  year={2015}
}
