Code

For our experiments we use the NVIDIA DIGITS web application (Link to DIGITS), running on top of the Caffe deep learning framework (Link to Caffe).

For the data-augmentation layer we employ a two-step process. We first extract object crops from the original images using the bounding-box annotations from ImageNet, creating an object dataset. We then provide a Python layer for DIGITS that performs on-line data augmentation on the new dataset.

Crop Extractor

Link to github

Usage. The generate_crop_dataset function takes 5 arguments:

annotations_folder: folder in which the ImageNet bounding-box annotations are stored.
images_folder: root folder of the ImageNet dataset.
target_folder: target root folder. The new dataset will be created in this folder.
dataset_synsets_file: text file listing the synsets to extract crops from, one synset per line.
done_synsets_dump_file: dump file recording the synsets already processed, so that an interrupted run can be resumed.
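
As an illustration of the first step, the sketch below reads the bounding boxes from one ImageNet annotation file, which uses the PASCAL VOC XML layout. It is not the code from the repository: the function name parse_bounding_boxes and the SAMPLE annotation are hypothetical, and the real extractor may read the files differently.

```python
import xml.etree.ElementTree as ET


def parse_bounding_boxes(xml_text):
    """Return a list of (synset, xmin, ymin, xmax, ymax) tuples.

    Hypothetical helper, assuming the PASCAL VOC layout used by the
    ImageNet bounding-box annotations: one <object> element per box,
    with a <name> (the synset id) and a <bndbox> child.
    """
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        synset = obj.findtext("name")
        bndbox = obj.find("bndbox")
        if synset is None or bndbox is None:
            continue  # skip malformed entries
        # Assumes all four coordinate tags are present in <bndbox>.
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((synset,) + coords)
    return boxes


# Made-up annotation, used only to exercise the helper.
SAMPLE = """
<annotation>
  <folder>n02084071</folder>
  <filename>n02084071_1</filename>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object>
    <name>n02084071</name>
    <bndbox>
      <xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax>
    </bndbox>
  </object>
</annotation>
"""

boxes = parse_bounding_boxes(SAMPLE)
```

Each returned box can then be used to cut the corresponding region out of the image and save it under target_folder, grouped by synset.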

Random Zooming Layer

Link to github

This DIGITS Python layer takes a randomly sized patch of the input image, then uses OpenCV to resize the patch to the expected dimensions.

The zooming range can be changed by altering the crop_min_size and crop_max_size attributes of the layer class.

The output_dim attribute of the layer class defines the resizing dimensions.
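
The patch selection can be sketched as follows. This is a minimal illustration, not the layer from the repository: the function random_patch_box is hypothetical, and it assumes that crop_min_size and crop_max_size are side lengths in pixels and that the patch is square.

```python
import random


def random_patch_box(height, width, crop_min_size, crop_max_size, rng=random):
    """Pick a random square patch inside a height x width image.

    Hypothetical helper. Returns (top, left, size), where size is drawn
    uniformly from [crop_min_size, crop_max_size] after clamping both
    bounds to the image, and the patch position is drawn uniformly
    among the positions that keep the patch inside the image.
    """
    max_size = min(crop_max_size, height, width)
    min_size = min(crop_min_size, max_size)
    size = rng.randint(min_size, max_size)  # inclusive on both ends
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    return top, left, size


# Example: one patch for a 256 x 256 input, using a seeded generator.
top, left, size = random_patch_box(256, 256, 128, 224, random.Random(0))
```

With the image held as a NumPy array, the patch image[top:top + size, left:left + size] would then be resized with OpenCV, for example cv2.resize(patch, (output_dim, output_dim)) if output_dim is taken to be a single side length (an assumption here). Smaller patches produce a stronger zoom once resized.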
