For our experiments we use the NVIDIA DIGITS webapp (Link to DIGITS) which we run on top of the Caffe Deep Learning Framework (Link to Caffe).
For the data-augmentation layer we employ a two-step process. We first extract object crops from the original images using the bounding-box annotations from ImageNet, creating an object dataset. We then provide a Python layer for DIGITS that performs on-line data augmentation on the new dataset.
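The first step can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes the ImageNet annotations are in their usual PASCAL VOC style XML form, and the helper names parse_bounding_boxes and clamp_box are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_bounding_boxes(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples.

    Assumes a PASCAL VOC style annotation, where each <object> element
    holds a <name> and a <bndbox> with xmin/ymin/xmax/ymax children.
    """
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        if bndbox is None:
            continue  # object without a box: nothing to crop
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label,) + coords)
    return boxes


def clamp_box(box, width, height):
    """Clip (xmin, ymin, xmax, ymax) to the image; None if nothing is left."""
    xmin, ymin, xmax, ymax = box
    xmin, ymin = max(0, xmin), max(0, ymin)
    xmax, ymax = min(width, xmax), min(height, ymax)
    if xmax <= xmin or ymax <= ymin:
        return None
    return xmin, ymin, xmax, ymax
```

With an image loaded as an array img, each clamped box would then give one object crop as img[ymin:ymax, xmin:xmax].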
Usage. The generate_crop_dataset function takes 5 arguments:
This DIGITS Python layer takes a randomly sized patch of the input image, then uses OpenCV to resize the patch to the expected dimensions.
The zooming range can be changed by altering the crop_min_size and crop_max_size attributes of the layer class.
The output_dim attribute of the layer class defines the resizing dimensions.
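The behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the layer itself: it assumes square patches, that crop_min_size and crop_max_size are given in pixels, and that output_dim is a single side length. The function names are hypothetical, and a dependency-free nearest-neighbour resize stands in for the OpenCV resize the layer uses.

```python
import random


def sample_patch_box(height, width, crop_min_size, crop_max_size, rng=random):
    """Pick a random square patch; returns (top, left, side) in pixels.

    Assumption: crop sizes are pixel side lengths, capped to fit the image.
    """
    max_side = min(crop_max_size, height, width)
    min_side = min(crop_min_size, max_side)
    side = rng.randint(min_side, max_side)
    top = rng.randint(0, height - side)
    left = rng.randint(0, width - side)
    return top, left, side


def resize_nearest(patch, output_dim):
    """Nearest-neighbour resize of a list of rows to output_dim x output_dim.

    Stand-in for the OpenCV resize, used here only to avoid dependencies.
    """
    src_h, src_w = len(patch), len(patch[0])
    return [
        [patch[r * src_h // output_dim][c * src_w // output_dim]
         for c in range(output_dim)]
        for r in range(output_dim)
    ]


def random_zoom(image, crop_min_size, crop_max_size, output_dim, rng=random):
    """Crop a random patch from image and resize it to output_dim."""
    height, width = len(image), len(image[0])
    top, left, side = sample_patch_box(
        height, width, crop_min_size, crop_max_size, rng
    )
    patch = [row[left:left + side] for row in image[top:top + side]]
    return resize_nearest(patch, output_dim)
```

A small crop_min_size lets the layer zoom in strongly on part of the object, while a crop_max_size close to the image size keeps most of the object in view. With NumPy arrays, the resize step would typically be a call such as cv2.resize(patch, (output_dim, output_dim)).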