We used TensorFlow via its Keras API to implement the project.
We chose VGG16, which is pre-trained on the ImageNet database, as the base convolutional neural network for this project. VGG16 takes input images of size 224x224.
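Instantiating the pre-trained base can be sketched as follows; this is a minimal illustration using the standard `keras.applications` loader, not the project's exact code.

```python
# Sketch: load VGG16 pre-trained on ImageNet.
from tensorflow.keras.applications import VGG16

# include_top=False drops VGG16's original fully-connected classifier
# so a custom top can be attached; inputs are 224x224 RGB images.
base_model = VGG16(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))
base_model.summary()
```

With `include_top=False`, the network outputs a 7x7x512 feature map that the new classifier layers consume.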
The pre-processing part of the project addresses properties such as rotation, shift, and skew invariance; rotation and resizing are handled before images are fed to the pre-trained model. Since we modify only the last few layers of the convolutional neural network, the retrained portion does not depend on these three image properties, which helps ensure the network does not overfit.
Data augmentation, applied through a number of random transformations, helps avoid over-fitting and builds a more generalized model.
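A minimal sketch of such augmentation with Keras' `ImageDataGenerator`; the specific transformation ranges below are illustrative assumptions, not values from the project.

```python
# Sketch: random rotations, shifts, and shear (skew) via Keras.
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np

datagen = ImageDataGenerator(
    rotation_range=20,       # random rotations up to 20 degrees
    width_shift_range=0.1,   # random horizontal shifts
    height_shift_range=0.1,  # random vertical shifts
    shear_range=0.1,         # shear (skew) transformations
    horizontal_flip=True,
)

# Each pass over `flow` yields a freshly transformed batch,
# so the model rarely sees the exact same image twice.
images = np.random.rand(4, 224, 224, 3).astype("float32")
batch = next(datagen.flow(images, batch_size=4, shuffle=False))
print(batch.shape)  # (4, 224, 224, 3)
```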
We then "fine-tune" the last convolutional block of the VGG16 model alongside the top-level classifier. Fine-tuning consists of starting from a trained network and re-training it on a new dataset using very small weight updates. In our case, this can be done in three steps:
We choose to fine-tune only the last block, consisting of two fully-connected layers and a logit layer, rather than the entire network. This lets us apply transfer learning to a different dataset while avoiding overfitting. Since our pre-trained network was trained on a general dataset, its low-level blocks have learned general features, so we freeze these blocks and train only the top layers. The top layers of the CNN model are fine-tuned for specialized features, so they are the only part of the model we train.
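The freezing step above can be sketched as follows. The layer names come from the standard Keras VGG16 model; the sizes of the added dense layers are assumptions for illustration.

```python
# Sketch: freeze everything below the last convolutional block
# (block5) and attach a small classifier on top.
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

base = VGG16(weights="imagenet", include_top=False,
             input_shape=(224, 224, 3))

# Only the block5 layers remain trainable; all earlier (general,
# low-level) blocks are frozen.
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5")

# Hypothetical top: a fully-connected layer plus a logit layer.
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
```

Freezing a layer simply excludes its weights from gradient updates, so the ImageNet-learned features in the lower blocks are preserved.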
For the given project, we fine-tune the network using the SGD (stochastic gradient descent) optimizer with a very small learning rate, to keep the magnitude of the weight updates as low as possible. This helps retain the previously learned features in the frozen part of the network.
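Compiling with a small-learning-rate SGD can be sketched as below; the stand-in model, the exact rate (1e-4), and the momentum value are assumptions for illustration, not the project's actual settings.

```python
# Sketch: SGD with a very small learning rate so weight updates stay
# small and pre-trained features are only gently adjusted.
from tensorflow.keras import layers, models
from tensorflow.keras.optimizers import SGD

# Hypothetical stand-in model; in the project this would be the
# fine-tuned VGG16 with its custom top.
model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer=SGD(learning_rate=1e-4, momentum=0.9),
              loss="binary_crossentropy",
              metrics=["accuracy"])
```

An adaptive optimizer such as Adam could take larger, less predictable steps early in training, which risks destroying the pre-trained features; plain SGD with a tiny learning rate keeps each update's magnitude bounded.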