The purpose of transfer learning is to fine-tune a pretrained convolutional neural network (CNN) to perform classification on our new collection of images.
According to MathWorks, "AlexNet has been trained on over a million images and can classify images into 1000 object categories (such as keyboard, coffee mug, pencil, and many animals)." However, we are looking for a classification accuracy of over 99% on a given set of objects. Transfer learning is commonly used in deep learning applications: we can take a pretrained network and use it as a starting point to learn a new task. Fine-tuning a network with transfer learning is usually much faster and easier than training a network from scratch with randomly initialized weights, and we can quickly transfer the learned features to a new task using a smaller number of training images.
In order to perform transfer learning, we have to create an image datastore. Taking thousands of pictures by hand would be time consuming and inefficient. Instead, this can easily be done by using our smartphone as a webcam and the Matlab snapshot function to acquire one image frame at a time from an IP camera inside a for loop. In addition, imwrite(A,filename) writes image data A to the file specified by filename, inferring the file format from the extension. Once the IP camera is connected, we can take continuous pictures automatically with a simple looping Matlab script.
The camera object cam is created with the ipcam function, using the URL of the IP camera. The URL must point to the camera's video stream and typically ends in video.mjpeg or mjpg.cgi, as defined by the camera vendor. For example: cam = ipcam('http://172.28.17.104/video/mjpg.cgi')
If the camera requires authentication, Username, a character vector representing the user name for the IP camera, must be the second argument, and Password, a character vector representing the password for the IP camera, must be the third argument. For example: cam = ipcam('http://172.28.17.193/video.mjpeg', 'UserName', 'PassWord')
Using the Matlab script shown below, we created a datastore of 1000 images for each object and stored the images in designated file folders. We chose to take the pictures at an interval of one second, and the pictures are stored in the specified folder path.
Note: our camera (an Android phone) does not require user authentication, so our code does not include the user name and password arguments. If your camera does, see the vendor manual and create the object with the user name and password arguments.
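Before running the capture loop, it can help to confirm that the camera stream is actually reachable. A minimal check, assuming the MATLAB Support Package for IP Cameras is installed and using our example stream URL, might look like:

```matlab
% Minimal connectivity check for the IP camera stream.
% Assumes the MATLAB Support Package for IP Cameras is installed;
% replace the example URL with your camera's stream address.
cam = ipcam('http://172.28.17.193/video.mjpeg');
preview(cam);        % open a live preview window of the stream
im = snapshot(cam);  % grab a single frame
size(im)             % display the frame dimensions (height, width, channels)
closePreview(cam);   % close the preview window
clear cam            % release the camera connection
```

If preview shows a live image and snapshot returns a frame, the URL and network path are correct and the capture loop below should run as expected.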
%% Automatic picture taker
clear all
cam = ipcam('http://172.28.17.193/video.mjpeg'); % replace the example IP address
partName = 'FK-testImg%d.png'; % base file name
figure
for i = 1:1000 % take a thousand pictures
    im = snapshot(cam); % take a picture
    image(im); % show the picture
    im = imresize(im,[227 227]); % picture size for the AlexNet CNN
    path = sprintf(partName,i); % numbered file name
    realPath = ['C:\xxx\xxx\xxxx\ImageCapturingTest\' path]; % replace the example file path
    imwrite(im, realPath); % write the image to the folder
    pause(1); % one picture every second
end
%%
After creating the datastore of images, we used the following code to perform successive transfer learning and validation tests on our objects. This code is a modified version of the example code from the MathWorks website.
%{
Team OCS Transfer Learning
Based on Mathworks example codes
%}
imds = imageDatastore('NewImageFiles', ...
'IncludeSubfolders',true, ...
'LabelSource','foldernames');
[imdsTrain,imdsValidation] = splitEachLabel(imds,0.7);
FKnet = googlenet;
analyzeNetwork(FKnet); % inspect the network architecture
FKnet.Layers(1) % display the input layer
inputSize = FKnet.Layers(1).InputSize;
if isa(FKnet,'SeriesNetwork')
lgraph = layerGraph(FKnet.Layers);
else
lgraph = layerGraph(FKnet);
end
[learnableLayer,classLayer] = findLayersToReplace(lgraph); % support function from the MathWorks example
[learnableLayer,classLayer] % display the layers to be replaced
numClasses = numel(categories(imdsTrain.Labels));
%%
if isa(learnableLayer,'nnet.cnn.layer.FullyConnectedLayer')
newLearnableLayer = fullyConnectedLayer(numClasses, ...
'Name','new_fc', ...
'WeightLearnRateFactor',10, ...
'BiasLearnRateFactor',10);
elseif isa(learnableLayer,'nnet.cnn.layer.Convolution2DLayer')
newLearnableLayer = convolution2dLayer(1,numClasses, ...
'Name','new_conv', ...
'WeightLearnRateFactor',10, ...
'BiasLearnRateFactor',10);
end
%%
lgraph = replaceLayer(lgraph,learnableLayer.Name,newLearnableLayer);
%%
newClassLayer = classificationLayer('Name','new_classoutput');
lgraph = replaceLayer(lgraph,classLayer.Name,newClassLayer);
%%
layers = lgraph.Layers;
connections = lgraph.Connections;
layers(1:10) = freezeWeights(layers(1:10)); % freezeWeights is a support function from the MathWorks example
lgraph = createLgraphUsingConnections(layers,connections); % support function from the MathWorks example
pixelRange = [-30 30];
scaleRange = [0.9 1.1];
imageAugmenter = imageDataAugmenter( ...
'RandXReflection',true, ...
'RandXTranslation',pixelRange, ...
'RandYTranslation',pixelRange, ...
'RandXScale',scaleRange, ...
'RandYScale',scaleRange);
augimdsTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain, ...
'DataAugmentation',imageAugmenter);
augimdsValidation = augmentedImageDatastore(inputSize(1:2),imdsValidation);
options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',6, ...
'InitialLearnRate',1e-4, ...
'Shuffle','every-epoch', ...
'ValidationData',augimdsValidation, ...
'ValidationFrequency',3, ...
'Verbose',false, ...
'Plots','training-progress');
%%
TestNetwork = true;
if TestNetwork
FKnet = trainNetwork(augimdsTrain,lgraph,options);
end
%%
[YPred,probs] = classify(FKnet,augimdsValidation);
accuracy = mean(YPred == imdsValidation.Labels)
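The overall accuracy above averages over all categories, so it does not reveal which object falls short of the 99% target. One possible way to break the result down per class, reusing the YPred and imdsValidation variables from the code above, is:

```matlab
% Per-class validation accuracy, computed from the confusion matrix.
% Uses the YPred and imdsValidation variables from the code above.
classes = categories(imdsValidation.Labels);
C = confusionmat(imdsValidation.Labels, YPred); % rows: true labels, columns: predictions
perClassAcc = diag(C) ./ sum(C,2);              % correct / total for each true class
table(classes, perClassAcc)                     % display accuracy per category
```

Any row of the table below the target threshold identifies the category that needs more (or better) training images.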
%%
% idx = randperm(numel(imdsValidation.Files),8);
% figure
% for i = 1:8
% subplot(4,4,i)
% I = readimage(imdsValidation,idx(i));
% imshow(I)
% label = YPred(idx(i));
% title(string(label) + ", " + num2str(100*max(probs(idx(i),:)),3) + "%");
% end
According to MathWorks, transfer learning is a deep learning technique in which a CNN that has already been trained is used as a starting point to retrain a new model for our problem. Transfer learning lets us train our model with few images by building on a model pretrained on over a million images; by doing so, we reduce both training time and storage requirements. After creating a datastore with five categories (DasaniWaterBottle, FK-3D-Part, IronBottle, NewYorkSoda, and Sunkist) of 44 pictures each, we performed the transfer learning with different training options. Below is one of the best results of the successive tests, along with its confusion matrix. The model shown in figure 2 is retrained with six epochs, a 10^-4 learning rate, and 90 iterations on a single GPU.
This model reached a training accuracy of 100% with a GPU processing time of 8 minutes and 31 seconds. In addition, as shown in figure 2, the confusion matrix contains zero misclassifications. The classification accuracy has also improved: only one object is classified with an accuracy of 98.6%, which is below our target. After the second round of transfer learning, the model therefore exhibits 0% confusion, which is more important than the classification accuracy, and over 99% classification accuracy except for that one instance.
The confusion matrix is plotted using the "plotconfusion" function as shown below, along with the classification code.
%% confusion plot
plotconfusion(imdsValidation.Labels, YPred);
%% Classification
idx = randperm(numel(imdsValidation.Files),9);
figure
for i = 1:9
subplot(3,3,i)
I = readimage(imdsValidation,idx(i));
imshow(I)
label = YPred(idx(i));
title(string(label) + ", " + num2str(100*max(probs(idx(i),:)),3) + "%");
end
%%
The training accuracy of this model has improved to 100%; however, one object is classified with an accuracy of 98.6%, which is below our target. Hence, we still have to retrain and fine-tune our model with more images to improve the classification accuracy. Since the training accuracy (100%) and the confusion matrix are satisfactory, we can keep the current training options as-is, but we have to increase the number of images in our datastore. This way, we can improve the classification accuracy to over 99%.
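Before retraining with a larger datastore, it is worth confirming that the added images keep the five categories balanced, since an over-represented class can bias training. A minimal sketch using countEachLabel, assuming the same 'NewImageFiles' folder layout as above, is:

```matlab
% Check how many images each category contributes before retraining.
% Assumes the same 'NewImageFiles' folder structure used earlier.
imds = imageDatastore('NewImageFiles', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');
tbl = countEachLabel(imds)   % table of labels and image counts
minCount = min(tbl.Count);   % size of the smallest category
% Optionally trim the datastore so every label has the same count:
imds = splitEachLabel(imds,minCount,'randomize');
```

Keeping the per-label counts equal means the retrained model sees each of the five objects equally often, so any remaining accuracy gap reflects the images themselves rather than class imbalance.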