The purpose of transfer learning is to fine-tune a pretrained convolutional neural network (CNN) to perform classification on our new collection of images.
According to MathWorks, "AlexNet has been trained on over a million images and can classify images into 1000 object categories (such as keyboard, coffee mug, pencil, and many animals)." However, we are looking for a classification accuracy of over 99% on a given set of objects. Transfer learning is commonly used in deep learning applications: we can take a pretrained network and use it as a starting point to learn a new task. Fine-tuning a network with transfer learning is usually much faster and easier than training a network from scratch with randomly initialized weights, and we can quickly transfer the learned features to a new task using a smaller number of training images.
In order to perform transfer learning, we have to create an image datastore. Taking thousands of pictures by hand would be time consuming and inefficient. Instead, this can easily be done by using our smartphone as a webcam and the Matlab snapshot function to acquire one image frame at a time from an IP camera inside a for loop. In addition, imwrite(A,filename) writes image data A to the file specified by filename, inferring the file format from the extension. Once the IP camera is connected, we can take continuous pictures automatically with a simple looping Matlab script.
The camera object cam is created with the ipcam function, using the URL of the IP camera. The URL must point to the camera's video stream and typically ends in video.mjpeg or mjpg.cgi, as defined by the camera vendor. For example: cam = ipcam('http://172.28.17.104/video/mjpg.cgi')
If the camera requires authentication, Username, a character vector representing the user name for the IP camera, must be the second argument, and Password, a character vector representing the password for the IP camera, must be the third argument. For example: cam = ipcam('http://172.28.17.193/video.mjpeg', 'UserName', 'PassWord')
Using the Matlab script shown below, we created a datastore of 1000 images for each object and stored the images in designated file folders. We chose to take the pictures at an interval of one second, and the pictures are stored in the specified folder path.
Note: our camera (an Android phone) does not require user authentication, so our code does not include the user name and password arguments. If your camera does, see the vendor manual and create the object with the user name and password arguments.
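Before running the capture loop, it can help to confirm that the camera stream is actually reachable. A minimal check, assuming the MATLAB Support Package for IP Cameras is installed and using our example stream URL, might look like:

```matlab
% Minimal connectivity check for the IP camera stream.
% Assumes the MATLAB Support Package for IP Cameras is installed;
% replace the example URL with your camera's stream address.
cam = ipcam('http://172.28.17.193/video.mjpeg');
preview(cam);        % open a live preview window of the stream
im = snapshot(cam);  % grab a single frame
size(im)             % display the frame dimensions (height, width, channels)
closePreview(cam);   % close the preview window
clear cam            % release the camera connection
```

If preview shows a live image and snapshot returns a frame, the URL and network path are correct and the capture loop below should run as expected.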
%% Automatic picture taker
clear all
cam = ipcam('http://172.28.17.193/video.mjpeg'); % replace the example IP address
partName = 'FK-testImg%d.png'; % base file name
figure
for i = 1:1000 % take a thousand pictures
    im = snapshot(cam); % take a picture
    image(im); % show the picture
    im = imresize(im,[227 227]); % picture size for the AlexNet CNN
    path = sprintf(partName,i); % numbered file name
    realPath = ['C:\xxx\xxx\xxxx\ImageCapturingTest\' path]; % replace the example file path
    imwrite(im, realPath); % write the image to the folder
    pause(1); % one picture every second
end
%%
After creating the datastore of images, we used the following code to perform successive transfer learning and validation tests on our objects. This code is a modified version of the example code from the MathWorks website.
%{
Team OCS Transfer Learning
Based on Mathworks example codes
%}
imds = imageDatastore('NewImageFiles', ...
'IncludeSubfolders',true, ...
'LabelSource','foldernames');
[imdsTrain,imdsValidation] = splitEachLabel(imds,0.7);
FKnet = googlenet;
analyzeNetwork(FKnet); % inspect the network architecture
FKnet.Layers(1) % display the input layer
inputSize = FKnet.Layers(1).InputSize;
if isa(FKnet,'SeriesNetwork')
lgraph = layerGraph(FKnet.Layers);
else
lgraph = layerGraph(FKnet);
end
[learnableLayer,classLayer] = findLayersToReplace(lgraph); % support function from the MathWorks example
[learnableLayer,classLayer] % display the layers to be replaced
numClasses = numel(categories(imdsTrain.Labels));
%%
if isa(learnableLayer,'nnet.cnn.layer.FullyConnectedLayer')
newLearnableLayer = fullyConnectedLayer(numClasses, ...
'Name','new_fc', ...
'WeightLearnRateFactor',10, ...
'BiasLearnRateFactor',10);
elseif isa(learnableLayer,'nnet.cnn.layer.Convolution2DLayer')
newLearnableLayer = convolution2dLayer(1,numClasses, ...
'Name','new_conv', ...
'WeightLearnRateFactor',10, ...
'BiasLearnRateFactor',10);
end
%%
lgraph = replaceLayer(lgraph,learnableLayer.Name,newLearnableLayer);
%%
newClassLayer = classificationLayer('Name','new_classoutput');
lgraph = replaceLayer(lgraph,classLayer.Name,newClassLayer);
%%
layers = lgraph.Layers;
connections = lgraph.Connections;
layers(1:10) = freezeWeights(layers(1:10)); % freezeWeights is a support function from the MathWorks example
lgraph = createLgraphUsingConnections(layers,connections); % support function from the MathWorks example
pixelRange = [-30 30];
scaleRange = [0.9 1.1];
imageAugmenter = imageDataAugmenter( ...
'RandXReflection',true, ...
'RandXTranslation',pixelRange, ...
'RandYTranslation',pixelRange, ...
'RandXScale',scaleRange, ...
'RandYScale',scaleRange);
augimdsTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain, ...
'DataAugmentation',imageAugmenter);
augimdsValidation = augmentedImageDatastore(inputSize(1:2),imdsValidation);
options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',6, ...
'InitialLearnRate',1e-4, ...
'Shuffle','every-epoch', ...
'ValidationData',augimdsValidation, ...
'ValidationFrequency',3, ...
'Verbose',false, ...
'Plots','training-progress');
%%
TestNetwork = true;
if TestNetwork
FKnet = trainNetwork(augimdsTrain,lgraph,options);
end
%%
[YPred,probs] = classify(FKnet,augimdsValidation);
accuracy = mean(YPred == imdsValidation.Labels)
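The overall accuracy above averages over all categories, so it does not reveal which object falls short of the 99% target. One possible way to break the result down per class, reusing the YPred and imdsValidation variables from the code above, is:

```matlab
% Per-class validation accuracy, computed from the confusion matrix.
% Uses the YPred and imdsValidation variables from the code above.
classes = categories(imdsValidation.Labels);
C = confusionmat(imdsValidation.Labels, YPred); % rows: true labels, columns: predictions
perClassAcc = diag(C) ./ sum(C,2);              % correct / total for each true class
table(classes, perClassAcc)                     % display accuracy per category
```

Any row of the table below the target threshold identifies the category that needs more (or better) training images.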
%%
% idx = randperm(numel(imdsValidation.Files),8);
% figure
% for i = 1:8
% subplot(4,4,i)
% I = readimage(imdsValidation,idx(i));
% imshow(I)
% label = YPred(idx(i));
% title(string(label) + ", " + num2str(100*max(probs(idx(i),:)),3) + "%");
% end
According to MathWorks, transfer learning is a deep learning technique in which a CNN that has already been trained is used as a starting point to retrain a new model for our problem. Transfer learning lets us train our model with few images by building on a model pretrained on over a million images; by doing so, we reduce both training time and storage requirements. After creating a datastore with five categories (DasaniWaterBottle, FK-3D-Part, IronBottle, NewYorkSoda, and Sunkist) of 44 pictures each, we performed the transfer learning with different training options. Below is one of the best results of the successive tests, along with its confusion matrix. The model shown in figure 2 is retrained with six epochs, a 10^-4 learning rate, and 90 iterations on a single GPU.
This model reached a training accuracy of 100% with a GPU processing time of 8 minutes and 31 seconds. In addition, as shown in figure 2, the confusion matrix contains zero misclassifications. The classification accuracy has also improved: only one object is classified with an accuracy of 98.6%, which is below our target. After the second round of transfer learning, the model therefore exhibits 0% confusion, which is more important than the classification accuracy, and over 99% classification accuracy except for that one instance.
The confusion matrix is plotted using the "plotconfusion" function as shown below, along with the classification code.
%% confusion plot
plotconfusion(imdsValidation.Labels, YPred);
%% Classification
idx = randperm(numel(imdsValidation.Files),9);
figure
for i = 1:9
subplot(3,3,i)
I = readimage(imdsValidation,idx(i));
imshow(I)
label = YPred(idx(i));
title(string(label) + ", " + num2str(100*max(probs(idx(i),:)),3) + "%");
end
%%
The training accuracy of this model has improved to 100%; however, one object is classified with an accuracy of 98.6%, which is below our target. Hence, we still have to retrain and fine-tune our model with more images to improve the classification accuracy. Since the training accuracy (100%) and the confusion matrix are satisfactory, we can keep the current training options as-is, but we have to increase the number of images in our datastore. This way, we can improve the classification accuracy to over 99%.
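Before retraining with a larger datastore, it is worth confirming that the added images keep the five categories balanced, since an over-represented class can bias training. A minimal sketch using countEachLabel, assuming the same 'NewImageFiles' folder layout as above, is:

```matlab
% Check how many images each category contributes before retraining.
% Assumes the same 'NewImageFiles' folder structure used earlier.
imds = imageDatastore('NewImageFiles', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');
tbl = countEachLabel(imds)   % table of labels and image counts
minCount = min(tbl.Count);   % size of the smallest category
% Optionally trim the datastore so every label has the same count:
imds = splitEachLabel(imds,minCount,'randomize');
```

Keeping the per-label counts equal means the retrained model sees each of the five objects equally often, so any remaining accuracy gap reflects the images themselves rather than class imbalance.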