This project, Object Classification and Sorting (OCS) with Deep Learning and Robotic Manipulator, is intended to classify objects by category and sort them accordingly. The project uses a deep learning algorithm, a laptop computer, a web camera, an APC220 RF communication module, a microcontroller, a motor shield, and a robotic manipulator, as shown in figure 1a. The main steps are:
1- Classify object categories with MATLAB deep learning and a pre-trained convolutional neural network (CNN).
2- Improve the classification accuracy with transfer learning.
3- Test the classification accuracy on selected objects.
4- Send the result from MATLAB to the robotic manipulator over serial communication.
5- Sort (pick and place) objects into their designated areas with the robotic manipulator.
6- Reject unknown objects (the rejection mechanism is not shown in the top-level architecture).
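The send, sort, and reject steps above can be pictured as one small decision function. The following is a minimal Python sketch, not the project's MATLAB code. The label-to-bin table, the `MIN_SCORE` threshold, the reject code, and the `BIN:<n>` message format are all hypothetical placeholders chosen for illustration.

```python
# Hypothetical mapping from class label to sorting bin. The labels and bin
# numbers are placeholders, not the project's actual configuration.
BIN_FOR_LABEL = {"water bottle": 1, "keyboard": 2, "mouse": 3}
REJECT_BIN = 0      # assumed code meaning "reject this object"
MIN_SCORE = 0.80    # assumed minimum prediction score to trust a label


def choose_bin(label: str, score: float) -> int:
    """Return the bin for a prediction, or REJECT_BIN when it is not trusted."""
    if score < MIN_SCORE:
        return REJECT_BIN
    # Labels without a designated bin are treated as unknown objects.
    return BIN_FOR_LABEL.get(label, REJECT_BIN)


def encode_command(bin_id: int) -> bytes:
    """Encode the bin as one ASCII line (assumed wire format)."""
    return f"BIN:{bin_id}\n".encode("ascii")


if __name__ == "__main__":
    for label, score in [("water bottle", 0.988), ("apple", 0.75)]:
        print(label, encode_command(choose_bin(label, score)))
```

The resulting bytes would then be written to the serial port that feeds the RF module. A real threshold would have to be tuned against the accuracy tests on the selected objects.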
In this project, the core element is the MATLAB deep learning algorithm, which teaches the computer to do what humans do naturally: learn from repeated training and from experience. Deep learning can be applied to many engineering problems, such as object classification and sorting, motion detection, autonomous driving, lane detection, pedestrian detection, and autonomous parking. However, this project focuses only on the object classification and sorting application of deep learning.
To understand the basics of deep learning, we took a self-paced online course called “Deep Learning Onramp” from MathWorks. Then, by customizing the deep learning code provided in the course and using the pre-trained GoogLeNet convolutional neural network (CNN), we were able to classify basic objects such as a water bottle, a keyboard, and a mouse into their corresponding categories, as shown in figure 2.
The MATLAB code in figure 2 shows how to connect to a camera object, load a pre-trained CNN, continuously take pictures, resize each picture, and classify the object with the selected CNN, in our case GoogLeNet. With pre-trained CNNs such as GoogLeNet or AlexNet, some basic objects, like a water bottle, can be classified easily and with a high prediction score, as shown in figure 2. In our case, the water bottle is predicted with a score of 98.8%. However, when we try to classify an apple, the highest prediction score is 75%, and the network often confuses the apple with a pear or a pomegranate.
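To make this kind of confusion visible, it helps to look at the few highest-scoring classes rather than only the winner. The Python sketch below is an illustration, not the MATLAB code from figure 2. Only the 0.75 apple score comes from the text. The other scores are made-up placeholders, and the `margin` check is an assumed heuristic.

```python
def top_k(scores: dict[str, float], k: int = 3) -> list[tuple[str, float]]:
    """Return the k (label, score) pairs with the highest scores."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


def is_ambiguous(scores: dict[str, float], margin: float = 0.7) -> bool:
    """Flag a prediction whose best score leads the runner-up by less than margin."""
    ranked = top_k(scores, 2)
    if len(ranked) < 2:
        return False
    return ranked[0][1] - ranked[1][1] < margin


if __name__ == "__main__":
    # Placeholder scores for illustration only; just the 0.75 is from the text.
    apple_scores = {"apple": 0.75, "pear": 0.15, "pomegranate": 0.08, "orange": 0.02}
    for label, score in top_k(apple_scores):
        print(f"{label}: {score:.2f}")
    print("ambiguous:", is_ambiguous(apple_scores))
```

A prediction flagged this way is a natural candidate for retraining with more images of that class.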
This prediction confusion can be mitigated by creating a new datastore of images related to our problem and re-training the CNN with the new images (a process known as transfer learning). Before performing transfer learning, we need to select a CNN that fits our needs. MathWorks provides the following chart, which compares the validation accuracy of different CNNs against the GPU time required to make a prediction.
When choosing a CNN, there is a trade-off between network accuracy, speed, and size. Note: if your computer doesn't have a GPU, transfer learning on a CPU is time-consuming and tedious, and prediction time on a CPU is also very high. Using a Graphics Processing Unit (GPU) instead of a CPU makes transfer learning much faster.
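Before re-training, the new image collection is usually divided into a training set and a validation set, with every class represented in both. The Python sketch below illustrates a per-label split, similar in spirit to splitting an image datastore by label in MATLAB. The file names, the 80/20 ratio, and the function name are assumptions for illustration, not the project's settings.

```python
import random
from collections import defaultdict


def split_each_label(samples, train_fraction=0.8, seed=0):
    """Split (path, label) pairs into train/validation lists, label by label."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, validation = [], []
    for label in sorted(by_label):
        paths = by_label[label][:]
        rng.shuffle(paths)
        n_train = round(train_fraction * len(paths))
        train += [(p, label) for p in paths[:n_train]]
        validation += [(p, label) for p in paths[n_train:]]
    return train, validation


if __name__ == "__main__":
    # Hypothetical file names standing in for the new image collection.
    samples = [(f"apple_{i}.jpg", "apple") for i in range(10)]
    samples += [(f"pear_{i}.jpg", "pear") for i in range(10)]
    train, validation = split_each_label(samples)
    print(len(train), "training images,", len(validation), "validation images")
```

Splitting per label keeps a rare class from ending up entirely in one set, which would make the validation accuracy misleading.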