GOH JUN HUI
Background of Our Project
The objective of our project was to make a robotic arm perform a daily task. The task our group chose was sorting trash. In 2018, the Let's Do It Foundation started a project to create an AI that can detect trash using computer vision. Their AI, WADE (Waste Detector), later won them the 2018 UNESCO-Japan Prize on Education for Sustainable Development. The tool can identify the position of waste in an image as well as the material the waste is made of, and it does so faster and more accurately than humans. Based on this project, we decided to do something similar but implement it on a robotic arm. In many countries, citizens are required to sort their trash before disposing of it, yet many people do not because they find it mundane and troublesome. A robotic arm that sorts trash would take over this chore and make the recycling process easier. We planned to achieve this by combining computer vision with an object detection AI.
Elaboration (Process)
During the first week of the attachment, we watched tutorials on Udemy for basic Python, read tutorials on OpenCV and read the manual for the Universal Robots (UR) arm, since we would need these Python libraries in our code. We started by recapping basic Python syntax, as the last time we coded in Python was a long time ago. OpenCV is a Python library used for computer vision and image processing. We also needed the Python library urx, which provides functions specifically for moving the UR robot; we needed it to move the robot the way we wanted.
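A minimal sketch of how urx is typically used to connect to the arm and move it (the IP address and the poses below are placeholder assumptions, not the values from our setup):

import urx

# Connect to the UR controller over the network (placeholder IP)
robot = urx.Robot("192.168.1.100")
try:
    # movej moves through joint space; six joint angles in radians
    home = [0, -1.57, 1.57, -1.57, -1.57, 0]
    robot.movej(home, acc=0.5, vel=0.5)
    # movel moves the tool linearly to a pose (x, y, z, rx, ry, rz)
    robot.movel((0.3, -0.2, 0.25, 0, 3.14, 0), acc=0.2, vel=0.2)
finally:
    robot.close()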
In the second week, we started to experiment with the UR robot. We wrote code that makes the robot do a simple pick-and-place task. At the start of the project, I wanted to do object detection by comparing two images, an initial image and the current image: if there is a difference between them, it means an object has been detected. However, this did not work. I then started to research ways to do object detection and realised that it had to be done with a trained AI model. I found that I could use an open-source AI, YOLOv3. I chose YOLOv3 because it was one of the fastest object detection AIs available to me, and I needed speed because I wanted the detection to run in real time. However, after trying YOLOv3, I realised that even though the algorithm is fast, my computer was too slow and each detection took a few seconds, so I dropped the idea of real-time object detection.
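The image-comparison idea looked roughly like the sketch below (a reconstruction of the approach rather than our exact code; the camera index and threshold values are assumptions):

import cv2

cap = cv2.VideoCapture(0)  # default camera (assumed index)

_, initial = cap.read()    # reference frame, taken before any object is placed
initial_grey = cv2.cvtColor(initial, cv2.COLOR_BGR2GRAY)

input("place an object, then press Enter")  # stand-in for whatever trigger is used

_, current = cap.read()
current_grey = cv2.cvtColor(current, cv2.COLOR_BGR2GRAY)
cap.release()

# Pixel-wise absolute difference between the two frames
diff = cv2.absdiff(initial_grey, current_grey)
# Keep only differences above an (assumed) noise threshold
_, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)

# If enough pixels changed, assume an object has appeared
if cv2.countNonZero(mask) > 500:
    print("object detected")

One reason frame differencing like this tends to fail is that it reacts to any change in the scene, such as lighting, shadows or the arm itself, not just to a new object.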
In the third week, we started to work on our first prototype, which used the YOLOv3 algorithm for object detection. From the output of the algorithm we can identify the position of the object and what the object is, and the robotic arm then grabs the object at the detected position, picks it up and disposes of it. The position of the object is determined from its position in the image: I divided the image into a 4x4 grid, and depending on which cell the centre of the object lies in, the robotic arm moves to the corresponding area to pick it up. Although this prototype worked, it still has many limitations, such as the type of trash it can pick up. Because it uses the YOLOv3 algorithm, the dataset it was trained on is the COCO dataset, which consists mainly of images of everyday objects, animals and humans, so YOLOv3 cannot identify many types of trash. Of the classes YOLOv3 is trained on, the only trash-like class is "bottle", so this prototype was limited to detecting and picking up bottles.
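A condensed sketch of how such a prototype decides where to move, running YOLOv3 through OpenCV's DNN module and mapping the detection centre to a 4x4 grid (the file names, the confidence threshold and the idea of one taught pose per cell are illustrative assumptions):

import cv2
import numpy as np

# Load YOLOv3 with OpenCV's DNN module (placeholder file names)
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")

img = cv2.imread("workspace.jpg")  # placeholder image of the workspace
h, w = img.shape[:2]
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())

# Each detection row is [cx, cy, bw, bh, objectness, class scores...],
# with coordinates normalised to the image size
best = None
for out in outputs:
    for det in out:
        conf = float(det[5:].max())
        if conf > 0.5 and (best is None or conf > best[0]):
            best = (conf, det[0] * w, det[1] * h)

if best is not None:
    _, cx, cy = best
    # Map the object centre to one of the 16 cells of the 4x4 grid;
    # each cell would correspond to a taught robot pick-up pose
    col = min(int(cx / (w / 4)), 3)
    row = min(int(cy / (h / 4)), 3)
    print("object in grid cell", (row, col))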
In the last week, we decided to collaborate with SCEI003 to tackle the limitations of the YOLOv3 algorithm we were using. We worked together to train a model to do object detection specifically on trash; the model would hopefully classify both the material of the trash and its position. The trash is mainly classified into the categories metal, glass, plastic, paper and others. One of the main difficulties we had to overcome was our limited training dataset. Machine learning models are often trained on tens of thousands of images, and it would have been too time-consuming for us to find and label that many images individually, so we decided to obtain our training dataset from an online source instead of creating it ourselves. Although we managed to find a dataset online, it had only 2,000 images, and a model trained on it would be unlikely to be very accurate. In the end, however, we managed to expand it to 18,000 images by creating eight variations of each photo. Unfortunately, due to technical difficulties we were unable to test this model with the UR robot, as we could not install the Python library TensorFlow.
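One plausible way to create eight variations of each photo with OpenCV (the specific transforms and file name here are assumptions, not necessarily the exact set we used):

import cv2

def augment(img):
    # Eight simple variations of one image (an assumed, illustrative set)
    return [
        cv2.flip(img, 1),                               # horizontal flip
        cv2.flip(img, 0),                               # vertical flip
        cv2.flip(img, -1),                              # flip both axes
        cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE),
        cv2.rotate(img, cv2.ROTATE_90_COUNTERCLOCKWISE),
        cv2.convertScaleAbs(img, alpha=1.0, beta=40),   # brighter
        cv2.convertScaleAbs(img, alpha=1.0, beta=-40),  # darker
        cv2.GaussianBlur(img, (5, 5), 0),               # slight blur
    ]

# Each original plus its 8 variants: 2,000 images become 18,000
img = cv2.imread("trash_0001.jpg")  # placeholder file name
images = [img] + augment(img)

Note that for object detection, the bounding-box labels have to be transformed along with each flipped or rotated image, which makes augmentation more work than it first appears.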
Elaboration (Final Product)
For our final product we had to create a video introducing our final prototype and a poster explaining it. Although we did not manage to produce a video of the trash-sorting robot in action by the end of the attachment, we had completed the code for both the trash-sorting AI and the robot's movement. Theoretically, with a computer that can run TensorFlow, we would be able to run the code and test it with the robotic arm. This is because we had already run the same code with YOLOv3 and it worked; this time only the AI model changes and everything else remains the same, so it should work.
The learning objective was for us to gain some exposure to the robotic arm and to coding. I feel that we met these objectives, as we now have a better understanding of how computer vision and the robotic arm work, and we were able to create our final prototype ourselves by writing the code. Even though there are still areas for improvement in our final project, I feel that we learnt a lot through research and online tutorials, regardless of how the final prototype turned out.
3 Content Knowledge / Skills Learnt
1. I learnt about object detection AI algorithms, especially YOLO. There are three main families of object detection AIs: Fast R-CNN, Single Shot Detectors (SSD) and You Only Look Once (YOLO). Of these three, Fast R-CNN is considered the slowest, followed by SSD, then YOLO; in terms of accuracy, Fast R-CNN is the most accurate, followed by SSD, then YOLO, so SSD is a balance between speed and accuracy. The reason behind the difference in speed is that Fast R-CNN is a two-stage detector while YOLO and SSD are one-stage detectors, and one-stage detectors are generally faster but less accurate than two-stage detectors. There are limitations to YOLO as well. YOLO struggles to detect objects that are very small or that appear in tight groups. The reason is that YOLO divides the input image into a grid, and each grid cell can only predict a small, fixed number of objects, so if the centres of many objects fall in the same cell, some of them will be missed.
2. I learnt problem-solving skills during my time at the attachment. During my attachment period, a professor from SUTD, Mr Jianxi Luo, came over to our office to give everyone a talk. During his talk he shared his experience of solving engineering problems. He had been tasked to help design a bus, and in the process of designing it he faced many problems. In the end he solved them by applying knowledge from another field to his problem. This taught me that when I am solving a problem, I cannot be too fixated on the knowledge from that particular topic; I may be able to apply knowledge from other topics to the problem. This has changed the way I look at problems and how I overcome them. I realised that I have to be more creative to come up with solutions to problems.
3. I learnt how to code with the Python library OpenCV. OpenCV is a Python library that lets the user perform computer vision and image processing. It can be used to capture images from a camera and apply image processing functions to them, such as drawing circles, rectangles and ellipses on the image, converting the image to greyscale, or isolating a certain colour in the image. All of these can be done with built-in OpenCV functions. It also has a Deep Neural Network (DNN) module that allows the user to easily load a trained neural network into their code, as shown in the sketch below.
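A minimal sketch of the OpenCV basics mentioned in point 3 (the camera index and the colour range are assumptions for illustration):

import cv2

cap = cv2.VideoCapture(0)  # default camera (assumed index)
ok, frame = cap.read()
cap.release()

if ok:
    # Convert to greyscale
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Isolate a single colour (an assumed blue range, in HSV space)
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    blue_only = cv2.inRange(hsv, (100, 150, 50), (130, 255, 255))

    # Draw shapes on the image
    cv2.rectangle(frame, (10, 10), (100, 100), (0, 255, 0), 2)
    cv2.circle(frame, (200, 200), 40, (0, 0, 255), 2)

    cv2.imwrite("annotated.jpg", frame)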
2 Interesting Aspects of My Learning
1. Before the start of the attachment, I did not really like coding and did not find it interesting. I felt that it was very complicated and there was a lot of information to process, and it did not help that the online tutorials were also dense with information. However, after a while of digesting all the information from the tutorials, I felt that coding was not that hard after all. This was interesting to me because it changed the way I view coding. Now, I feel that coding is not as difficult as it seems and is actually very logical. As long as we are open to learning Python and take the time to understand it, it will not be as difficult as it appears at the start. That realisation was really interesting for me.
2. Another interesting aspect of my learning was the way we learnt during the attachment. In school, learning usually means listening to the teacher after doing some primary research on our own, so the main content is still taught by the teacher. During this attachment, however, we had to learn everything on our own, from how to use OpenCV to the different functions in urx, mostly by searching Google. This made me realise that when we move on to the workforce, there will not be a teacher to guide us every step of the way; we will have to learn independently through Google or other means.
One Takeaway for Life
One takeaway I had was realising how important it is to have passion for our work in order to do it well and enjoy it. If we are passionate about what we do, then even when we face difficulties and challenges along the way, we will be motivated to overcome them and achieve our goals. With passion for our work, we will find it enjoyable rather than complaining every day about what we have to do and making our lives miserable. This ensures that we spend our time at the workplace more meaningfully.