UHM Cyber Security and Application Research Lab

Reports

Report 1: February 4, 2018

Mace:

Over the past couple of weeks I took pictures of Haole Koa trees at multiple locations on the UH campus. There are about 222 unique images of the tree at varying distances, which increases the sample size we have to practice with. Next I hope to run some tests using our current samples, and later install the necessary software to replicate these tests on other Windows PCs.

Evan:

I prepped the workstation in Holmes 390 with the latest Windows 10 installation and installed all the prerequisites for remote-control use and computer vision applications. I ran some TensorFlow projects on ImageNet data as a test to compare accuracies across different types of objects. I also compared this against OpenCV and found that TensorFlow would save time and make building a classifier much simpler. With TensorFlow we can run multiple experiments per day, whereas with OpenCV it can take about half a day to set up an experiment for just one object.

Glen:

Researched existing datasets, models, and training methods in order to create applicable proofs of concept more quickly. Prepped the XPS workstation to complete one experiment before Evan’s re-installations.

Experiment 1:

- Inputs: ImageNet Haole Koa images http://image-net.org/synset?wnid=n11762433 and Waiawi images (also from ImageNet)
- Model: TensorFlow’s Inception V3
- Training Steps: 4000 (Retrained top layer only)
- Results:
  - After 3.5 minutes of training on the XPS workstation, achieved 91.9% accuracy on training data
  - 6 images downloaded from Google Images (3 haole koa and 3 waiawi) were shown as-is to the classifier. The classifier answered correctly on 5 of the 6 images, with a certainty higher than 99% on each correctly classified image
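
As a rough illustration of how the six-image spot check above can be tallied, here is a minimal Python sketch. The report gives only the totals (3 haole koa, 3 waiawi, 5 of 6 correct), so the individual predictions below, including which image was missed, are placeholders and not the actual results.

```python
def accuracy(pairs):
    """Return the fraction of (true_label, predicted_label) pairs that match."""
    if not pairs:
        raise ValueError("need at least one prediction")
    correct = sum(1 for truth, predicted in pairs if truth == predicted)
    return correct / len(pairs)


# PLACEHOLDER predictions: only the 5-of-6 total comes from the report.
# Which image was misclassified is not stated, so the miss below is hypothetical.
spot_check = [
    ("haole koa", "haole koa"),
    ("haole koa", "haole koa"),
    ("haole koa", "haole koa"),
    ("waiawi", "waiawi"),
    ("waiawi", "waiawi"),
    ("waiawi", "haole koa"),  # hypothetical miss
]

print(f"{accuracy(spot_check):.1%}")  # 83.3%
```

With only six images, a single miss moves the score by almost 17 percentage points, so this works as a sanity check rather than a measurement.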

This is only image classification, however. We need object detection in order to handle an image or video with multiple classes of objects in the same frame (haole koa in the forest amongst multiple species of vegetation). I am therefore now researching how to retrain TensorFlow’s Object Detection API. Results should be available by next week.

Report 2: February 18, 2018

Evan:

Identified Hawaiian plants that are commonly found alongside Haole Koa and started gathering an image dataset for them. KCC and LCC started a project that identified Hawaiian plants, and we have begun reaching out to its project managers. We would like to access their image dataset if possible, collaborate with them, and offer something back in return for the dataset. We are also trying to establish a point of contact on that team who is willing to work with us as a guide to help identify and label Hawaiian plants.

We are also working with Joshua on identifying reef in Hawaii and have started contacting him to set up an initial in-person meeting or VTC to discuss his goals and what additional image datasets we will need for the project.

We currently have over 1,000 images in a combined dataset amassed from ImageNet and are in the process of cleaning out "dirty" images that do not actually show the plant in question. A Python script I wrote last semester was used to clean out most of the common, easy-to-catch "dirty" images. Mace will sort out the remaining dirty images. Glen will start on a test sample, and everything will be documented so that each team member and people outside the project can easily replicate the project and its results. We have confirmed that the GPU on the workstation works with CUDA, which we will need to speed up training in TensorFlow.
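
The report does not describe what the cleaning script checks, so the following is only a guess at the kind of automated first pass that could catch easy cases before manual review: files that are not JPEG or PNG images at all, and byte-for-byte duplicates. The function name `find_dirty_files` and both criteria are assumptions for illustration; images that are valid but show the wrong plant still need a person to review them.

```python
import hashlib
from pathlib import Path

# File signatures ("magic bytes") that JPEG and PNG files start with.
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def find_dirty_files(folder):
    """Return (not_images, duplicates) as lists of paths under folder.

    Hypothetical criteria, not the lab's actual script:
    - not_images: files that do not start with a JPEG or PNG signature
    - duplicates: files whose bytes exactly match an earlier file
    """
    seen_hashes = {}
    not_images, duplicates = [], []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        if not (data.startswith(JPEG_MAGIC) or data.startswith(PNG_MAGIC)):
            not_images.append(path)
            continue
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen_hashes:
            duplicates.append(path)
        else:
            seen_hashes[digest] = path
    return not_images, duplicates
```

The function only reports candidates and deletes nothing, so the flagged files can be checked by hand before anything is removed.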

We have decided to use TensorFlow instead of OpenCV because of the quick training time and the very accurate results we can obtain on a limited image dataset. OpenCV doesn't have much support for CNNs, which we would like to use as we go deeper into the project, and TensorFlow already has a variety of pre-trained classifiers that we can use. We have also agreed to rotate roles every two weeks so that everyone experiences every aspect of the project rather than specializing in just one.

Mace:

Cleaning "dirty" images out of the image sets that will be used to train the classifier on Haole Koa and the plants commonly found with it. I am going to record video of Haole Koa alongside other known plant species, to test whether we can pick out the Haole Koa from the other species in moving video and at what distance we can still identify it accurately. I am also learning how to build a TensorFlow image classifier so that I understand the process and logic well enough to document them for later use.

Report 3: March 4, 2018

Evan:

Gathered additional datasets of Hawaiian plants to add to the classifier to improve image classification. According to the TensorFlow documentation, it can do some basic image transformations for us, such as flipping a picture horizontally or vertically, skewing it, adjusting brightness, or cropping, and add the results to the dataset alongside the current pictures. This way we can at least double the amount of data available for training. In general, the more sample images we have, the more accurate the classifier can get. The weakness I have identified is that without many samples of Hawaiian plants, the classifier is easily thrown off and misclassifies images.

The documentation also mentions the number of training steps, which defaults to 4000; we can increase it to try to get a more accurate model. We will need to graph the results to determine the best number of steps, since training for too many steps can also make the model less accurate. There is also the number of epochs, which we need to look into further. We feel we can make the classifier much better by understanding training steps and epochs.
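
To make the "at least double" point concrete, here is a minimal pure-Python sketch of flip augmentation. It treats an image as a list of pixel rows rather than using TensorFlow's own image operations, and the helper names are made up for illustration; in practice the training tool's built-in options would do this on real image data.

```python
def flip_horizontal(image):
    """Mirror an image left to right (reverse each row)."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Mirror an image top to bottom (reverse the row order)."""
    return [list(row) for row in reversed(image)]


def augment_with_flips(dataset):
    """Return the originals plus one horizontally flipped copy of each.

    dataset is a list of (image, label) pairs; a flip does not change the label.
    """
    augmented = list(dataset)
    for image, label in dataset:
        augmented.append((flip_horizontal(image), label))
    return augmented


# Tiny 2x3 "image" with made-up pixel values, for demonstration only.
tiny = [[1, 2, 3],
        [4, 5, 6]]
dataset = [(tiny, "haole koa")]
bigger = augment_with_flips(dataset)

print(len(dataset), "->", len(bigger))  # 1 -> 2
print(bigger[1][0])                     # [[3, 2, 1], [6, 5, 4]]
```

Flipped copies add variety but not new plants, so augmentation complements collecting more photos rather than replacing it.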

Glen has been looking into object detection, which is a step beyond image classification but much harder, since we need to create bounding boxes to define where each object is in the image. In looking for a solution, we found that much of the documentation and current trend points to YOLO (You Only Look Once), which uses a CNN but requires a CUDA-compatible graphics card; the card in the current workstation is the lowest that CUDA supports. I will attempt an implementation of YOLO while Glen implements a general version of it, so that we can compare the performance and simplicity of each implementation. This is the step that is holding us back: it has been difficult to get running, and after following several tutorials we got stuck partway through each of them because of various issues.
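
To show what the extra labeling work looks like, here is a minimal sketch that converts a pixel-coordinate bounding box into the normalized `class x_center y_center width height` line used by Darknet-style YOLO label files. The function name, the class index, and the example box are hypothetical; check the label format expected by whichever YOLO implementation is actually used.

```python
def to_yolo_label(class_id, box, image_width, image_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line.

    All four output values are fractions of the image size, in the range 0 to 1.
    """
    x_min, y_min, x_max, y_max = box
    inside_x = 0 <= x_min < x_max <= image_width
    inside_y = 0 <= y_min < y_max <= image_height
    if not (inside_x and inside_y):
        raise ValueError("box must lie inside the image and have positive area")
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: one class-0 box centered in a 640x480 photo.
print(to_yolo_label(0, (160, 120, 480, 360), 640, 480))
# 0 0.500000 0.500000 0.500000 0.500000
```

Unlike classification, where one label covers the whole picture, every object in every training image needs a line like this, which is a large part of why detection takes more effort to set up.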

Mace:

This past week I cleaned more datasets to be used for training. These datasets were for the bird's nest fern, candlenut, lace fern, and papaya tree. Cleaning refers to the process of removing pictures that are unusable or inaccurate for training. Adding more datasets like these lets the classifier separate species from each other more accurately, rather than producing false positives or other inaccurate results.

I’ve also tried getting the trainer and classifier to work on the Windows 10 operating system. However, I’ve been having difficulty linking the dataset to the container used for training the classifier. In the coming weeks I want to try to fix this, but if there is no solution, I’ll set up an Ubuntu VM and try running it there.
