The perception task for this application is a two-fold problem:
1. Identify slipstreamable objects
2. Identify alert signs, such as stop signs, school zone limits, etc.
We designated four classes (buses, cars, motorbikes, and bicycles) as objects of interest, or slipstreamable objects. We also chose to detect traffic lights and stop signs as alert objects.
To classify these objects we use deep neural networks, which are known to work exceptionally well for classification tasks. We reused the existing YOLO (You Only Look Once) object detection architecture (image below) and algorithm. The off-the-shelf YOLO network, when trained on the COCO dataset, classifies objects into 80 classes. We used transfer learning to restrict classification to only 6 of these classes: the four slipstreamable classes and the two alert classes. Retraining the last two layers, with the weights of all other layers frozen, gave us an accuracy of 84% after 5 epochs of training.
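As a minimal sketch of the class restriction described above (not of the retraining itself), the six classes of interest and their two groups could be represented as follows. The label strings, the `min_confidence` threshold, and the helper names `categorize` and `filter_detections` are assumptions made for illustration; the actual label names depend on the label file used with the network.

```python
# Hypothetical label strings: the exact names depend on the label file
# shipped with the trained network.
SLIPSTREAMABLE = {"bus", "car", "motorbike", "bicycle"}
ALERT = {"traffic light", "stop sign"}


def categorize(label):
    """Return 'slipstreamable', 'alert', or None for a detected class label."""
    if label in SLIPSTREAMABLE:
        return "slipstreamable"
    if label in ALERT:
        return "alert"
    return None


def filter_detections(detections, min_confidence=0.5):
    """Keep only detections of the six classes of interest.

    detections: iterable of (label, confidence) pairs.
    min_confidence: assumed threshold, not a value taken from the project.
    """
    kept = []
    for label, confidence in detections:
        category = categorize(label)
        if category is not None and confidence >= min_confidence:
            kept.append((label, confidence, category))
    return kept
```

A filter like this would only be needed if the network still emitted other classes; a network retrained on six classes produces only those labels, so the grouping step alone would be enough.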
Once we had trained our network to detect slipstreamable objects and alert signs, we condensed the model for mobile using the TensorFlow Lite (TFLite) APIs. Following this, we recompiled an app adapted from a reference sample provided by Google to integrate our custom-trained model (.pb) with the Android application. The reference sample has a CameraActivity, which captures images from the phone's camera and passes them on to the network activity. The network receives the input image, runs it through its layers, and returns the detected object class along with the x and y coordinates of the top-left and bottom-right corners to the Android app. Some additional post-processing code tracks only moving objects and ignores stationary ones. The points are then collected by the TensorFlowYoloDetector Android activity, and a bounding box is rendered on the screen using Android's canvas API.
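The moving-object post-processing is not spelled out here, so the following is only a sketch of one way such a filter could work, written in Python for brevity rather than in the app's own code. It assumes each detection is a labelled box given by its top-left and bottom-right corners, and it treats a box as moving when its centre has shifted by at least `min_shift` pixels relative to the nearest box of the same class in the previous frame. The `Box` type, the `moving_boxes` function, and the `min_shift` value are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Box:
    """A detection: class label plus top-left and bottom-right corners (pixels)."""
    label: str
    left: float
    top: float
    right: float
    bottom: float

    def center(self):
        return ((self.left + self.right) / 2.0, (self.top + self.bottom) / 2.0)


def moving_boxes(previous, current, min_shift=5.0):
    """Return the boxes in `current` that appear to have moved.

    A box counts as moving if its centre is at least `min_shift` pixels away
    from the nearest same-label box in `previous`. Boxes with no same-label
    box in the previous frame are treated as new and therefore kept.
    """
    moving = []
    for box in current:
        cx, cy = box.center()
        candidates = [p for p in previous if p.label == box.label]
        if not candidates:
            moving.append(box)
            continue
        nearest = min(
            hypot(cx - p.center()[0], cy - p.center()[1]) for p in candidates
        )
        if nearest >= min_shift:
            moving.append(box)
    return moving
```

This simple frame-to-frame comparison ignores the motion of the camera itself, which a phone mounted on a moving vehicle would need to compensate for.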
Once the Slixstream app's APK was generated, we tested the application's performance around the Virginia Tech campus and obtained some interesting results, discussed here. Several iterations and data collection sessions allowed us to reevaluate and modify the application for better results. The results showed us the need to train the YOLO network for more epochs in order to reach higher accuracy.
1. YOLO
2. MobileNet