Implementation

The source code for this project is available on GitHub.

1. Android's Camera2 API

The Camera2 API in Android exposes the camera as a pipeline that periodically makes images available to the application through the ImageReader class. We use the sample code provided by Android and TensorFlow Lite for interacting with the camera and pre-processing the frames. Pre-processing mainly consists of converting the frames, which are acquired in YUV format, to RGB, since the CNNs we use expect RGB input. The image is then cropped and resized to the input size of the CNN. For YOLO, one more step scales the pixel values to the range [0, 1] by dividing them by 255.
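The pre-processing steps above can be sketched in a few lines of plain Python. This is only an illustration, not the Android sample code: it assumes the common full-range BT.601 conversion coefficients (the sample code may use a different fixed-point variant), a nearest-neighbour resize, and the hypothetical helper names yuv_to_rgb, resize_nearest and normalize.

```python
def yuv_to_rgb(y, u, v):
    """Convert one YUV pixel (each channel 0..255) to an (r, g, b) tuple.

    Assumption: full-range BT.601 coefficients.
    """
    def clamp(x):
        # Keep the result inside the valid 8-bit range.
        return max(0, min(255, int(round(x))))

    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    return clamp(r), clamp(g), clamp(b)


def resize_nearest(image, out_w, out_h):
    """Nearest-neighbour resize of an image stored as a list of rows."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[(row * in_h) // out_h][(col * in_w) // out_w]
         for col in range(out_w)]
        for row in range(out_h)
    ]


def normalize(image):
    """Scale 0..255 channel values to floats in [0, 1] (the extra YOLO step)."""
    return [[tuple(c / 255.0 for c in px) for px in row] for row in image]
```

On the device these operations run on whole buffers rather than pixel by pixel, but the arithmetic per pixel is the same idea.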

Once a frame has been acquired, it is passed to the object detection model, which runs on a background thread so that inference does not freeze the UI.

2. TensorFlow Lite

TensorFlow Lite (TFLite) is a lightweight library for on-device inference on platforms such as Android and iOS. It includes utilities to convert a model trained in TensorFlow to the .tflite format, a compact flat binary file that can be used on mobile devices. The library's interface is similar to that of TensorFlow itself. We create input and output arrays of the required sizes and pass them to TFLite's inference function, which fills in the output array according to the model.
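For illustration, here is how a single inference call looks with the TFLite Python interpreter. Note the difference from the Java API used on Android, where the caller passes a pre-allocated output array to Interpreter.run: in Python the outputs are read back with get_tensor after invoke. The helper name run_inference and the model file name in the comments are hypothetical.

```python
def run_inference(interpreter, input_data):
    """Run one forward pass and return a list with every output tensor.

    `interpreter` is expected to behave like tf.lite.Interpreter after
    allocate_tensors() has been called. A single input tensor is assumed.
    """
    input_index = interpreter.get_input_details()[0]["index"]
    interpreter.set_tensor(input_index, input_data)
    interpreter.invoke()
    return [
        interpreter.get_tensor(detail["index"])
        for detail in interpreter.get_output_details()
    ]


# Typical use (requires TensorFlow; "detect.tflite" is a placeholder name):
#   import tensorflow as tf
#   interpreter = tf.lite.Interpreter(model_path="detect.tflite")
#   interpreter.allocate_tensors()
#   outputs = run_inference(interpreter, frame)
```

The input array must match the shape and dtype reported by get_input_details, which is why the cropping, resizing and scaling steps happen before this call.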

2.1 Quantization

TFLite provides utilities to quantize model parameters to 8 bits before deploying to the device. As far as we understand, activations are still kept in floating point, though this may change soon since the library is under active development. Our experiments with quantized weights did not work well, probably because quantization also requires other adjustments, such as rescaling the inputs.
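To make the idea of 8-bit weights concrete, the following sketch shows a generic linear quantization scheme in plain Python. It is not necessarily the scheme TFLite uses; quantize_weights and dequantize_weights are hypothetical names. It shows that each quantized tensor only makes sense together with its scale and offset, which is one place where a mismatch in rescaling can creep in.

```python
def quantize_weights(weights):
    """Map floats to 8-bit codes in 0..255; return (codes, scale, minimum)."""
    lo, hi = min(weights), max(weights)
    # Fall back to a scale of 1.0 when all weights are equal (hi == lo).
    scale = (hi - lo) / 255.0 or 1.0
    codes = [int(round((w - lo) / scale)) for w in weights]
    return codes, scale, lo


def dequantize_weights(codes, scale, lo):
    """Recover approximate float weights from the 8-bit codes."""
    return [lo + c * scale for c in codes]
```

With this scheme the round-trip error per weight is at most half of the scale, so tensors with a wide value range lose more precision than tensors with a narrow one.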

3. Model files

We use pre-trained models for this project. SSD-MobileNet is the model used in the TFLite examples, so it is already available in .tflite format. YOLO is developed in Darknet, a C library with its own weights file format. We use DW2TF to convert the Darknet weights to a TensorFlow model, which we then convert to .tflite using tflite_convert.
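The second conversion step can be sketched as a small Python helper that assembles a tflite_convert command line for a frozen TensorFlow graph. The file names, tensor names and input shape used in the example call are placeholders, not the values from this project; the real names depend on the graph that DW2TF produces.

```python
def tflite_convert_command(graph_def_file, output_file,
                           input_array, output_arrays, input_shape):
    """Build the argument list for the tflite_convert command-line tool."""
    return [
        "tflite_convert",
        "--graph_def_file=" + graph_def_file,
        "--output_file=" + output_file,
        "--input_arrays=" + input_array,
        "--output_arrays=" + ",".join(output_arrays),
        "--input_shapes=" + ",".join(str(d) for d in input_shape),
    ]


# Placeholder values for illustration only.
cmd = tflite_convert_command(
    "yolo.pb", "yolo.tflite", "input", ["output"], (1, 416, 416, 3)
)
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where TensorFlow and its tflite_convert tool are installed.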

Converting weights files is sometimes tricky, as both TensorFlow and TensorFlow Lite may fail because of subtle problems with the commands issued. In TensorFlow Lite's case, some TensorFlow functionality is not implemented yet, so some models cannot be converted directly (we had this problem with the "standard" YOLO v3 model).
