Experiment

Data Description

Udacity Annotated Driving Dataset 1

The dataset covers driving in Mountain View, California, and neighboring cities in daylight conditions. It contains over 65,000 labels across 9,423 frames collected from Point Grey research cameras running at the full resolution of 1920x1200 at 2 Hz. The dataset was annotated by CrowdAI using a combination of machine learning and human annotators.

Labels

Car
Truck
Pedestrian

CSV Format

xmin
ymin
xmax
ymax
frame
label
preview URL for frame
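As a minimal sketch of reading this format, the snippet below groups the bounding boxes by frame. The header names (`xmin`, `ymin`, `xmax`, `ymax`, `frame`, `label`, `preview_url`) are an assumption based on the column list above and may differ from the header in the actual file. The sample rows and URLs are made up for illustration only.

```python
import csv
import io
from collections import defaultdict


def load_annotations(csv_file):
    """Group bounding boxes by frame name.

    Assumes a header row with the columns xmin, ymin, xmax, ymax, frame, label
    (hypothetical names; adjust them to match the real file).
    """
    boxes_by_frame = defaultdict(list)
    for row in csv.DictReader(csv_file):
        box = (
            int(row["xmin"]),
            int(row["ymin"]),
            int(row["xmax"]),
            int(row["ymax"]),
            row["label"],
        )
        boxes_by_frame[row["frame"]].append(box)
    return dict(boxes_by_frame)


# Made-up sample rows, not taken from the dataset.
SAMPLE = """xmin,ymin,xmax,ymax,frame,label,preview_url
785,533,905,644,frame_0001.jpg,Car,http://example.com/frame_0001
100,500,160,620,frame_0001.jpg,Pedestrian,http://example.com/frame_0001
300,400,700,800,frame_0002.jpg,Truck,http://example.com/frame_0002
"""

annotations = load_annotations(io.StringIO(SAMPLE))
for frame, boxes in sorted(annotations.items()):
    print(frame, len(boxes), "labels")
```

For the real file, pass an open file handle instead of the in-memory sample, for example `open("labels.csv", newline="")`, where the file name is also an assumption.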

Data Source Link:

https://github.com/udacity/self-driving-car/tree/master/annotations

Implementation

Detection Settings

Python 3
NumPy
OpenCV Python

Method

We use the OpenCV DNN (Deep Neural Network) module to run inference on images with YOLO weights and configuration files. The result is shown below:

[Figure: detection result]
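The inference step described above could look roughly like the following sketch. It is not the project's actual script: the 416x416 input size, the thresholds, and the function names `decode_detections` and `detect` are assumptions chosen for illustration. The OpenCV calls (`readNetFromDarknet`, `blobFromImage`, `NMSBoxes`) are the standard DNN module entry points for Darknet models.

```python
def decode_detections(rows, img_w, img_h, conf_threshold=0.5):
    """Convert raw YOLO output rows into (class_id, confidence, [x, y, w, h]).

    Each row is [cx, cy, w, h, objectness, score_0, score_1, ...], with the
    box given relative to the image size (values in [0, 1]).
    """
    detections = []
    for row in rows:
        scores = [float(s) for s in row[5:]]
        class_id = max(range(len(scores)), key=lambda i: scores[i])
        confidence = scores[class_id]
        if confidence < conf_threshold:
            continue
        w = float(row[2]) * img_w
        h = float(row[3]) * img_h
        x = float(row[0]) * img_w - w / 2  # top-left corner from center
        y = float(row[1]) * img_h - h / 2
        detections.append((class_id, confidence, [int(x), int(y), int(w), int(h)]))
    return detections


def detect(image_path, cfg_path, weights_path, conf_threshold=0.5, nms_threshold=0.4):
    """Run a Darknet YOLO model on one image and return the kept detections."""
    # Imported here so decode_detections can be used without OpenCV installed.
    import cv2
    import numpy as np

    net = cv2.dnn.readNetFromDarknet(cfg_path, weights_path)
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    img_h, img_w = image.shape[:2]

    # Assumed network input size of 416x416; it should match the .cfg file.
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())

    detections = []
    for output in outputs:
        detections.extend(decode_detections(output, img_w, img_h, conf_threshold))
    if not detections:
        return []

    # Non-maximum suppression removes overlapping boxes for the same object.
    boxes = [d[2] for d in detections]
    confidences = [d[1] for d in detections]
    keep = cv2.dnn.NMSBoxes(boxes, confidences, conf_threshold, nms_threshold)
    return [detections[i] for i in np.array(keep).flatten()]
```

A call such as `detect("frame.jpg", "yolov3-tiny.cfg", "yolov3-tiny.weights")` would return the detections that survive non-maximum suppression; the image file name here is a placeholder.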

Training Settings

Darknet
MSI GTX 1070 (1 GPU)
720 images from the Udacity Annotated Driving Dataset 1
YOLOv3-tiny weights and configuration file

Training steps:

Convert the Udacity annotations into YOLO format
Set the following parameters in yolov3-tiny.cfg:
- set batch=24
- set subdivisions=8
- set filters=24 in the convolutional layer before each [yolo] layer, from (classes + 5) * 3 = (3 + 5) * 3 = 24
- set classes=3 in each [yolo] layer
Train on our dataset with the Darknet library, starting from yolov3-tiny.weights
Train for ~5 days, until the average loss falls below 0.06
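The conversion step and the filters arithmetic above can be sketched as follows. YOLO label files store one line per box as `class_id x_center y_center width height`, with coordinates normalized by the image size. The class ID order (Car=0, Truck=1, Pedestrian=2) and the function names are assumptions for illustration; the IDs must match the order of the class names file used for training.

```python
IMG_W, IMG_H = 1920, 1200  # frame resolution stated in the data description

# Assumed class order; it must match the names file passed to Darknet.
CLASS_IDS = {"Car": 0, "Truck": 1, "Pedestrian": 2}


def to_yolo(xmin, ymin, xmax, ymax, label, img_w=IMG_W, img_h=IMG_H):
    """Convert a corner-format pixel box to normalized YOLO center format."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return CLASS_IDS[label], x_center, y_center, width, height


def yolo_line(xmin, ymin, xmax, ymax, label):
    """Format one box as a line of a YOLO label .txt file."""
    return "%d %.6f %.6f %.6f %.6f" % to_yolo(xmin, ymin, xmax, ymax, label)


def yolo_filters(num_classes, anchors_per_scale=3):
    """Filters for the convolutional layer before each [yolo] layer."""
    return (num_classes + 5) * anchors_per_scale


print(yolo_line(480, 300, 1440, 900, "Truck"))
print(yolo_filters(3))
```

Each image gets one `.txt` file with the same base name, containing one such line per labeled object.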