Experiment

Data Description

Udacity Annotated Driving Dataset 1

The dataset includes driving in Mountain View, California, and neighboring cities during daylight conditions. It contains over 65,000 labels across 9,423 frames collected from a Point Grey research camera running at its full resolution of 1920x1200 at 2 Hz. The dataset was annotated by CrowdAI using a combination of machine learning and human annotators.

Labels

  • Car
  • Truck
  • Pedestrian

CSV Format

  • xmin
  • ymin
  • xmax
  • ymax
  • frame
  • label
  • preview URL for frame
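
As a rough illustration of how rows in this format could be loaded, the sketch below reads the seven fields listed above with Python's standard csv module. It assumes the file is comma-separated with the columns in the listed order, and that a header row, if present, can be skipped because its first four fields are not numeric. The `Annotation` type, the `read_annotations` helper, and the sample row are hypothetical and not taken from the dataset.

```python
import csv
import io
from typing import List, NamedTuple


class Annotation(NamedTuple):
    """One labelled bounding box (hypothetical container type)."""
    xmin: int
    ymin: int
    xmax: int
    ymax: int
    frame: str
    label: str
    preview_url: str


def read_annotations(stream) -> List[Annotation]:
    """Parse annotation rows, assuming the column order listed above."""
    annotations = []
    for row in csv.reader(stream):
        if len(row) < 7:
            continue  # skip blank or truncated rows
        try:
            xmin, ymin, xmax, ymax = (int(float(value)) for value in row[:4])
        except ValueError:
            continue  # skip a header row or a malformed row
        annotations.append(
            Annotation(xmin, ymin, xmax, ymax, row[4], row[5], row[6])
        )
    return annotations


# Made-up sample data, for illustration only.
SAMPLE = (
    "xmin,ymin,xmax,ymax,frame,label,preview_url\n"
    "785,533,905,644,frame_0001.jpg,Car,http://example.com/frame_0001\n"
)

if __name__ == "__main__":
    for annotation in read_annotations(io.StringIO(SAMPLE)):
        print(annotation)
```

For a real file, `read_annotations(open(path, newline=""))` would be the equivalent call, where `path` points to the dataset's CSV file.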

Implementation

Detection Settings

  • Python 3
  • NumPy
  • OpenCV Python

Method

We use the OpenCV DNN (Deep Neural Network) module to run inference on images with the YOLO model weights and configuration files. The result is as follows:
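
For reference, the inference step itself can be sketched roughly as below. This is a minimal sketch, not the project's actual script: the 416x416 input size, the thresholds, and the helper names `decode_detection` and `detect` are assumptions. It follows the commonly used pattern of treating each output row as a normalized box center and size, followed by an objectness value and per-class scores.

```python
def decode_detection(row, image_width, image_height, conf_threshold=0.5):
    """Convert one YOLO output row into (class_id, confidence, [x, y, w, h]).

    Assumes the row layout [cx, cy, w, h, objectness, class scores...],
    with box values normalized to the image size. Returns None when the
    best class score is below conf_threshold.
    """
    scores = [float(score) for score in row[5:]]
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    confidence = scores[class_id]
    if confidence < conf_threshold:
        return None
    center_x = float(row[0]) * image_width
    center_y = float(row[1]) * image_height
    width = float(row[2]) * image_width
    height = float(row[3]) * image_height
    # OpenCV's NMS expects boxes as top-left corner plus width and height.
    box = [int(center_x - width / 2), int(center_y - height / 2),
           int(width), int(height)]
    return class_id, confidence, box


def detect(image_path, cfg_path, weights_path,
           conf_threshold=0.5, nms_threshold=0.4):
    """Run a Darknet YOLO model through OpenCV DNN and return kept detections."""
    import cv2  # imported here so decode_detection works without OpenCV
    import numpy as np

    net = cv2.dnn.readNetFromDarknet(cfg_path, weights_path)
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    image_height, image_width = image.shape[:2]

    # Scale pixels to [0, 1], resize to the assumed 416x416 network input,
    # and swap BGR to RGB.
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())

    detections = []
    for output in outputs:
        for row in output:
            decoded = decode_detection(row, image_width, image_height,
                                       conf_threshold)
            if decoded is not None:
                detections.append(decoded)
    if not detections:
        return []

    # Non-maximum suppression removes overlapping boxes for the same object.
    boxes = [d[2] for d in detections]
    confidences = [d[1] for d in detections]
    keep = cv2.dnn.NMSBoxes(boxes, confidences, conf_threshold, nms_threshold)
    return [detections[int(i)] for i in np.asarray(keep).reshape(-1)]
```

A call such as `detect("frame.jpg", "yolov3-tiny.cfg", "yolov3-tiny.weights")` would return a list of `(class_id, confidence, box)` tuples; the image file name here is a placeholder.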

Training Settings

  • Darknet
  • MSI GTX 1070 (1 GPU)
  • 720 images of Udacity Annotated Driving Dataset 1
  • YOLOv3-tiny weights and configuration file

Training steps:

  • Convert Udacity annotation format into YOLO format
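  • Set the following parameters in yolov3-tiny.cfg:
    • set batch=24
    • set subdivisions=8
    • set filters=(classes + 5)*3 = (3 + 5)*3 = 24 in each convolutional layer that directly precedes a [yolo] layer
    • set classes=3 in each [yolo] layer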
  • Train the yolov3-tiny.weights model on our dataset using the Darknet library
  • Train for ~5 days, until the average loss falls below 0.06
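
The annotation conversion and the filters calculation from the steps above can be sketched as follows. Darknet's label format is one line per box, `<class_id> <x_center> <y_center> <width> <height>`, with all four box values normalized by the image size. The class-to-ID mapping below simply follows the order of the Labels list and is an assumption, as are the helper names `to_yolo_line` and `yolo_filters`; the 1920x1200 image size comes from the dataset description.

```python
# Assumed mapping, following the order of the Labels list.
CLASS_IDS = {"Car": 0, "Truck": 1, "Pedestrian": 2}

# Full camera resolution stated in the dataset description.
IMAGE_WIDTH, IMAGE_HEIGHT = 1920, 1200


def to_yolo_line(xmin, ymin, xmax, ymax, label,
                 image_width=IMAGE_WIDTH, image_height=IMAGE_HEIGHT):
    """Convert a pixel-space corner box into one Darknet label line."""
    class_id = CLASS_IDS[label]
    x_center = (xmin + xmax) / 2.0 / image_width
    y_center = (ymin + ymax) / 2.0 / image_height
    width = (xmax - xmin) / image_width
    height = (ymax - ymin) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def yolo_filters(num_classes, anchors_per_scale=3):
    """Filters for the convolutional layer before each [yolo] layer."""
    return (num_classes + 5) * anchors_per_scale


if __name__ == "__main__":
    print(to_yolo_line(480, 300, 1440, 900, "Truck"))
    print(yolo_filters(3))
```

Darknet expects one such text file per image, with the same base name as the image, containing one line for each labelled box in that frame.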