Methodology

Procedures

Reimplement the YOLO v3 and Deep SORT algorithms
Apply transfer learning to the YOLO model
Evaluate the YOLO model
Test the YOLO model and the Deep SORT model

Re-implementation

Implementation settings:
- Language and libraries
  - Python 3.7
  - TensorFlow 2.3
  - opencv-python 4.1

Original dataset:
- COCO (Common Objects in Context) dataset

Test on images and videos
- Detection with the YOLO model using the pre-trained weights
- Tracking with the Deep SORT model using the detections of the YOLO model as input

Transfer learning

The objective of transfer learning is to train a model that detects traffic participants more accurately. The model is initialized with the pre-trained weights.

The training steps are:

Initialize the model with pre-trained weights
- Specify the chosen classes
  - classes = ['bus', 'car', 'person', 'traffic light', 'traffic sign', 'truck']
- Select new dataset for training and testing
  - new dataset: Google Open Images Dataset v6
  - 6000 images for training and 1200 images for testing
- Run the training process

The loss function of the YOLO model is the sum of three losses: regression loss, confidence loss, and classification loss.

Regression loss measures the error of the predicted box coordinates and is calculated only when the prediction box contains an object.
Confidence loss determines whether there is an object in the prediction frame.
Classification loss determines which category the object in the prediction frame belongs to.
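
Written as an equation, the sum described above could be sketched as follows. The symbol names are assumptions chosen for illustration, and the terms are shown unweighted because the text states a plain sum; the exact form of each term is the one given in Fig 5.

```latex
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{reg}}
  + \mathcal{L}_{\text{conf}}
  + \mathcal{L}_{\text{cls}}
```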

Fig 5. Loss function

The following figures show the three losses and the total loss on the training set and the validation set during model training.

Fig 6. Losses of the training set and the validation set

Evaluation

After training is complete, we evaluate the model. The main criterion is mean Average Precision (mAP).

To calculate mAP, we first need the concept of Intersection over Union (IoU). IoU measures the overlap between the predicted bounding box and the ground truth bounding box: the area of their intersection divided by the area of their union. A threshold on IoU decides whether a prediction counts as a match.

Fig 7. IoU calculation
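
The calculation can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the function name `iou` and the box layout (x_min, y_min, x_max, y_max) are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is assumed to be (x_min, y_min, x_max, y_max).
    """
    # Corners of the intersection rectangle
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes sharing a 1x1 corner: intersection 1, union 7
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Identical boxes give an IoU of 1 and disjoint boxes give 0, so the threshold always lies between those two extremes.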

Comparing the IoU of each prediction with the threshold gives the counts of True Positives (TP), False Positives (FP), and False Negatives (FN), which are three entries of the confusion matrix.

- True Positive (TP): the IoU of a detected object is greater than the threshold value
- False Positive (FP): the IoU of a detected object is less than the threshold value
- False Negative (FN): a ground truth object is present in the image but is not detected
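
The three definitions above can be sketched as a counting routine for one image and one class. The matching strategy is an assumption (greedy, one prediction per ground truth box), since the text does not say how predictions are paired with ground truths; the names `iou` and `count_matches` and the box layout (x_min, y_min, x_max, y_max) are also hypothetical.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def count_matches(predictions, ground_truths, iou_threshold=0.5):
    """Return (tp, fp, fn) for one image and one class.

    Assumption: each ground truth box can be matched by at most one
    prediction, and each prediction takes the unmatched ground truth
    box it overlaps most.
    """
    matched = set()
    tp = fp = 0
    for pred in predictions:
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(pred, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou > iou_threshold:
            tp += 1  # overlap above the threshold
            matched.add(best_idx)
        else:
            fp += 1  # detection without a sufficiently overlapping ground truth
    fn = len(ground_truths) - len(matched)  # ground truths never detected
    return tp, fp, fn


predictions = [(0, 0, 2, 2), (10, 10, 12, 12)]
ground_truths = [(0, 0, 2, 2), (5, 5, 7, 7)]
print(count_matches(predictions, ground_truths))
```

In this example the first prediction matches a ground truth exactly, the second overlaps nothing, and one ground truth box is left undetected.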

Then, the Precision and Recall for each class can be calculated.

Precision = TP/(TP+FP)
Recall = TP/(TP+FN)
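
As a small worked example of the two formulas, the following sketch computes both values from the counts. The function name `precision_recall` is hypothetical, and returning 0.0 for an empty denominator is an assumption made to avoid a division by zero.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# 8 correct detections, 2 spurious detections, 4 missed objects
precision, recall = precision_recall(tp=8, fp=2, fn=4)
print(precision, recall)
```

With these counts, 8 of the 10 detections are correct and 8 of the 12 ground truth objects are found.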

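Summarizing the Precision of each class over its range of Recall values gives the Average Precision (AP) of that class. Taking the mean of the AP values of all classes gives the overall mAP of the model.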
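
One common way to turn per-class precision and recall into a single score is sketched below. It is an assumption that the evaluation follows this variant (area under the precision-recall curve with all-point interpolation); the text does not state which interpolation was used. The function names and the (confidence, is_true_positive) input format are hypothetical.

```python
def average_precision(scored_hits, num_ground_truths):
    """AP of one class as the area under its precision-recall curve.

    scored_hits: list of (confidence, is_true_positive), one per detection.
    num_ground_truths: number of ground truth objects of this class.
    """
    if num_ground_truths == 0 or not scored_hits:
        return 0.0

    # Rank detections from most to least confident
    ranked = sorted(scored_hits, key=lambda hit: hit[0], reverse=True)

    tp_cum = fp_cum = 0
    precisions, recalls = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp_cum += 1
        else:
            fp_cum += 1
        precisions.append(tp_cum / (tp_cum + fp_cum))
        recalls.append(tp_cum / num_ground_truths)

    # Replace each precision by the best precision at any higher recall
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangles under the interpolated curve
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """mAP is the mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


hits = [(0.9, True), (0.8, False), (0.7, True)]
print(average_precision(hits, num_ground_truths=2))
print(mean_average_precision({"car": 0.8, "bus": 0.6}))
```

Ranking by confidence matters here: the same set of detections gives a higher AP when the correct ones carry the higher scores.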

Parameter sensitivity testing

The two parameters we focus on are the confidence threshold and the IoU threshold. The confidence threshold is the minimum confidence the model must have in a detected object for the detection to be kept. The IoU threshold is the minimum overlap required between the predicted bounding box and the ground truth bounding box for the prediction to count as correct.
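
The effect of the confidence threshold can be illustrated with a short sketch. The detection format (a dict with "label" and "confidence" keys), the function names, and the example values are all hypothetical; they are not taken from the project's code or results.

```python
def filter_by_confidence(detections, confidence_threshold):
    """Keep only detections whose confidence reaches the threshold."""
    return [det for det in detections
            if det["confidence"] >= confidence_threshold]


def sweep_confidence(detections, thresholds):
    """Count how many detections survive at each threshold."""
    return {t: len(filter_by_confidence(detections, t)) for t in thresholds}


# Made-up detections for illustration only
detections = [
    {"label": "car", "confidence": 0.9},
    {"label": "person", "confidence": 0.6},
    {"label": "bus", "confidence": 0.2},
]

for threshold, kept in sweep_confidence(detections, [0.1, 0.5, 0.95]).items():
    print(threshold, kept)
```

Raising the threshold removes low-confidence detections, which tends to trade false positives for false negatives; a sensitivity test repeats the evaluation at each setting to see where that trade-off is best.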
