Reimplementing the YOLO v3 algorithm and Deep SORT algorithm
Transfer learning on YOLO model
Evaluation of the YOLO model
Test the YOLO model and Deep SORT model
Implementation settings:
Language and library
Python 3.7
TensorFlow 2.3
opencv-python 4.1
Original dataset:
Test on images and videos
Detection with the YOLO model using the pre-trained weights
Tracking with the Deep SORT model, using the detections of the YOLO model as the input
The objective of transfer learning is to train a model that detects traffic participants more accurately. The model is initialized with the pre-trained weights.
The training steps are:
Initialize the model with pre-trained weights
The loss function of the YOLO model is the sum of three losses: regression loss, confidence loss, and classification loss.
Regression loss measures the error of the predicted box coordinates and is calculated only when the prediction box contains an object.
Confidence loss measures how well the model decides whether there is an object in the prediction box.
Classification loss measures how well the model decides which category the object in the prediction box belongs to.
Fig 5. Loss function
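The way the three losses combine can be illustrated with a minimal sketch. The exact formulas of Fig 5 are not reproduced here: squared error for the box coordinates and binary cross-entropy for the confidence and class terms are assumptions chosen for illustration, and the names bce and yolo_style_loss are hypothetical.

```python
import math


def bce(p, y, eps=1e-7):
    """Binary cross-entropy between a predicted probability p and a 0/1 target y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def yolo_style_loss(preds, targets):
    """Sum regression, confidence and classification losses over prediction boxes.

    preds:   list of {"box": [x, y, w, h], "conf": float, "cls": [probabilities]}
    targets: list of {"box": [x, y, w, h], "obj": 0 or 1, "cls": [0/1 labels]}
    Assumption: squared error for boxes, binary cross-entropy elsewhere.
    """
    reg = conf = cls = 0.0
    for p, t in zip(preds, targets):
        # Confidence loss is computed for every prediction box.
        conf += bce(p["conf"], t["obj"])
        # Regression and classification losses only count where an object exists.
        if t["obj"] == 1:
            reg += sum((a - b) ** 2 for a, b in zip(p["box"], t["box"]))
            cls += sum(bce(a, b) for a, b in zip(p["cls"], t["cls"]))
    return {
        "regression": reg,
        "confidence": conf,
        "classification": cls,
        "total": reg + conf + cls,
    }
```

Returning the three terms separately, along with their sum, makes it possible to plot each loss on its own during training.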
The following figures show the three losses and the total loss on the training set and the validation set during model training.
Fig 6. Losses of the training set and the validation set
EVALUATION
After the model training is completed, we evaluate the model. The main criterion we use is mean Average Precision (mAP).
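To calculate mAP, we should first understand the concept of Intersection over Union (IOU). IOU measures the overlap between the predicted bounding box and the ground truth bounding box: it is the area of their intersection divided by the area of their union, so it ranges from 0 (no overlap) to 1 (identical boxes). A threshold value is set for IOU to decide whether a prediction counts as a match.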
Fig 7. IOU calculation
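The IOU calculation can be sketched in a few lines of Python. The box format (x_min, y_min, x_max, y_max) and the function name iou are assumptions made for this example.

```python
def iou(box_a, box_b):
    """IOU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at 0 when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


# Two 2x2 boxes that share a 1x1 corner: intersection 1, union 4 + 4 - 1 = 7.
example = iou((0, 0, 2, 2), (1, 1, 3, 3))  # 1/7, about 0.143
```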
By comparing the observed IOU of each prediction with the threshold value, we can count True Positives (TP), False Positives (FP), and False Negatives (FN), which are three of the entries of a confusion matrix.
True Positive (TP): If the IOU of a predicted box with a ground truth box is greater than the threshold value
False Positive (FP): If the IOU of a predicted box with every ground truth box does not exceed the threshold value
False Negative (FN): If a ground truth object is present in the image but is not detected
Then, the Precision and Recall for each object class can be calculated.
Precision = TP/(TP+FP)
Recall = TP/(TP+FN)
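As a hedged illustration of how TP, FP, and FN lead to Precision and Recall, the sketch below matches predictions to ground truth boxes for one class. The greedy one-to-one matching, the default threshold of 0.5, and the names count_matches and precision_recall are assumptions for this example, not necessarily the procedure used in the project.

```python
def iou(box_a, box_b):
    """IOU of two boxes given as (x_min, y_min, x_max, y_max)."""
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_matches(pred_boxes, gt_boxes, iou_threshold=0.5):
    """Count TP, FP, FN for one class.

    Assumes pred_boxes is sorted by descending confidence and that each
    ground truth box can be matched by at most one prediction.
    """
    matched = set()
    tp = fp = 0
    for pred in pred_boxes:
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(gt_boxes):
            if idx in matched:
                continue
            overlap = iou(pred, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou > iou_threshold:
            tp += 1
            matched.add(best_idx)
        else:
            fp += 1
    fn = len(gt_boxes) - len(matched)  # ground truth boxes nobody detected
    return tp, fp, fn


def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP), Recall = TP/(TP+FN); 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

For example, with two ground truth boxes and two predictions of which only one overlaps a ground truth box well, the counts are TP = 1, FP = 1, FN = 1, so Precision and Recall are both 0.5.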
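The Average Precision (AP) of a class summarizes its Precision over all Recall levels (the area under the Precision-Recall curve). Calculating the mean value of the AP of each class, we get the overall mAP of the model.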
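A minimal sketch of this calculation is given below. It assumes each detection has already been labelled as a true or false positive, and it uses the all-point interpolation of the Precision-Recall curve, which is one common convention and may differ from the one used in the project. The names average_precision and mean_average_precision are hypothetical.

```python
def average_precision(detections, num_ground_truth):
    """AP for one class.

    detections: list of (confidence, is_true_positive) tuples.
    num_ground_truth: number of ground truth objects of this class.
    """
    if num_ground_truth == 0 or not detections:
        return 0.0

    # Walk through detections from most to least confident.
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in ordered:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Make precision non-increasing as recall grows (interpolated curve).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated Precision-Recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class):
    """mAP: mean of the per-class AP values, given as {class_name: ap}."""
    if not ap_per_class:
        return 0.0
    return sum(ap_per_class.values()) / len(ap_per_class)
```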
Parameter sensitivity testing
The two parameters we focus on are the confidence threshold and the IOU threshold. The confidence threshold is the minimum confidence the model must have in a detected object for the detection to be kept. The IOU threshold is the minimum overlap required between the predicted bounding box and the ground truth bounding box for a detection to count as correct.
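One simple way to probe the confidence threshold is to sweep it and count how many detections survive. The sketch below assumes detections are dictionaries with a "confidence" key, and the function names are hypothetical. A full sensitivity test would recompute mAP at each setting.

```python
def filter_by_confidence(detections, conf_threshold):
    """Keep detections whose confidence is at least conf_threshold."""
    return [d for d in detections if d["confidence"] >= conf_threshold]


def sweep_confidence(detections, thresholds):
    """Number of detections kept at each confidence threshold."""
    return {t: len(filter_by_confidence(detections, t)) for t in thresholds}


# Hypothetical detections for illustration only.
detections = [
    {"label": "car", "confidence": 0.9},
    {"label": "person", "confidence": 0.6},
    {"label": "car", "confidence": 0.3},
]
kept = sweep_confidence(detections, [0.25, 0.5, 0.75])
# kept == {0.25: 3, 0.5: 2, 0.75: 1}
```

Raising the confidence threshold removes low-confidence detections, which tends to trade Recall for Precision.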