Conclusions
Overall, we find that transfer learning is efficient and effective because the model does not need to be trained from scratch. It also improves detection performance for some classes in the chosen dataset.
Problems
The major problem is poor generalizability. When the model is tested on a new dataset, misclassifications may occur and confidence scores are generally lower. Training also takes a long time if no GPU is used.
In addition, real-time detection and tracking is not realistic on our machine. With a GPU, the program runs at about 15 frames per second (FPS); with a CPU only, it runs at about 5 FPS. Neither is fast enough to analyze data from the cameras on autonomous vehicles, which record at 30 FPS or more.
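The gap to real-time operation can be made concrete with a little arithmetic on the frame rates quoted above. The sketch below is only an illustration: the helper names `frame_budget_ms` and `realtime_gap` are hypothetical and not part of our program, and the 30 FPS camera rate is taken as the minimum stated above.

```python
def frame_budget_ms(fps: float) -> float:
    """Time per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def realtime_gap(processing_fps: float, camera_fps: float = 30.0) -> dict:
    """Compare a measured processing rate with the camera frame rate.

    camera_fps defaults to 30, the minimum rate assumed for the vehicle cameras.
    """
    return {
        "per_frame_ms": frame_budget_ms(processing_fps),  # time we spend per frame
        "budget_ms": frame_budget_ms(camera_fps),         # time we are allowed per frame
        "speedup_needed": camera_fps / processing_fps,    # factor required to keep up
        "keeps_up": processing_fps >= camera_fps,
    }


# Approximate rates reported above: about 15 FPS on GPU, about 5 FPS on CPU.
for label, fps in [("GPU", 15.0), ("CPU", 5.0)]:
    gap = realtime_gap(fps)
    print(
        f"{label}: {gap['per_frame_ms']:.1f} ms per frame, "
        f"budget {gap['budget_ms']:.1f} ms, "
        f"needs {gap['speedup_needed']:.1f}x speedup"
    )
```

With these numbers, the GPU run spends about 66.7 ms per frame against a budget of about 33.3 ms, so it would need roughly a 2x speedup; the CPU run would need roughly 6x.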
Future work
As a next step, we will train with more data to improve overall performance, and train on different datasets to improve generalizability. If time allows, we will also train on video data, which captures continuous actions.