The code for all three models and the trained weights are provided in the GitHub repository. All models were trained on an Nvidia GTX 1080 on Euler, the computing cluster of the Wisconsin Applied Computing Center at UW-Madison.
We trained the Faster R-CNN model using the implementation provided by ChainerCV. The loss curve is shown below:
We trained the SSD model with Keras, using TensorFlow as the backend computing engine. The original code is provided by the ssd_keras project. The training loss is shown below:
We trained the YOLO model with Keras, using TensorFlow as the backend computing engine. The original code is provided by the keras-yolo3 project. The training loss is shown below:
The left one is the ground truth, while the one on the right is the prediction. The accuracy of the bounding box location for this model is high.
The left one is a bad detection example, while the one on the right is a good example.
In our case, this model also predicted the object very precisely: the example below almost exactly matches the labeled bounding box.
In this step, for each patient, we build a matrix from the bounding boxes that all three models predicted on all images belonging to that patient.
Voting matrix examples for three test patients are shown below:
Note that the brightness of each point in the matrices above represents the IoU between a pair of bounding boxes: the lighter the point, the larger the IoU. After voting, we obtained the correct bleeding site for every patient. Compared with the results of the three individual models (especially SSD), the voted result is highly precise for all patients. Some SSD predictions were also used in the final voting, so the result is not dominated by the other two models.
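As an illustration of how such a voting matrix could be built, the following is a minimal sketch and not the code from our repository. It assumes boxes are given as (x_min, y_min, x_max, y_max) tuples. The function names and the selection rule in `vote` (pick the box with the largest total IoU with all other boxes) are hypothetical choices made for this example only.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_matrix(boxes):
    """Pairwise IoU matrix (list of lists) for one patient's boxes."""
    return [[iou(a, b) for b in boxes] for a in boxes]


def vote(boxes):
    """Hypothetical rule: index of the box that agrees most with the others."""
    matrix = iou_matrix(boxes)
    # Sum each row while skipping the diagonal (a box always matches itself).
    scores = [
        sum(value for j, value in enumerate(row) if j != i)
        for i, row in enumerate(matrix)
    ]
    return max(range(len(boxes)), key=lambda i: scores[i])


# Made-up boxes: three overlapping predictions and one outlier.
example_boxes = [
    (0, 0, 10, 10),
    (1, 1, 11, 11),
    (2, 2, 12, 12),
    (50, 50, 60, 60),
]
winner = vote(example_boxes)
```

In this sketch the outlier box has zero IoU with every other box, so it can never win the vote, while the box in the middle of the overlapping group collects the highest score.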