Number Tracking

In a football match, the key features for differentiating individual players are temporal features and player locations. Beyond these, the t-shirt number is an important cue for identifying each player. However, because of the nature of the video recording, detecting the t-shirt number is difficult: the numbers appear at low resolution (around 15x15 pixels). In addition, the t-shirt deforms during play, which directly affects the visibility of the number. To address these challenges, a number detection model with YOLOv3 as the backbone is trained on a custom dataset.

Digits to Numbers

The image dataset used for the digit detection model is a subset of the football data used for feature extraction. The ground-truth bounding boxes of the players are reused to generate an initial database of cropped player images from the football match videos. This initial database contains players in different orientations and at various locations on the football field. The images used for the digit detection model were handpicked from this initial database by human annotators and then annotated with the digits 0-9 (10 classes).

During inference, the trained digit model predicts the digits in each player image and returns a bounding box for each one. With this information we can identify the digits on a player's t-shirt and locate each digit in the image, but we get no information about the order of the digits. For example, when the model predicts the digits 1, 7 and 3, the order is unknown: the number could be 13, 17, 73, 37, and so on. An algorithm is therefore needed to combine these digits into numbers. An overview of the algorithm is shown in the figure below, where digit 1 lies in the pairing zone of 7, while 3 lies in the non-pairing zone. Hence 7 is paired with 1 for further processing, and 3 is treated as a single digit in this example.
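The pairing step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Digit` record, the rectangular shape of the pairing zone, and the `x_factor` and `y_factor` thresholds are all assumptions made for the example, and digits are read left to right by their box centres.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Digit:
    """One detected digit (hypothetical record for this sketch)."""
    value: int  # predicted class, 0-9
    x: float    # bounding-box centre, x
    y: float    # bounding-box centre, y
    w: float    # bounding-box width
    h: float    # bounding-box height


def in_pairing_zone(a: Digit, b: Digit,
                    x_factor: float = 1.5, y_factor: float = 0.5) -> bool:
    """Assumed zone: b's centre lies within x_factor box widths horizontally
    and y_factor box heights vertically of a's centre."""
    return abs(a.x - b.x) <= x_factor * a.w and abs(a.y - b.y) <= y_factor * a.h


def digits_to_numbers(digits: List[Digit]) -> List[int]:
    """Group detected digits into one- or two-digit numbers."""
    remaining = sorted(digits, key=lambda d: d.x)  # read left to right
    numbers = []
    while remaining:
        first = remaining.pop(0)
        # Look for the nearest digit to the right that falls in the pairing zone.
        partner = next((d for d in remaining if in_pairing_zone(first, d)), None)
        if partner is None:
            numbers.append(first.value)  # non-pairing zone: single digit
        else:
            remaining.remove(partner)
            numbers.append(10 * first.value + partner.value)
    return numbers


# Digits 1 and 7 sit side by side; 3 is far away and stays a single digit.
detections = [
    Digit(3, 70, 60, 12, 15),
    Digit(7, 24, 21, 12, 15),
    Digit(1, 10, 20, 12, 15),
]
print(digits_to_numbers(detections))
```

In practice the zone size would need tuning to the typical digit spacing on the t-shirts, since the digit boxes are only around 15 pixels wide.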

Linking and Tracking Numbers

Since the digits are identified on the t-shirts of players who are in constant motion, the digits do not always face the camera. The orientation of the players also keeps changing, so there is a good chance that the model sees only one of the two digits on a t-shirt. As a result, a number detected with high confidence does not always match the true number on the t-shirt. Deformation of the t-shirt during gameplay makes it quite likely that the model predicts a wrong number with high confidence (a false positive).

Additional data can be generated by comparing the previous and current bounding boxes to determine the relationship between detections, and the result can be used when assigning a number to the corresponding track, as shown in Figure (left) for two players. Because player orientation changes very frequently during football gameplay, some misclassified numbers (false positives) are to be expected. Such an error is hard to identify from a single detection, so the previous detections of the same player are needed to provide some degree of confidence in the classification of a detected number. Figure (right) shows the different numbers detected for 22 players, with the player index on one axis and the number of frames on the other.
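One way to picture the idea of using a player's detection history is the sketch below. It is an illustration under stated assumptions, not the method used here: intersection over union (IoU) stands in for the relationship between previous and current bounding boxes, and a confidence-weighted vote per track stands in for the number assignment. The `NumberVoter` class and the `min_votes` threshold are hypothetical names introduced for the example.

```python
from collections import defaultdict
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; a simple measure of how well
    a current detection overlaps a previous one."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class NumberVoter:
    """Accumulates number detections per track (hypothetical helper)."""

    def __init__(self, min_votes: int = 3):
        self.min_votes = min_votes  # assumed threshold before trusting a number
        self.scores = defaultdict(lambda: defaultdict(float))
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, track_id: int, number: int, confidence: float) -> None:
        """Record one detected number for a track."""
        self.scores[track_id][number] += confidence
        self.counts[track_id][number] += 1

    def current_number(self, track_id: int) -> Optional[int]:
        """Return the best-supported number, or None if evidence is too thin."""
        if track_id not in self.scores:
            return None
        scores = self.scores[track_id]
        best = max(scores, key=scores.get)
        if self.counts[track_id][best] < self.min_votes:
            return None
        return best


voter = NumberVoter(min_votes=3)
for _ in range(3):
    voter.update(track_id=1, number=17, confidence=0.6)
# A single high-confidence false positive (only one digit visible).
voter.update(track_id=1, number=7, confidence=0.9)
print(voter.current_number(1))
```

Summing confidences over many frames lets repeated moderate detections outweigh an isolated high-confidence false positive, which is the behaviour the paragraph above asks for.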

Results: Digit Detection, Digit2Number, Number Tracking

ac5_test.avi