Visual Object Tracking

Introduction

Visual Object Tracking (VOT) is the process of locating a region of interest across the frames of a video sequence. It consists of four sequential elements: target initialization, appearance modelling, motion prediction, and target positioning. Target initialization annotates the object position, or region of interest, with one of the following representations: object bounding box, ellipse, centroid, object skeleton, object contour, or object silhouette. Usually, an object bounding box is provided in the initial frame of a sequence and the tracking algorithm estimates the target position in the remaining frames. Appearance modelling identifies visual features that represent the region of interest well and builds mathematical models that detect the object using learning techniques. Motion prediction estimates the target location in subsequent frames. Target positioning involves maximum a posteriori prediction or greedy search. Constraints imposed on the appearance and motion models can simplify the tracking problem. During tracking, new target appearance is integrated by updating the appearance and motion models.
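The four elements can be illustrated with a toy tracking loop. This is a minimal sketch and not one of the trackers evaluated in the paper. It assumes grayscale frames stored as nested lists, uses the raw image patch as the appearance model, a fixed search radius around the previous position in place of a motion model, and a greedy sum-of-squared-differences search for target positioning. All names and parameters (`crop`, `ssd`, `track`, `radius`, `lr`) are hypothetical.

```python
def crop(frame, x, y, w, h):
    """Return the w-by-h patch whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in frame[y:y + h]]


def ssd(a, b):
    """Sum of squared differences between two equally sized patches."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def track(frames, init_box, radius=2, lr=0.0):
    """Toy tracker: template appearance model plus greedy local search.

    frames   -- list of 2-D lists of grayscale values
    init_box -- (x, y, w, h) of the target in the first frame
    radius   -- search radius in pixels (stands in for a motion model)
    lr       -- template update rate; 0 keeps the initial template
    """
    x, y, w, h = init_box
    template = crop(frames[0], x, y, w, h)  # target initialization
    boxes = [init_box]
    for frame in frames[1:]:
        rows, cols = len(frame), len(frame[0])
        best = None
        # Motion prediction: only consider positions near the last one.
        for cy in range(max(0, y - radius), min(rows - h, y + radius) + 1):
            for cx in range(max(0, x - radius), min(cols - w, x + radius) + 1):
                cost = ssd(template, crop(frame, cx, cy, w, h))
                # Target positioning: greedy choice of the lowest cost.
                if best is None or cost < best[0]:
                    best = (cost, cx, cy)
        _, x, y = best
        boxes.append((x, y, w, h))
        # Model update: blend the new appearance into the template.
        patch = crop(frame, x, y, w, h)
        template = [[(1 - lr) * t + lr * p for t, p in zip(rt, rp)]
                    for rt, rp in zip(template, patch)]
    return boxes
```

Calling `track(frames, (x, y, w, h))` returns one `(x, y, w, h)` box per frame, starting with the initial box. The sketch assumes the initial box lies fully inside the frame.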

Handcrafted and Deep Trackers: Recent Visual Object Tracking Approaches and Trends

Abstract: In recent years visual object tracking has become a very active research area. An increasing number of tracking algorithms are being proposed each year. It is because tracking has wide applications in various real world problems such as human-computer interaction, autonomous vehicles, robotics, surveillance and security just to name a few. In the current study, we review latest trends and advances in the tracking area and evaluate the robustness of different trackers based on the feature extraction methods. The first part of this work comprises a comprehensive survey of the recently proposed trackers. We broadly categorize trackers into Correlation Filter based Trackers (CFTs) and Non-CFTs. Each category is further classified into various types based on the architecture and the tracking mechanism. In the second part, we experimentally evaluated 24 recent trackers for robustness, and compared handcrafted and deep feature based trackers. We observe that trackers using deep features performed better, though in some cases a fusion of both increased performance significantly. In order to overcome the drawbacks of the existing benchmarks, a new benchmark Object Tracking and Temple Color (OTTC) has also been proposed and used in the evaluation of different algorithms. We analyze the performance of trackers over eleven different challenges in OTTC, and three other benchmarks. Our study concludes that Discriminative Correlation Filter (DCF) based trackers perform better than the others. Our study also reveals that inclusion of different types of regularizations over DCF often results in boosted tracking performance. Finally, we sum up our study by pointing out some insights and indicating future trends in visual object tracking field.

Object Tracking and Temple Color Benchmark

The OTTC benchmark contains the unique sequences from the OTB2015 and TempleColor-128 benchmarks: 186 sequences in total, distributed over the same 11 challenges used in both OTB2015 and TempleColor-128.

The benchmark can be downloaded from this link.

How to use the benchmark

The folder of each sequence contains the following files:

  • a directory containing the original image sequence.
  • groundtruth_rect.txt, which contains the ground truth. The format of each bounding box is [target_top_left_x,target_top_left_y,target_width,target_height].
  • [name]_att.txt, which lists the challenge factors of the sequence.
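As a rough illustration, the ground-truth file could be read with a small helper like the one below. It is a sketch under assumptions: the function name `load_ground_truth` is hypothetical, each line is assumed to hold one bounding box, and since only the order of the four fields is documented above, the parser accepts commas, tabs, or spaces as separators.

```python
import re
from pathlib import Path


def load_ground_truth(path):
    """Read one bounding box per line as (x, y, width, height) floats."""
    boxes = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: values may be separated by commas, tabs, or spaces.
        values = [float(v) for v in re.split(r"[,\s]+", line) if v]
        if len(values) != 4:
            raise ValueError(f"expected 4 values, got {len(values)}: {line!r}")
        boxes.append(tuple(values))
    return boxes
```

For a sequence folder, `load_ground_truth("<sequence>/groundtruth_rect.txt")` would return a list with one `(x, y, width, height)` tuple per annotated frame.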

Evaluation

We tested 24 trackers on the OTTC benchmark using three evaluation protocols: precision, success, and speed in frames per second.
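This README does not define precision and success. The sketch below assumes the definitions commonly used with OTB-style benchmarks: precision is the fraction of frames whose center location error is within a pixel threshold (20 pixels is a common choice), and success is the fraction of frames whose bounding-box overlap (intersection over union) exceeds a threshold, often summarized as the average over thresholds from 0 to 1. The function names and default thresholds are assumptions and may differ from the evaluation code used for the paper.

```python
import math


def center_error(a, b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx, by = b[0] + b[2] / 2, b[1] + b[3] / 2
    return math.hypot(ax - bx, ay - by)


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    iw = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    ih = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    inter = max(0.0, iw) * max(0.0, ih)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def precision(pred, gt, threshold=20.0):
    """Fraction of frames whose center error is within `threshold` pixels."""
    hits = sum(center_error(p, g) <= threshold for p, g in zip(pred, gt))
    return hits / len(gt)


def success(pred, gt, threshold=0.5):
    """Fraction of frames whose overlap exceeds `threshold`."""
    hits = sum(iou(p, g) > threshold for p, g in zip(pred, gt))
    return hits / len(gt)


def success_auc(pred, gt, steps=21):
    """Mean success rate over `steps` evenly spaced thresholds in [0, 1]."""
    thresholds = [i / (steps - 1) for i in range(steps)]
    return sum(success(pred, gt, t) for t in thresholds) / steps
```

Here `pred` and `gt` are equal-length lists of `(x, y, width, height)` boxes, one per frame, in the same format as groundtruth_rect.txt.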

Results can be downloaded here.

Tracker Speed Comparison:

  • For a fair speed comparison, all experiments were performed on the same computer, with an Intel Core i5 3.40 GHz CPU and 8 GB of RAM. A GeForce GTX 650 GPU was used for the deep trackers.
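Speed in frames per second can be estimated by timing a tracker over a whole sequence. The helper below is a minimal sketch of that idea, not the measurement code used for the reported results. `measure_fps` and `tracker_step` are hypothetical names, and `tracker_step` stands for any callable that processes a single frame.

```python
import time


def measure_fps(tracker_step, frames):
    """Average frames per second of `tracker_step` over `frames`."""
    start = time.perf_counter()
    for frame in frames:
        tracker_step(frame)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return len(frames) / elapsed if elapsed > 0 else float("inf")
```

Timing the full loop rather than individual frames averages out per-frame jitter. Image loading should be kept inside or outside `tracker_step` consistently for all trackers being compared.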

@article{fiaz@2019handcrafted,
  title={Handcrafted and Deep Trackers: Recent Visual Object Tracking Approaches and Trends},
  author={Fiaz, Mustansar and Mahmood, Arif and Javed, Sajid and Jung, Soon Ki},
  journal={ACM Computing Surveys},
  year={2019},
  publisher={ACM}
}