Comparisons in terms of success plots on AVisT. The AUC scores are given in the legend.
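For readers unfamiliar with the metric, a success plot shows, for each overlap threshold between 0 and 1, the fraction of frames whose predicted box overlaps the ground truth (IoU) by more than that threshold; the AUC is the area under that curve. The following is a minimal sketch of the computation, not the benchmark's official evaluation code. The function names and the choice of 21 evenly spaced thresholds are assumptions made for illustration, and the handling of frames where the target is absent is deliberately left out.

```python
def iou_xywh(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlap rectangle, clamped at zero.
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    # Degenerate (zero-area) boxes are treated as no overlap.
    return inter / union if union > 0 else 0.0


def success_curve(ious, num_thresholds=21):
    """Return (thresholds, success rates) for a list of per-frame IoUs.

    num_thresholds=21 (steps of 0.05) is an assumed, illustrative default.
    """
    if not ious:
        raise ValueError("ious must contain at least one frame")
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    n = len(ious)
    rates = [sum(1 for v in ious if v > t) / n for t in thresholds]
    return thresholds, rates


def success_auc(ious, num_thresholds=21):
    """Approximate the area under the success curve as its mean value."""
    _, rates = success_curve(ious, num_thresholds)
    return sum(rates) / len(rates)
```

Per-frame IoUs can be obtained by calling `iou_xywh` on each predicted and annotated box pair, then passed to `success_auc` to get a single score per sequence.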
AVisT is a dedicated visual object tracking dataset that covers a variety of adverse scenarios highly relevant to real-world applications. Because visibility is degraded in these scenarios, the dataset poses additional challenges for tracker design.
AVisT covers 18 diverse scenarios: rain, fog, hurricane, fire, sun glare, low-light, archival videos, fast motion, distractor objects, occlusion, snow, sandstorm, tornado, smoke, splashing water, camouflage, small objects, and deformation. These scenarios are grouped into five attributes: weather conditions, obstruction effects, imaging effects, target effects, and camouflage.
The AVisT dataset comprises 120 challenging videos sourced from YouTube under the Creative Commons license, with a total of approximately 80k (79653) annotated frames. The frame rates of these videos range from 24 to 30 frames per second (fps), and the average sequence length is 664 frames (about 22.1 seconds at 30 fps). The shortest sequence in the dataset has 99 frames (3.3 seconds at 30 fps), while the longest has 3113 frames (about 103.8 seconds at 30 fps).
Directory Structure
root folder
  > sequences
    > folders containing the frames of each video sequence
  > anno
    > text files containing the bounding boxes for each sequence (x, y, w, h)
  > full_occlusion
    > text files containing the full-occlusion flag for each frame (0 or 1)
  > out_of_view
    > text files containing the out-of-view flag for each frame (0 or 1)
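The layout above can be read with a few lines of Python. The sketch below is illustrative only and makes several assumptions not stated in this README: that each annotation file is named after its sequence (`<name>.txt`, `<name>_full_occlusion.txt`, `<name>_out_of_view.txt`), and that values are separated by commas or whitespace. The function names are hypothetical; adjust the file-name patterns if your copy of the dataset differs.

```python
import re
from pathlib import Path

# Split on commas and/or whitespace, since the delimiter is not documented here.
_SEP = re.compile(r"[,\s]+")


def read_boxes(path):
    """Read one (x, y, w, h) bounding box per non-empty line."""
    boxes = []
    for line in Path(path).read_text().splitlines():
        parts = [p for p in _SEP.split(line.strip()) if p]
        if not parts:
            continue
        x, y, w, h = (float(p) for p in parts[:4])
        boxes.append((x, y, w, h))
    return boxes


def read_flags(path):
    """Read per-frame 0/1 flags, one per line or all on a single line."""
    tokens = [t for t in _SEP.split(Path(path).read_text()) if t]
    return [int(float(t)) == 1 for t in tokens]


def load_sequence(root, name):
    """Load annotations for one sequence (file names are assumed, see above)."""
    root = Path(root)
    boxes = read_boxes(root / "anno" / f"{name}.txt")
    occluded = read_flags(root / "full_occlusion" / f"{name}_full_occlusion.txt")
    out_of_view = read_flags(root / "out_of_view" / f"{name}_out_of_view.txt")
    if not (len(boxes) == len(occluded) == len(out_of_view)):
        raise ValueError(f"annotation lengths differ for sequence {name!r}")
    # A frame counts as visible when neither flag is set.
    visible = [not (o or v) for o, v in zip(occluded, out_of_view)]
    return {
        "boxes": boxes,
        "full_occlusion": occluded,
        "out_of_view": out_of_view,
        "visible": visible,
    }
```

Combining the two flags into a single `visible` list is a design choice of this sketch; it makes it easy to skip frames where the target is absent when scoring a tracker.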
@article{noman2022avist,
  title   = {AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility},
  author  = {Noman, Mubashir and Ghallabi, Wafa Al and Najiha, Daniya and Mayer, Christoph and Dudhane, Akshay and Danelljan, Martin and Cholakkal, Hisham and Khan, Salman and Van Gool, Luc and Khan, Fahad Shahbaz},
  journal = {33rd British Machine Vision Conference},
  year    = {2022}
}