ICPR 2024 Challenge on VISual Tracking in Adverse Conditions
(VISTAC)

27th International Conference on Pattern Recognition (ICPR), December 01-05, 2024, Kolkata, India

ANNOUNCEMENT TO ALL PARTICIPANTS

Test Dataset is live  


Introduction


The evolution of video databases is crucial in understanding complex spatiotemporal dynamics and extracting semantic content from video data. Understanding "Spatiotemporal Semantic Content" means deciphering how objects and actions within a scene evolve in meaning and significance across time and space. This is vital in video surveillance, where interpreting physical movements and contextual behaviors – such as flagging suspicious activities based on temporal patterns – is paramount.

Nighttime video analysis presents unique challenges due to low visibility, haze, and poor lighting conditions.  Sophisticated algorithms are essential for extracting actionable insights in these difficult scenarios.  These advancements directly benefit surveillance, security, and autonomous navigation systems. Despite the progress fueled by deep learning and large datasets, a critical gap exists: the lack of specialized, publicly accessible datasets focused on nighttime conditions. To address this need, we introduce the Night Vision Spatiotemporal Infrared-Video Dataset (NV-SID). NV-SID comprises 100 meticulously annotated nighttime infrared videos, providing accurate ground truth data for object tracking in nocturnal environments.

The NV-SID can significantly impact the field of nighttime video analysis. By providing a high-quality, specialized dataset, it enables researchers to develop and refine algorithms tailored to the unique challenges of nighttime conditions, directly contributing to the improvement of nighttime surveillance and navigation systems. We also present the Qualitative Precision (QP) metric to establish a new benchmark for evaluating machine learning-based object-tracking algorithms. QP is designed to assess the accuracy and reliability of algorithms operating within the demanding context of nighttime video analysis. This initiative will drive progress by giving researchers a robust tool to evaluate and enhance their technological capabilities.


Competition Outline

Over the years, object tracking has predominantly been conducted on videos captured in natural or well-lit environments. This focus has led to significant advancements in tracking algorithms, enhancing their robustness across various challenging scenarios. However, low-light and nighttime object tracking remains largely unexplored. Furthermore, while infrared (IR) imaging offers promising advantages for object tracking, owing to its lower sensitivity to lighting conditions and appearance variability, its potential is not yet fully harnessed. Research on object tracking using IR videos is significantly less developed than research based on visible-spectrum imaging.

Historically, only a handful of studies have effectively combined spatiotemporal information extracted from IR videos for object tracking. Recognizing the need to fill this gap, our project aims to leverage IR video for object tracking, making this dataset a foundational step towards developing specialized algorithms for single-object tracking in nighttime environments aided by infrared video data.

The NV-SID contains 100 videos, of which 80 videos with ground truth will be published for training and validation on May 5th, 2024. The remaining 20 videos will be published without any ground truth on May 25th, 2024.

The ground truths are in (X1, Y1, W, H, IQA) format, where (X1, Y1) are the top-left coordinates of the object's bounding box and (W, H) are the width and height of the bounding box, respectively. IQA represents the visual quality of the frame, given as a fraction between 0 and 1, where a higher value denotes greater image distortion. The IQA value is provided as an extra tool that participants may use, at their own discretion, to design their tracking models and/or enhance tracking performance.
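
For illustration only, the minimal Python sketch below shows one way to read such annotations. It assumes that a published ground-truth file (e.g. "train.json") maps each sequence name to a list of per-frame [X1, Y1, W, H, IQA] entries; participants should verify this against the actual files, and the IQA threshold shown is purely hypothetical.

```python
import json

# Minimal sketch: assumes a ground-truth JSON file (e.g. "train.json") mapping
# each sequence name to a list of per-frame [X1, Y1, W, H, IQA] annotations;
# verify the exact structure against the actual dataset files.
with open("train.json", "r") as f:
    ground_truth = json.load(f)

for sequence_name, annotations in ground_truth.items():
    for frame_idx, (x1, y1, w, h, iqa) in enumerate(annotations):
        # (x1, y1): top-left corner of the bounding box; (w, h): its width and height.
        # iqa: frame-quality score, where a higher value indicates more distortion.
        if iqa > 0.8:  # threshold chosen purely for illustration
            continue   # e.g. skip or de-emphasize heavily distorted frames
```
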

Participants in the competition are required to submit their tracking results as a .json file named "submission.json". For each video sequence, the file should contain the tracking results as a nested list = [[X1,Y1,W1,H1],[X2,Y2,W2,H2],....], where the subscripts 1, 2, ... denote the frame number within the sequence. Hence, the "submission.json" file content shall be as follows:

{ "<sequence-1>": [[X1,Y1,W1,H1],[X2,Y2,W2,H2],....], "<sequence-2>": [[X1,Y1,W1,H1],[X2,Y2,W2,H2],....], ...... , "<sequence-n>": [[X1,Y1,W1,H1],[X2,Y2,W2,H2],....] }

For examples, refer to the dataset link and view the "train.json" and "validation.json" files.
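
As a rough sketch of assembling a file in this format, the snippet below writes a "submission.json" with one entry per sequence. The run_tracker function, sequence names, and frame counts are hypothetical placeholders, to be replaced by the participant's own tracker and the actual test sequences.

```python
import json

# Minimal sketch of assembling "submission.json" in the required format.
# `run_tracker` and the sequence names/lengths below are hypothetical placeholders.
def run_tracker(sequence_name, num_frames):
    # Dummy tracker: returns one fixed [X, Y, W, H] box per frame, for illustration only.
    return [[0, 0, 10, 10] for _ in range(num_frames)]

test_sequences = {"<sequence-1>": 100, "<sequence-2>": 150}  # placeholder names and lengths

submission = {}
for name, num_frames in test_sequences.items():
    boxes = run_tracker(name, num_frames)  # [[X1,Y1,W1,H1], [X2,Y2,W2,H2], ...]
    submission[name] = [[int(x), int(y), int(w), int(h)] for x, y, w, h in boxes]

with open("submission.json", "w") as f:
    json.dump(submission, f)
```
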

If a team makes more than one submission, the organizers will consider only the last submission as final.

The winner will be the team achieving the highest Qualitative Precision (QP) performance on the test sequences. Each team is required to submit the following:

a) Tracking results for the test videos. 

b) A concise one-page report outlining the methodology employed in the tracking algorithm.

Please note that the proposed machine learning model must be trained solely on the training dataset of the NV-SID; fine-tuning of pre-trained machine learning models is not permitted.

Fig 1: Task Chart of the competition

Competition Objectives

This competition is designed to achieve several key goals: 

Please consider citing our dataset if you use NVISOT for your research!

For queries and suggestions, contact us at: nvisot.ju.etce@gmail.com