The Night Vision Spatiotemporal Infrared-Video Dataset (NV-SID) aims to provide a dedicated platform for testing state-of-the-art trackers in night vision conditions. NV-SID features:
The dataset contains 100 video sequences comprising around 40,000 densely annotated frames, captured with a 30 fps infrared camera.
All annotations were made manually, frame by frame, and cross-checked to minimize errors.
Several practical challenges, such as lower resolution, limited quality of the recording equipment, and camera instability, have been incorporated to simulate real-world conditions.
A large number of objects from a wide range of classes are tracked, ranging from human body parts such as the face, torso, and hands to everyday objects such as ping pong balls, Beyblades, books, and cups.
Care has been taken to include visual challenges like occlusion, motion blur, fast motion, background clutter, in-plane rotation, out-of-plane rotation, object morphing, scale variation, illumination variation, etc.
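As a rough sanity check on the figures quoted above, the short sketch below derives the average sequence length and total footage from the stated numbers (100 sequences, around 40,000 frames, 30 fps). The function name `dataset_summary` and the dictionary keys are hypothetical and not part of NV-SID; because the frame count is approximate, the results are estimates only.

```python
# Figures quoted in the dataset description; the frame count is approximate.
NUM_SEQUENCES = 100
TOTAL_FRAMES = 40_000
FPS = 30


def dataset_summary(num_sequences, total_frames, fps):
    """Derive average per-sequence and total durations (hypothetical helper)."""
    frames_per_sequence = total_frames / num_sequences
    seconds_per_sequence = frames_per_sequence / fps
    total_minutes = total_frames / fps / 60
    return {
        "frames_per_sequence": frames_per_sequence,
        "seconds_per_sequence": seconds_per_sequence,
        "total_minutes": total_minutes,
    }


summary = dataset_summary(NUM_SEQUENCES, TOTAL_FRAMES, FPS)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions, an average sequence has about 400 frames (roughly 13 seconds), and the whole dataset amounts to about 22 minutes of footage. Individual sequences may well differ from this average.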
A few annotation examples from NV-SID