Few-Shot Moving-Object Detection and Background Subtraction for Passive Visual Monitoring in Underwater Scenes
Passive Visual Monitoring (PVM) increasingly relies on camera-based systems to study wildlife at scale while minimizing field disturbance. In terrestrial habitats, camera traps deliver multi-year, multi-species image streams; underwater platforms capture long videos but suffer severe optical degradations. We present a unified, lightweight pipeline tailored to both settings that builds on four complementary components from our prior work: robust foreground proposal via background subtraction (BgSub), motion-aware change discrimination using a Siamese design (MODSiam), precise instance masks through a multiscale output ensemble Y-network (MODY-Net), and an exemplar-guided, attention-based few-shot module for rare-species detection in low-label regimes. Concretely, BgSub provides spatiotemporal proposals resilient to illumination flicker, vegetation sway, and underwater clutter; MODSiam suppresses false positives from dynamic backgrounds (e.g., foliage, caustics); MODY-Net refines object boundaries to enable downstream measurements (size/pose/behavior cues); and the few-shot attention branch scales identification to the long-tail taxa typical of ecological surveys. We demonstrate the framework on underwater videos (e.g., Fish4Knowledge), reporting visual results and qualitative analyses aligned with PVM goals. To mitigate aquatic color cast and blur, we optionally pre-enhance inputs with underwater image enhancement methods developed on established benchmarks (UIEB, EUVP), improving visibility of fine anatomical cues before detection and segmentation.
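As a minimal illustration of the proposal stage only, the sketch below uses OpenCV's MOG2 background subtractor as a generic stand-in for BgSub; the clip path, area threshold, and morphological cleanup are illustrative assumptions (OpenCV >= 4 is assumed), and the Siamese discriminator, Y-network segmentation, and few-shot branch described above are not shown.

```python
# Sketch: generate per-frame foreground proposals from an underwater clip.
# MOG2 is used here as a generic background-subtraction stand-in, not the
# paper's BgSub component; thresholds and kernel size are placeholder values.
import cv2

cap = cv2.VideoCapture("fish4knowledge_clip.mp4")  # hypothetical input path
bg = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                        detectShadows=True)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = bg.apply(frame)                                   # raw foreground mask
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)  # drop shadow pixels (value 127)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)     # suppress speckle from caustics
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Keep only sufficiently large blobs as spatiotemporal proposals.
    boxes = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 100]

cap.release()
```

In the full pipeline, proposals such as `boxes` would then be filtered by the motion-aware discriminator (MODSiam) to reject dynamic-background responses and refined into instance masks by the segmentation network (MODY-Net).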