Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL

Fangwei Zhong*, Kui Wu*, Hai Ci, Churan Wang, Hao Chen

ECCV 2024

The Collected Multi-Level Data in Complex Room

Noise Level 1

Noise Level 2

Noise Level 3

Noise Level 4

Tracking in Complex Room

Tracking in High-fidelity Environments

Tracking Unseen Targets

Tracking in Real World

Follow a Woman in a Dark Parking Lot

Follow a Cat in a Dark Parking Lot

Follow a Man in an Indoor Room with Obstacles (1)

Passive Tracking on Real-world Videos (VOT)

Citation

@inproceedings{zhong2024empowering,
  title={Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL},
  author={Zhong, Fangwei and Wu, Kui and Ci, Hai and Wang, Churan and Chen, Hao},
  booktitle={European Conference on Computer Vision},
  year={2024}
}