Research Professor – Computer Vision (04.2025–present)
Project title: Accurate Autonomous Tracking Control by Unmatched Disturbance Compensator with Reference Re-design Filter
Developed advanced Siamese-based tracking systems integrating attention mechanisms, localization quality estimation, and distribution-based regression to improve tracking robustness in real-world video streams.
Designed and implemented TSDTrack, a Transformer-enhanced tracker achieving stable and accurate performance under occlusion, scale variation, and motion blur.
Leading the creation of DynaTrack, a dynamic memory-augmented transformer tracker with a hybrid CNN-ViT backbone and deformable cross-attention, built for robustness in complex and long-duration tracking scenarios.
Engineered complete training/testing pipelines in PyTorch, supporting datasets such as LaSOT, GOT-10k, COCO, and TrackingNet, with utilities for logging, evaluation, checkpointing, and cross-dataset testing.
Optimized tracking frameworks for scalability and experimentation, and performed comparative analysis of architectural designs for enhanced generalization across video domains.
Student Researcher – Computer Vision (03.2021–02.2025)
Thesis title: A Study on Enhanced Precision in Visual Object Tracking Using Correlation Filters and Siamese Architectures
Designed and implemented a multi-level Siamese network for object tracking using PyTorch, integrating attention mechanisms and keypoint prediction modules to achieve high accuracy under occlusions and complex backgrounds.
Developed a rank-based context learning approach with sparse spatial regularization (SSR) over omnidirectional context patches, implemented in MATLAB with correlation filters, improving robustness and reducing template drift in long-term tracking on datasets such as UAV123 and LaSOT.
Developed a robust video object detection model using a Hybrid Multi-Attention Transformer, improving detection accuracy and robustness against occlusions and motion blur; implemented and evaluated the model on benchmark datasets using PyTorch.
Designed a semi-supervised video object segmentation framework using a one-shot learning approach with fully convolutional networks (FCNs), achieving state-of-the-art performance on benchmarks such as DAVIS and YouTube-VOS, with improved robustness in dynamic and complex scenes.
Published results in peer-reviewed journals, contributing novel methodologies in visual tracking, detection, and segmentation.