PUBLICATIONS
Towards Active Vision for Action Localization with Reactive Control and Predictive Learning. Shubham Trehan and Sathyanarayanan N. Aakur. [WACV 2022]
In this work, we introduce an approach to active action localization, where an active camera dynamically adjusts its viewpoint to keep ongoing actions within its field of view.
Key Highlights
Active Camera Control: Our system dynamically adjusts the camera’s orientation to follow the action, reducing the chances of losing track due to occlusion or rapid movements.
Predictive Learning: By anticipating future frames, our model maintains a continuous focus on the action, enhancing tracking accuracy.
Real-World Applications: From surveillance to assistive robotics, our approach promises significant improvements in scenarios where maintaining the visual focus on moving subjects is critical.
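The reactive-control idea behind the highlights above can be illustrated with a minimal sketch: forecast where the action will be next, then issue a proportional pan/tilt correction to recenter it. This is an illustrative toy (constant-velocity forecasting, pixel-space proportional control), not the learned predictive model from the paper; the function names and the `gain` parameter are assumptions for the example.

```python
import numpy as np

def predict_next_center(centers, k=3):
    """Constant-velocity forecast of the next action center
    from the last k observed (x, y) centers (a stand-in for
    learned predictive models of future frames)."""
    pts = np.asarray(centers[-k:], dtype=float)
    if len(pts) < 2:
        return pts[-1]
    velocity = np.mean(np.diff(pts, axis=0), axis=0)
    return pts[-1] + velocity

def camera_correction(center, frame_size, gain=0.1):
    """Reactive proportional control: a pan/tilt command that
    nudges the predicted center toward the frame center."""
    offset = np.asarray(center, dtype=float) - np.asarray(frame_size, dtype=float) / 2.0
    return -gain * offset  # (pan, tilt) in pixel units

centers = [(300, 240), (320, 240), (340, 240)]  # action drifting right
pred = predict_next_center(centers)             # forecast: (360, 240)
pan, tilt = camera_correction(pred, frame_size=(640, 480))
```

Because the predicted center sits right of frame center, the correction pans the camera right (negative pixel offset here), keeping the action in view before it drifts out.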
ProtoKD: Learning from Extremely Scarce Data for Parasite Ova Recognition. Shubham Trehan, Udhav Ramachandran, Ruth Scimeca, and Sathyanarayanan N. Aakur. [ICMLA 2023]
Prototypical Networks: Leverages Prototypical Networks coupled with knowledge distillation to build robust class prototypes, enabling effective few-shot learning.
Application to Medical Datasets: Specifically designed to handle scenarios with extremely limited data, demonstrating significant improvements in medical image classification tasks.
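The prototype construction that Prototypical Networks rest on can be sketched in a few lines: each class is represented by the mean of its support embeddings, and a query is assigned to the nearest prototype. This is the standard prototypical-network classification rule on toy 2-D embeddings, not ProtoKD's full pipeline (which adds knowledge distillation); the function names are illustrative.

```python
import numpy as np

def class_prototypes(embeddings, labels):
    """Mean embedding per class: the 'prototype' each query is compared to."""
    return {c: embeddings[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(query, prototypes):
    """Assign the query to the nearest prototype (squared Euclidean distance)."""
    return min(prototypes, key=lambda c: np.sum((query - prototypes[c]) ** 2))

# Toy 2-D embeddings: two classes, two support examples each
emb = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.2, 1.0]])
lab = np.array([0, 0, 1, 1])
protos = class_prototypes(emb, lab)
pred = classify(np.array([0.1, 0.1]), protos)  # lands nearest class 0's prototype
```

Because prototypes are simple class means, the rule stays usable even with a handful of examples per class, which is what makes the approach attractive for extremely scarce medical data.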
Discovering Novel Actions from Open World Egocentric Videos with Object-Grounded Visual Commonsense Reasoning. Sanjoy Kundu, Shubham Trehan, and Sathyanarayanan N. Aakur. [ECCV 2024]
This work tackles discovering novel actions from egocentric videos with a neuro-symbolic framework, combining object-centric vision-language models with large-scale knowledge bases to achieve strong performance in open-world activity inference.
Neuro-Symbolic Framework: Integrates symbolic knowledge and visual data to ground objects and infer actions in open-world settings.
Object-Driven Activity Discovery: Uses prior knowledge to discover plausible actions, leveraging object affordances for accurate activity recognition.
Self-supervised Multi-actor Social Activity Understanding in Streaming Videos. Shubham Trehan and Sathyanarayanan N. Aakur. [ICPR 2024]
This research on social activity recognition tackles the complexity of detecting and understanding simultaneous actions and social interactions among multiple actors in streaming videos. The self-supervised framework leverages multi-actor predictive learning and visual-semantic graphs to deliver robust performance with minimal labeled data.
Self-Supervised Learning: Achieves high accuracy in social activity recognition without relying on densely annotated data.
Multi-Actor Predictive Learning: Jointly models individual actions and group-level activities, enhancing the understanding of social interactions.
Visual-Semantic Graphs: Uses graph structures to represent social interactions, enabling relational reasoning and improved performance on standard benchmarks.
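The graph-based relational reasoning described above can be illustrated with a minimal sketch: actors become nodes, spatially close pairs become edges, and one round of neighbor averaging mixes each actor's features with those of its interaction partners. This is a generic message-passing toy under assumed names (`build_interaction_graph`, a distance `radius` threshold), not the paper's visual-semantic graph model.

```python
import numpy as np

def build_interaction_graph(positions, radius=2.0):
    """Adjacency over actors: connect pairs closer than `radius`."""
    n = len(positions)
    adj = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and np.linalg.norm(positions[i] - positions[j]) < radius:
                adj[i, j] = 1.0
    return adj

def message_pass(features, adj):
    """One round of mean-neighbor aggregation: each actor's feature is
    mixed with the average feature of its interaction partners."""
    deg = adj.sum(axis=1, keepdims=True)
    neighbor_mean = np.divide(adj @ features, deg,
                              out=np.zeros_like(features), where=deg > 0)
    return 0.5 * features + 0.5 * neighbor_mean

pos = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]])  # two nearby actors, one isolated
feat = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
adj = build_interaction_graph(pos)
updated = message_pass(feat, adj)
```

After one round, the two interacting actors' representations move toward each other while the isolated actor's is unchanged by neighbors, which is the basic mechanism that lets group-level activity emerge from individual actor features.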