Projects

A snapshot of my series of works on Human-Object Interaction Detection

Detecting and Localizing Human-Object Interactions

April 2019 -

My series of works on human-object interaction (HOI) detection investigates different techniques for refining image features. I introduced the idea of spatial refinement for HOI detection. In more recent work, I combined spatial and semantic refinement in both traditional two-stage networks and recent Transformer-based one-stage architectures. I collaborated with researchers from UCSB and AWS AI Labs on these works. Our efforts have resulted in three HOI detection networks, two of which, SSRT and VSGNet, were published at CVPR.


Performance of our proposed network (LOCL) against SOTA models.

Understanding Object-Attribute Compositionality

May 2020 - Dec 2021

Humans can visualize new and unknown object-attribute (O-A) compositions by blending unrelated objects and attributes. For example, one can envision a blue apple by recalling the commonly seen red apple and replacing its color with blue. The problem of unseen O-A associations has been well studied in the field; however, the performance of existing methods is limited in challenging scenes. In this context, our key contribution is a modular approach to localizing objects and attributes of interest in a weakly supervised setting that generalizes robustly to unseen configurations. Localization coupled with a composition classifier significantly outperforms state-of-the-art (SOTA) methods, with an improvement of about 12% on currently available challenging datasets. Further, the modularity allows the localized feature extractor to be used with existing O-A compositional learning methods to improve their overall performance.
[Paper] [Code]


Underwater Video Analytics

January 2020 -

For this project, we are collaborating with MARE Group. MARE Group has collected hours of underwater video in different parts of the oceans using their underwater vehicles. We are working closely with them to develop an algorithm that helps interpret these videos by automatically detecting and counting fish, plants, and other underwater species.

Detecting Stress From Thermal Videos

May 2020 - Sept 2020

In this work, we estimate an important physiological signal, the Initial Systolic Time Interval (ISTI), from thermal videos. ISTI is a significant indicator of various physiological phenomena such as stress and blood pressure. We developed a spatio-temporal network to estimate ISTI. Using this signal, we successfully detected physically induced stress.
[Paper] [Code]


Our network successfully detected the forged part of an image (masked region).

Copy Paste Forgery Detection in Images

October 2019 - December 2019

One of the most common types of image forgery is copy-paste forgery. In this work, we extended the existing MANTRANET network by developing a morphology-based segmentation technique. [Slides] [Code]


Our network successfully predicted the bowling instance in this video, randomly collected from YouTube.

Temporal Action Detection in Videos

Jan 2019 - March 2019 (Course Project for ECE 194)

Detecting and localizing actions in videos is extremely challenging because actions must be detected and localized in both the spatial and temporal dimensions simultaneously. In this work, we improved the proposal method of the existing Structured Segment Network by combining a proposal method based on actionness scores with sliding-window proposals. We achieved better results than the Structured Segment Network. [Slides] [Short Report]
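To illustrate the general idea of combining actionness-based proposals with sliding-window proposals, here is a minimal Python sketch. It is not the project's actual implementation: the function names, the simple thresholding used to group high-actionness snippets, the window sizes, and the IoU de-duplication threshold are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# A proposal is a (start, end) pair of snippet indices, with end exclusive.
Proposal = Tuple[int, int]


def sliding_window_proposals(num_snippets: int,
                             window_sizes: Sequence[int],
                             stride_ratio: float = 0.5) -> List[Proposal]:
    """Enumerate fixed-length windows at several scales (hypothetical helper)."""
    proposals = []
    for w in window_sizes:
        if w > num_snippets:
            continue
        stride = max(1, int(w * stride_ratio))
        for start in range(0, num_snippets - w + 1, stride):
            proposals.append((start, start + w))
    return proposals


def actionness_proposals(scores: Sequence[float],
                         threshold: float = 0.5) -> List[Proposal]:
    """Group consecutive snippets whose actionness is at or above a threshold.

    This is a simplification assumed for illustration; real systems often
    use more elaborate grouping schemes.
    """
    proposals = []
    start = None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i
        elif s < threshold and start is not None:
            proposals.append((start, i))
            start = None
    if start is not None:
        proposals.append((start, len(scores)))
    return proposals


def temporal_iou(a: Proposal, b: Proposal) -> float:
    """Intersection over union of two temporal segments."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def combine_proposals(scores: Sequence[float],
                      window_sizes: Sequence[int],
                      threshold: float = 0.5,
                      iou_dedup: float = 0.9) -> List[Proposal]:
    """Union of both proposal sets, skipping near-duplicate windows."""
    merged = actionness_proposals(scores, threshold)
    for cand in sliding_window_proposals(len(scores), window_sizes):
        if all(temporal_iou(cand, kept) < iou_dedup for kept in merged):
            merged.append(cand)
    return sorted(merged)


if __name__ == "__main__":
    # Made-up per-snippet actionness scores for an 8-snippet video.
    scores = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.1, 0.6]
    print(combine_proposals(scores, window_sizes=[3]))
```

In this sketch, the actionness-based proposals adapt to where the scores are high, while the sliding windows provide regular coverage of the rest of the video; a downstream classifier would then score each candidate segment.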