If you're interested in working on any of these ongoing projects, or have an interesting research direction that extends any of them, send me your profile at depanshus@iiitd.ac.in and we will get back to you if there is a suitable position.
Ongoing Projects
Graph-Based Statistical Analysis of Entire Scenes by Combining Multi-Sensor, Multi-Perspective Video Streams (jointly with Dr. Anuj Srivastava, Florida State University)
3D Multi-Object Tracking for Indian Roads using Stereo Camera Setup (jointly with Dr. Anoop Chawla, Department of Mechanical Engineering, Indian Institute of Technology)
Analyzing the Impact of Aridification on Agricultural Production in the Cauvery Delta using Multi-Modal Cross-Satellite Data (jointly with Dr. Thiagarajan Jayaraman, M S Swaminathan Research Foundation)
Robust Feature Representation Based on Hierarchical Priors for Object Detection
Uncertainty Estimation in Multi-Object Tracking
Developing theories for Optimal State Estimation of Dynamic Graph Systems
Completed Projects
SICKLE: A Multi-Sensor Satellite Imagery Dataset Annotated With Multiple Key Cropping Parameters
Accepted at WACV 2024 as an oral presentation
The availability of well-curated datasets has driven the success of Machine Learning (ML) models. Despite greater access to earth observation data in agriculture, there is a scarcity of curated, labelled datasets, which limits their use for training ML models for remote sensing (RS) in agriculture. To this end, we introduce a first-of-its-kind dataset called SICKLE, which constitutes time-series of multi-resolution imagery from 3 distinct satellites: Landsat-8, Sentinel-1 and Sentinel-2. Our dataset covers multi-spectral, thermal and microwave sensors over the period January 2018 to March 2021. We construct each temporal sequence by considering the cropping practices followed by farmers primarily engaged in paddy cultivation in the Cauvery Delta region of Tamil Nadu, India, and annotate the corresponding imagery with key cropping parameters at multiple resolutions (i.e., 3 m, 10 m and 30 m). Our dataset comprises 2,370 season-wise samples from 388 unique plots, having an average size of 0.38 acres, for classifying 21 crop types across 4 districts in the Delta, which amounts to approximately 209,000 satellite images. Of the 2,370 samples, 351 paddy samples from 145 plots are annotated with multiple crop parameters, such as the variety of paddy, its growing season and its productivity in terms of per-acre yield. Ours is also one of the first studies to consider growing-season activities pertinent to crop phenology (sowing, transplanting and harvesting dates) as parameters of interest. We benchmark SICKLE on three tasks: crop type, crop phenology (sowing, transplanting, harvesting), and yield prediction. More details are available here.
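As a rough illustration of the dataset's organization, a season-wise sample pairs per-satellite image time-series with plot-level annotations. The sketch below is purely illustrative: the class and field names (and the example district/season values) are assumptions, not SICKLE's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical sketch of one season-wise SICKLE sample; field names are
# illustrative, not the dataset's actual schema.
@dataclass
class SickleSample:
    plot_id: str
    district: str
    season: str                                   # growing season of this sample
    imagery: dict = field(default_factory=dict)   # satellite name -> image time-series
    crop_type: Optional[str] = None
    # Phenology and yield annotations exist only for the annotated paddy subset.
    sowing_date: Optional[str] = None
    transplanting_date: Optional[str] = None
    harvesting_date: Optional[str] = None
    yield_per_acre: Optional[float] = None

SATELLITES = ("Landsat-8", "Sentinel-1", "Sentinel-2")
ANNOTATION_RESOLUTIONS_M = (3, 10, 30)

# Illustrative sample (district and season names are examples only).
sample = SickleSample(plot_id="plot-001", district="Thanjavur", season="samba",
                      imagery={s: [] for s in SATELLITES}, crop_type="paddy")
```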
Learning Hierarchy Aware Features for Reducing Mistake Severity
Accepted at ECCV 2022 as a poster presentation
Label hierarchies are often available a priori, as part of a biological taxonomy or a lexical database such as WordNet. Several works exploit these to learn hierarchy-aware features that encourage the classifier to make semantically meaningful mistakes while maintaining or reducing the overall error. In this paper, we propose a novel approach for learning Hierarchy Aware Features (HAF) that leverages classifiers at each level of the hierarchy, constrained to generate predictions consistent with the label hierarchy. The classifiers are trained by minimizing a Jensen-Shannon divergence with target soft labels obtained from the fine-grained classifier. Additionally, we employ a simple geometric loss that constrains the feature-space geometry to capture the semantic structure of the label space. HAF is a training-time approach that reduces mistake severity while maintaining top-1 error, thereby addressing the limitation of the cross-entropy loss, which treats all mistakes as equal. We evaluate HAF on three hierarchical datasets and achieve state-of-the-art results on the iNaturalist-19 and CIFAR-100 datasets. The source code is available here.
Using the proposed approach, we developed a tool, CaTRAT (Camera Trap Data Repository and Analysis Tool), which is now used for the "All India Tiger Estimation" by the Government of India.
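The core training signal described above can be sketched numerically: a coarse-level classifier is pulled, via a Jensen-Shannon divergence, toward soft targets derived from the fine-grained classifier. The sketch below is a minimal illustration, not the paper's implementation; the toy hierarchy, the aggregation of fine probabilities by summation, and all names are assumptions.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Toy 2-level hierarchy (assumed): fine classes 0-3 map to coarse classes 0-1.
fine_to_coarse = np.array([0, 0, 1, 1])

def coarse_soft_targets(fine_probs):
    """Aggregate fine-grained probabilities into coarse-level soft labels."""
    coarse = np.zeros(fine_to_coarse.max() + 1)
    np.add.at(coarse, fine_to_coarse, fine_probs)
    return coarse

fine_probs = np.array([0.6, 0.2, 0.15, 0.05])   # fine classifier output
coarse_pred = np.array([0.7, 0.3])              # coarse classifier output
loss = js_divergence(coarse_pred, coarse_soft_targets(fine_probs))
```

Minimizing this divergence at every level keeps predictions across the hierarchy mutually consistent, which is what discourages severe (semantically distant) mistakes.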
Face Expressions and Movement Based Pattern Matching Authentication Mechanism
Developed a novel user authentication mechanism that is robust to replay and spoofing attacks. Instead of the single image used in conventional facial recognition, the solution captures a sequence of frames that ideally contains different facial expressions (for instance, a smile followed by a wink followed by a laugh). This sequence of expressions serves as a pattern, and the user must recreate the same facial expression pattern to unlock the device.
Theoretically, to mount a replay attack, an attacker must possess a video of the authentic user recorded at the same camera angle used when the password was set. An attacker can easily obtain a face image of the user from social media or other platforms, but obtaining a matching video pattern, let alone one at the same camera angle, is far harder. For a spoofing attack, an attacker would have to use technology such as DeepFakes to render a video pattern with the authentic user's face. Detecting fake images and videos is a well-studied task, so such attacks can be mitigated with existing detectors.
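The matching step can be sketched simply: each frame is first mapped to an expression label by a classifier (not shown here), consecutive runs of the same label are collapsed (many frames show the same expression), and the resulting sequence is compared against the enrolled pattern in order. This is an illustrative sketch only; the function names and the collapse-then-compare scheme are assumptions, not the project's actual pipeline.

```python
from itertools import groupby

def collapse(labels):
    """Collapse consecutive runs of identical per-frame labels into single steps."""
    return [label for label, _ in groupby(labels)]

def authenticate(frame_labels, enrolled_pattern):
    """Accept only if the observed expression sequence matches the enrolled one in order."""
    return collapse(frame_labels) == list(enrolled_pattern)
```

For example, `authenticate(["smile", "smile", "wink", "laugh", "laugh"], ["smile", "wink", "laugh"])` succeeds, while presenting the same expressions in a different order fails.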
On-Device Dynamic Emoji Generation
Developed an on-device AI solution that creates new emoji(s) depicting the combined emotions of two or more input emoticons. Salient features of the developed solution:
Unlike similar solutions (e.g., Google's Emoji Kitchen), the developed model can generate more than one output emoticon.
The solution combines multiple emojis automatically and entirely on-device, whereas Emoji Kitchen suggests hand-crafted combinations to the user.
To the best of our knowledge, no other solution can combine more than two emojis at once; ours accepts any number of inputs and generates multiple outputs.
To the best of our knowledge, no other solution can combine human-face emoticons; ours produces strong results when human-face emoticons and smileys are combined (both intra- and inter-combinations).
Voice-Controlled Home Automation with Security Surveillance using Raspberry Pi
Developed a voice-controlled home automation device with added security surveillance. These features extend a voice-controlled personal assistant device developed during an earlier research project. Salient features of the developed device:
When no Appliance ID is specified in the action command (e.g., "switch on the light") and multiple similar appliances are present, the device can intelligently infer the intended Appliance ID from the homogeneous set of appliances.
The device maintains a directory in which the faces of trusted users are stored. If an untrusted user (one whose face ID is not in the directory) appears in front of the hidden camera, a notification is sent to the owner along with a screenshot of that person's face.
If the owner confirms that the notification was indeed triggered by an untrusted person, the device can be commanded to take measures such as informing neighbors, the police, etc.
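The trusted-face check described above can be sketched as follows. This is a minimal illustration under assumed design choices: trusted users are enrolled as face embeddings, a detected face is compared against each by cosine similarity, and a notification callback fires when no match exceeds a threshold. All names, the embedding dimensionality, and the threshold value are hypothetical.

```python
import numpy as np

# Hypothetical enrolled directory: user name -> face embedding.
TRUSTED_FACES = {
    "owner":  np.array([0.9, 0.1, 0.4]),
    "family": np.array([0.1, 0.8, 0.3]),
}

def is_trusted(face_embedding, threshold=0.9):
    """Return True if the embedding matches any enrolled face (cosine similarity)."""
    v = face_embedding / np.linalg.norm(face_embedding)
    for emb in TRUSTED_FACES.values():
        e = emb / np.linalg.norm(emb)
        if float(v @ e) >= threshold:
            return True
    return False

def on_face_detected(face_embedding, notify):
    """Invoke the notification callback (with a screenshot, on the real device)
    whenever the detected face matches no enrolled user."""
    if not is_trusted(face_embedding):
        notify("Untrusted person detected in front of the camera")
```

Thresholded cosine similarity is a common choice for this kind of open-set check because it degrades gracefully: unfamiliar faces simply fail to clear the threshold rather than being forced onto the nearest enrolled identity.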