Deep learning

Research

Aerial image retrieval using graph convolution

Graph-based formulations of aerial images have the potential to describe remote sensing (RS) scenes better [1].
We establish that graph convolution-based networks perform better than CNNs for both single-label and multi-label aerial image classification and retrieval tasks [2].
Moreover, paying more attention to the most important areas within a region and to its neighborhoods (edges) helps drive the class decision. We propose a novel edge attention mechanism for this purpose [3].
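As an illustration of the kind of operation a graph convolution performs, here is a minimal sketch in plain Python. It assumes a standard symmetric-normalised propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 X W), applied to a toy region adjacency graph; it is not the architecture from [1-3], and the function name `gcn_layer`, the graph, and the weights are all hypothetical. Non-binary entries in `adj` could stand in for learned edge-attention scores.

```python
import math

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops so every region keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalisation of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    in_dim, out_dim = len(weight), len(weight[0])
    # Aggregate features from neighbouring regions.
    agg = [[sum(norm[i][k] * feats[k][d] for k in range(n))
            for d in range(in_dim)] for i in range(n)]
    # Linear projection followed by ReLU.
    return [[max(0.0, sum(agg[i][d] * weight[d][o] for d in range(in_dim)))
             for o in range(out_dim)] for i in range(n)]

# Toy graph (assumed): three image regions, region 1 touches regions 0 and 2.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # per-region descriptors
weight = [[1.0, 0.0], [0.0, 1.0]]              # identity projection for clarity

node_embeddings = gcn_layer(adj, feats, weight)
```

Each output row mixes a region's own descriptor with those of its neighbours, which is what lets the network use scene layout rather than isolated patches.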

1. Chaudhuri, Ushasi, Biplab Banerjee, and Avik Bhattacharya. "Siamese graph convolutional network for content based remote sensing image retrieval." Computer Vision and Image Understanding 184 (2019): 22-30.

2. Khan, Nagma, Ushasi Chaudhuri, Biplab Banerjee, and Subhasis Chaudhuri. "Graph convolutional network for multi-label VHR remote sensing scene recognition." Neurocomputing 357 (2019): 36-46.

3. Chaudhuri, Ushasi, Biplab Banerjee, Avik Bhattacharya, and Mihai Datcu. "Attention-driven graph convolution network for remote sensing image retrieval." IEEE Geoscience and Remote Sensing Letters 19 (2021): 1-5.

Cross-sensor data retrieval

We develop deep-learning-based models for cross-modal retrieval in RS. The following are some important cross-modal retrieval applications in RS:

- SAR - Multispectral

- RGB - Depth

- Image - Speech

- Panchromatic - Multispectral

- Hyperspectral - LiDAR

- Image - Sketch

Cross-modal retrieval can include cross-sensor [4], cross-media [5], and cross-resolution [6] retrieval.

4. Chaudhuri, Ushasi, Biplab Banerjee, Avik Bhattacharya, and Mihai Datcu. "CMIR-NET: A deep learning based model for cross-modal retrieval in remote sensing." Pattern Recognition Letters 131 (2020): 456-462.

5. Chaudhuri, Ushasi, Biplab Banerjee, Avik Bhattacharya, and Mihai Datcu. "Attention-driven cross-modal remote sensing image retrieval." In 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, pp. 4783-4786. IEEE, 2021.

6. Chaudhuri, Ushasi, Subhadip Dey, Mihai Datcu, Biplab Banerjee, and Avik Bhattacharya. "Interband retrieval and classification using the multi-labeled Sentinel-2 BigEarthNet archive." IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 14 (2021): 9884-9898.

Zero-shot cross-modal data retrieval

In practical applications, a model is trained on seen classes (e.g., cauliflower); however, it may encounter unseen classes (e.g., broccoli) upon deployment.

How do we handle such situations and deploy robust models? We develop zero-shot retrieval models!

To this end, we leverage the semantic information of classes [7, 8, 9].
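The role of class semantics can be sketched as follows. The example assumes that a trained network has already projected a query into the same space as per-class semantic vectors (for instance, word embeddings of the class names); the vectors, the query, and the helper names below are hypothetical and are not taken from [7-9]. An unseen class can then be recognised by nearest-neighbour matching, because its semantic vector is available even though no training images were.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical semantic vectors for the class names.
class_semantics = {
    "cauliflower": [0.9, 0.1, 0.8],   # seen during training
    "broccoli":    [0.8, 0.9, 0.7],   # unseen: no training images
}

def predict_class(projected_query, semantics):
    """Return the class whose semantic vector is closest to the query."""
    return max(semantics, key=lambda c: cosine(projected_query, semantics[c]))

# Assumed output of the trained projection network for a test image.
projected_query = [0.7, 0.8, 0.6]
label = predict_class(projected_query, class_semantics)
```

In this toy setting the query lands closest to "broccoli", a class for which the model never saw a labeled example.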

7. Chaudhuri, Ushasi, Biplab Banerjee, Avik Bhattacharya, and Mihai Datcu. "CrossATNet-a novel cross-attention based framework for sketch-based image retrieval." Image and Vision Computing 104 (2020): 104003.

8. Chaudhuri, Ushasi, Biplab Banerjee, Avik Bhattacharya, and Mihai Datcu. "A zero-shot sketch-based intermodal object retrieval scheme for remote sensing images." IEEE Geoscience and Remote Sensing Letters 19 (2021): 1-5.

9. Chaudhuri, Ushasi, Biplab Banerjee, Avik Bhattacharya, and Mihai Datcu. "A simplified framework for zero-shot cross-modal sketch data retrieval." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 182-183. 2020.

Zero-shot cross-modal data retrieval with minimal supervision

The performance of a deep-learning-based model primarily relies on the diversity and size of the training dataset. However, obtaining such a large amount of labeled data for practical remote sensing (RS) applications is expensive and labor-intensive.

Training protocols have previously been proposed for few-shot learning (FSL) and zero-shot learning (ZSL). However, FSL cannot handle data from unobserved classes at the inference phase, while ZSL requires many training samples of the seen classes. In this work, we propose a novel training protocol for image retrieval, which we call label-deficit zero-shot learning (LDZSL). We use the LDZSL protocol for the challenging task of cross-sensor data retrieval in RS. The protocol uses very few labeled samples of the seen classes during training and interprets samples from unobserved classes at the inference phase. This strategy is critical because some data modalities are hard to annotate without domain experts [10].
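One way to picture the data setup such a protocol implies is the sketch below. It is an assumption-laden illustration, not the procedure from [10]: the function name `ldzsl_split`, the class names, and the choice to simply discard the remaining seen-class samples are all hypothetical. The point is only that the training set holds a handful of labeled samples per seen class, while the test set holds classes that never appear in training.

```python
import random
from collections import defaultdict

def ldzsl_split(samples, unseen_classes, shots_per_class, seed=0):
    """Keep a few labeled samples per seen class; hold out unseen classes."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in samples:
        by_class[label].append(item)

    labeled_train, test_unseen = [], []
    for label, items in by_class.items():
        if label in unseen_classes:
            # Unseen classes appear only at inference time.
            test_unseen.extend((x, label) for x in items)
        else:
            # Label-deficit setting: only a few labeled samples per seen class.
            chosen = rng.sample(items, min(shots_per_class, len(items)))
            labeled_train.extend((x, label) for x in chosen)
    return labeled_train, test_unseen

# Toy dataset (assumed): 10 samples for each of three scene classes.
samples = [(f"img_{c}_{i}", c)
           for c in ("forest", "river", "airport") for i in range(10)]
train, test = ldzsl_split(samples, unseen_classes={"airport"}, shots_per_class=2)
```

Here `train` contains two labeled samples each for "forest" and "river", and `test` contains all ten "airport" samples.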

10. Chaudhuri, Ushasi, Rupak Bose, Biplab Banerjee, Avik Bhattacharya, and Mihai Datcu. "Zero-shot cross-modal retrieval for remote sensing images with minimal supervision." IEEE Transactions on Geoscience and Remote Sensing 60 (2022): 1-15.
