IEEE Transactions on Circuits and Systems for Video Technology, 2024
A Volumetric Saliency Guided Image Summarization for RGB-D Indoor Scene Classification
Preeti Meena, Himanshu Kumar, Sandeep Yadav
An image summary, an abridged version of the original visual content, can be used to represent a scene. Tasks such as scene classification, identification, and indexing can therefore be performed efficiently on a unique summary. Saliency detection is the technique most commonly used to generate a relevant image summary; however, the definition of saliency is subjective and application-dependent. Existing saliency detection methods for RGB-D data focus mainly on color, texture, and depth features, so the generated summary contains either foreground objects or non-stationary objects. Applications such as scene identification, in contrast, require the stationary characteristics of the scene, which these state-of-the-art methods do not capture. This paper proposes a novel volumetric saliency-guided framework for indoor scene classification, and the results highlight the efficacy of the proposed method.
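To make the general summarization idea concrete, the following is a minimal, hypothetical Python sketch of saliency-guided summarization: a generic saliency map selects the most salient pixels of an RGB-D pair as the summary. The function name, the quantile thresholding strategy, and the keep_ratio parameter are illustrative assumptions, not the authors' volumetric method, which would supply the saliency map itself.

import numpy as np

def summarize_by_saliency(rgb, depth, saliency, keep_ratio=0.2):
    # rgb:      (H, W, 3) uint8 color image
    # depth:    (H, W)    depth map of the same scene
    # saliency: (H, W)    saliency scores in [0, 1]; higher = more salient
    # Retain the top `keep_ratio` fraction of pixels by saliency score.
    thresh = np.quantile(saliency, 1.0 - keep_ratio)
    mask = saliency >= thresh
    summary_rgb = np.where(mask[..., None], rgb, 0)    # masked color summary
    summary_depth = np.where(mask, depth, 0.0)         # masked depth summary
    return summary_rgb, summary_depth, mask

A downstream classifier would then operate on the retained (summary) pixels rather than the full frame.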
(B = bedroom, LR = living room, RR = reception room, CF = cafe, SD = study, HO = home-office, CR = classroom, O = office, KT = kitchen.)
Supp.pdf : This file contains additional results and details about the proposed method.
Multimedia files : The .rar file contains all the figures.
GT : This folder contains the ground-truth volumetric saliency maps for evaluation.
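Assuming the GT folder holds grayscale saliency maps whose file names match one's predicted maps, a common way to score predictions against them is per-pixel mean absolute error (MAE), a standard metric in the salient object detection literature. The directory layout and PNG format below are assumptions; adjust them to the actual release.

import numpy as np
from pathlib import Path
from PIL import Image

def mean_absolute_error(pred_dir, gt_dir):
    # Average per-pixel MAE between predicted and ground-truth saliency
    # maps, assuming matching file names and grayscale PNGs in both folders.
    errors = []
    for gt_path in sorted(Path(gt_dir).glob("*.png")):
        pred_path = Path(pred_dir) / gt_path.name
        gt = np.asarray(Image.open(gt_path).convert("L"), dtype=np.float64) / 255.0
        pred = np.asarray(Image.open(pred_path).convert("L"), dtype=np.float64) / 255.0
        errors.append(np.abs(pred - gt).mean())
    return float(np.mean(errors))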
@article{meena2024volumetric,
title={A Volumetric Saliency Guided Image Summarization for RGB-D Indoor Scene Classification},
author={Meena, Preeti and Kumar, Himanshu and Yadav, Sandeep},
journal={IEEE Transactions on Circuits and Systems for Video Technology},
year={2024},
publisher={IEEE}
}