IEEE Transactions on Circuits and Systems for Video Technology, 2024
A Volumetric Saliency Guided Image Summarization for RGB-D Indoor Scene Classification
Preeti Meena, Himanshu Kumar, Sandeep Yadav
An image summary, an abridged version of the original visual content, can be used to represent a scene. Tasks such as scene classification, identification, and indexing can therefore be performed efficiently on a unique summary. Saliency detection is the technique most commonly used to generate a relevant image summary; however, the definition of saliency is subjective and application-dependent. Existing saliency detection methods for RGB-D data focus mainly on color, texture, and depth features, so the generated summary contains either foreground objects or non-stationary objects. Applications such as scene identification, in contrast, require the stationary characteristics of the scene, which these state-of-the-art methods do not capture. This paper proposes a novel volumetric saliency-guided framework for indoor scene classification, and the results highlight the efficacy of the proposed method.
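To make the general summarization idea concrete, the following is a minimal, hypothetical Python sketch of saliency-guided summarization: a generic saliency map selects the most salient pixels of an RGB-D pair as the summary. The function name, the quantile thresholding strategy, and the keep_ratio parameter are illustrative assumptions, not the authors' volumetric method, which would supply the saliency map itself.

import numpy as np

def summarize_by_saliency(rgb, depth, saliency, keep_ratio=0.2):
    # rgb:      (H, W, 3) uint8 color image
    # depth:    (H, W)    depth map of the same scene
    # saliency: (H, W)    saliency scores in [0, 1]; higher = more salient
    # Retain the top `keep_ratio` fraction of pixels by saliency score.
    thresh = np.quantile(saliency, 1.0 - keep_ratio)
    mask = saliency >= thresh
    summary_rgb = np.where(mask[..., None], rgb, 0)    # masked color summary
    summary_depth = np.where(mask, depth, 0.0)         # masked depth summary
    return summary_rgb, summary_depth, mask

A downstream classifier would then operate on the retained (summary) pixels rather than the full frame.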
(B = bedroom, LR = living room, RR = reception room, CF = cafe, SD = study, HO = home-office, CR = classroom, O = office, KT = kitchen.)
Supp.pdf : This file contains additional results and details about the proposed method.
Multimedia files : The .rar file contains all the figures.
GT : This folder contains the ground-truth volumetric saliency maps for evaluation.
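Assuming the GT folder holds grayscale saliency maps whose file names match one's predicted maps, a common way to score predictions against them is per-pixel mean absolute error (MAE), a standard metric in the salient object detection literature. The directory layout and PNG format below are assumptions; adjust them to the actual release.

import numpy as np
from pathlib import Path
from PIL import Image

def mean_absolute_error(pred_dir, gt_dir):
    # Average per-pixel MAE between predicted and ground-truth saliency
    # maps, assuming matching file names and grayscale PNGs in both folders.
    errors = []
    for gt_path in sorted(Path(gt_dir).glob("*.png")):
        pred_path = Path(pred_dir) / gt_path.name
        gt = np.asarray(Image.open(gt_path).convert("L"), dtype=np.float64) / 255.0
        pred = np.asarray(Image.open(pred_path).convert("L"), dtype=np.float64) / 255.0
        errors.append(np.abs(pred - gt).mean())
    return float(np.mean(errors))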
@article{meena2024volumetric,
title={A Volumetric Saliency Guided Image Summarization for RGB-D Indoor Scene Classification},
author={Meena, Preeti and Kumar, Himanshu and Yadav, Sandeep},
journal={IEEE Transactions on Circuits and Systems for Video Technology},
year={2024},
publisher={IEEE}
}