This Place Looks Familiar: Vision-based Semantic Mapping of Indoor Scenes
The increasing demand for navigation services in mobile robotics highlights the crucial role of autonomy in enabling robots to operate in wide-ranging environments, especially in terrestrial applications. Autonomy is paramount for robots to carry out a variety of tasks without human intervention, including navigating through complex terrain and indoor spaces. Autonomous navigation is particularly important in terrestrial robotics, where ground robots traverse diverse scenarios, impacting sectors such as industry, logistics, and surveillance.
Ground robots can navigate through environments with distinct local features, including indoor and outdoor scenes, which usually correspond to structured and unstructured scenarios, respectively. Geometric features support this navigation and are commonly employed to describe places and their spatial relations. In indoor environments, despite their usually structured scenes, efficient high-level navigation requires more than geometric features alone, such as semantic information.
The use of semantics for autonomous navigation has been the focus of numerous research efforts related to indoor environments. In this context, semantics refers to the understanding and interpretation of the environment by the autonomous system: assigning meaning to its different elements, such as objects, landmarks, and obstacles, to facilitate effective navigation. To achieve this, visual information can provide the capacity to identify and semantically map the scenes traversed by the ground robot.
In this paper, we propose an approach to classify indoor scenes from visual information, semantically mapping the environment using Deep Learning. To achieve this, an integrated representation learning model estimates a semantic label for images captured during ground robot navigation in indoor environments. The model is composed of three components: i) an autoencoder, for representation learning from images of indoor scenes; ii) a YOLO object detection model, for representing semantic information extracted from those images; and iii) a CNN model, for representation learning, feature merging, and semantic classification through a supervised process. In the experiments, we used two well-established long-term image datasets comprising indoor scenes captured under different scenes and lighting conditions. Additionally, we performed a comparative analysis against other state-of-the-art (SOTA) techniques in the field. The results validate the effectiveness of the proposed approach for classifying indoor scenes from visual information, showing a significant accuracy improvement in semantic classification and mapping of indoor scenes.
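The three-component pipeline above can be illustrated with a minimal sketch. The function and variable names, dimensions, and the use of a per-class detection histogram as the semantic feature are our own illustrative assumptions, not the paper's implementation; the autoencoder and YOLO stages are replaced by stand-ins so the fusion and classification step can be shown end to end:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions: 128-D autoencoder embedding, 80 detectable object
# classes (as in COCO-style YOLO models), 5 indoor place labels.
EMB_DIM, N_OBJ_CLASSES, N_PLACES = 128, 80, 5

def encode_appearance(image):
    """Stand-in for the autoencoder bottleneck: image -> visual embedding."""
    return rng.standard_normal(EMB_DIM)

def semantic_histogram(detections):
    """Summarize YOLO detections as a per-class count vector.

    `detections` is a list of (class_id, confidence) pairs; only confident
    detections contribute to the semantic representation.
    """
    hist = np.zeros(N_OBJ_CLASSES)
    for class_id, confidence in detections:
        if confidence > 0.5:
            hist[class_id] += 1.0
    return hist

def fuse_and_classify(visual, semantic, W, b):
    """Concatenate both representations, then apply a softmax classifier head."""
    x = np.concatenate([visual, semantic])
    logits = W @ x + b
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Untrained weights, for illustration only; in the actual approach this head
# is learned through the supervised process described above.
W = rng.standard_normal((N_PLACES, EMB_DIM + N_OBJ_CLASSES)) * 0.01
b = np.zeros(N_PLACES)

probs = fuse_and_classify(encode_appearance(None),
                          semantic_histogram([(56, 0.9), (62, 0.8)]),
                          W, b)
print(probs.argmax())  # index of the predicted place label
```

The key design point this sketch conveys is that appearance features (autoencoder) and semantic features (object detections) are merged into a single vector before classification, so the place label can depend on both what the scene looks like and which objects it contains.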
The main contributions of this work are summarized as follows:
We propose a novel integrated representation learning model for the semantic classification of images from indoor environments. The proposed learning process combines visual and semantic features to recognize distinct places. For this, an autoencoder and a CNN model learn and represent visual features, while a YOLO model learns and represents semantic features from the images. The proposed vision-based semantic classification approach achieves high accuracy across different datasets, lighting, and weather conditions;
We propose a semantic map representation for indoor environments from images acquired during ground robot navigation. The semantic map associates semantic labels to the coordinates traversed by the ground robot, providing resources for high-level navigation in indoor environments.
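The semantic map described in the second contribution can be sketched as a simple pose-to-label association. The class name, the 2-D (x, y) pose format, and the radius-based query are illustrative assumptions for this sketch, not the paper's data structure:

```python
# Minimal sketch of a semantic map: each pose visited by the ground robot is
# annotated with the place label predicted for the image acquired there.
class SemanticMap:
    def __init__(self):
        self._entries = []  # list of ((x, y), label) pairs

    def add(self, pose, label):
        """Associate a semantic label with a traversed coordinate."""
        self._entries.append((pose, label))

    def labels_near(self, pose, radius=0.5):
        """Return the set of labels recorded within `radius` of the query pose."""
        x, y = pose
        return {lab for (px, py), lab in self._entries
                if (px - x) ** 2 + (py - y) ** 2 <= radius ** 2}

smap = SemanticMap()
smap.add((0.0, 0.0), "corridor")
smap.add((3.2, 1.1), "office")
print(smap.labels_near((3.0, 1.0)))  # → {'office'}
```

A high-level planner could query such a map by label (e.g. "go to the nearest office") instead of by raw coordinates, which is the kind of resource for high-level navigation the contribution refers to.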
Publications:
Acknowledgments:
Team:
José Marcos C. Neto, Graduate Student at Universidade Federal do Amazonas (UFAM)
Alternei de S. Brito, Assistant Professor at Universidade Federal do Amazonas (UFAM)
Paulo L. J. Drews-Jr, Associate Professor at Universidade Federal do Rio Grande (FURG)
Douglas G. Macharet, Associate Professor at Universidade Federal de Minas Gerais (UFMG)
João M. B. Calvalcanti, Associate Professor at Universidade Federal do Amazonas (UFAM)
José L. S. Pio, Associate Professor at Universidade Federal do Amazonas (UFAM)
Felipe G. Oliveira, Adjunct Professor at Universidade Federal do Amazonas (UFAM)