Cloud-Enabled Lifelong LERF for Semantic Inventory Monitoring

Adam Rashid1*, Chung Min Kim1*, Justin Kerr1*, Letian Fu1, Kush Hari1, Ayah Ahmad1, Kaiyuan Chen2, Huang Huang1, Marcus Gualtieri3, Michael Wang3, Christian Juette3, Tian Nan3, Liu Ren3, Ken Goldberg1 

*Equal contribution; 1The AUTOLab at UC Berkeley; 2UC Berkeley; 3Bosch

Abstract

Many environments require a navigating robot to monitor them over extended periods, during which objects in the scene may shift significantly. Inventory monitoring in particular relies on maintaining an up-to-date semantic map even as objects are swapped, added, removed, or moved. This work introduces Lifelong LERF, a cloud-robotics method that allows a robot with minimal onboard compute to estimate camera poses from a monocular camera and jointly optimize a dense language and geometric representation of its surroundings. We maintain this representation over time by detecting semantic changes and selectively updating only the affected regions of the environment, avoiding the need to exhaustively remap. After each update, a human user can monitor inventory by issuing natural language queries and receiving a 3D heatmap of potential object locations. To manage the computational load, we use FogROS2, a cloud robotics platform, to offload resource-intensive tasks to the cloud, enabling online performance. Experiments in a tabletop environment with 3-5 objects, using a TurtleBot with a RealSense camera, demonstrate that Lifelong LERF can persistently track objects as they move over time with 91% overall accuracy.
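To make the query step concrete, below is a minimal sketch of how a natural-language query can be scored against a language-embedded 3D representation to produce a relevancy heatmap. This is not the paper's implementation: the `cosine_relevancy` helper, the 512-dimensional features, and the random placeholder data are illustrative assumptions; in a LERF-style system the per-point features would be rendered from the optimized field and the query embedded with a vision-language model such as CLIP.

```python
import numpy as np

def cosine_relevancy(query_emb: np.ndarray, point_embs: np.ndarray) -> np.ndarray:
    """Score N per-point language embeddings (N, D) against one query
    embedding (D,) by cosine similarity; returns one relevancy per point."""
    q = query_emb / np.linalg.norm(query_emb)
    p = point_embs / np.linalg.norm(point_embs, axis=1, keepdims=True)
    return p @ q  # shape (N,); higher means more relevant to the query

# Hypothetical placeholder data standing in for rendered field outputs.
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(10_000, 3))  # sampled 3D locations
feats = rng.normal(size=(10_000, 512))             # per-point language features
query = rng.normal(size=(512,))                    # e.g., embedding of "coffee mug"

heatmap = cosine_relevancy(query, feats)           # relevancy per 3D point
candidates = points[np.argsort(heatmap)[-5:]]      # top-scoring object locations
print(candidates)
```

Ranking or thresholding these per-point relevancy values is what yields the 3D heatmap of potential object locations a user sees in response to a query.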