SplatUnity: A Collaborative Online SLAM and Learning Framework with 3D Gaussian Splatting and Inter-Agent Pose Alignment
Authors: Anonymous
Paper: TBD
Code: TBD
Abstract
Real-time multi-agent systems that jointly estimate motion and reconstruct dense 3D environments are key to the next generation of spatially intelligent robotics. We introduce SplatUnity, a fully decentralized SLAM framework that employs 3D Gaussian Splatting (3DGS) for high-fidelity mapping and robust pose estimation across multiple agents. Unlike prior learning-based SLAM systems that rely on centralized coordination or scale poorly, SplatUnity lets each agent independently build local submaps, detect overlapping regions, and estimate relative poses with a bird's-eye-view (BEV) matching pipeline refined via colored ICP. Submaps are registered into a shared global frame and merged through a loop-closure-aware optimization strategy. Experiments on both synthetic and real-world multi-agent datasets show that SplatUnity outperforms state-of-the-art baselines in trajectory accuracy and rendering quality, even when the agents' initial relative poses are unknown.
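The refinement step mentioned above can be illustrated with off-the-shelf colored ICP. Below is a minimal sketch assuming Open3D as the registration backend; this is an illustrative choice, not necessarily the paper's implementation. The function name `refine_relative_pose` and the input `init_T` (standing in for the coarse BEV-based estimate) are hypothetical.

```python
# Minimal sketch of the colored-ICP refinement step, assuming Open3D as the
# registration backend (an illustrative choice, not necessarily the paper's).
import open3d as o3d

def refine_relative_pose(src_pcd, tgt_pcd, init_T, max_dist=0.05):
    """Refine a coarse inter-agent pose estimate with colored ICP.

    src_pcd, tgt_pcd: open3d.geometry.PointCloud with RGB colors.
    init_T: 4x4 coarse transform, e.g. from BEV matching (hypothetical input).
    """
    # Colored ICP requires per-point normals on both clouds.
    for pcd in (src_pcd, tgt_pcd):
        pcd.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))
    result = o3d.pipelines.registration.registration_colored_icp(
        src_pcd, tgt_pcd, max_dist, init_T,
        o3d.pipelines.registration.TransformationEstimationForColoredICP(),
        o3d.pipelines.registration.ICPConvergenceCriteria(
            relative_fitness=1e-6, relative_rmse=1e-6, max_iteration=50))
    return result.transformation, result.fitness
```

Run once per detected overlap, this yields a refined relative transform of the kind used to register two submaps into a shared global frame.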
Registering submaps from both agents - Multi-Replica Dataset - Apartment 2
Registering submaps from both agents - Multi-Replica Dataset - Apartment 1
Registering submaps from both agents - Multi-Replica Dataset - Apartment 0
Registering submaps from both agents - Multi-Replica Dataset - Office 0
Overview of the proposed SplatUnity pipeline. Each agent performs local tracking and mapping on RGB-D data, independently constructing submaps in its own coordinate frame. When connectivity and map overlap are detected, inter-agent relative poses are estimated via BEV-based matching and refined with colored ICP. Submaps are then registered into a shared global frame, merged, and optimized to maintain global consistency. The system supports unknown relative poses between agents, multi-agent loop closure, and collaborative optimization.
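The BEV-based matching step is not specified in detail on this page; the sketch below shows one plausible form, rasterizing each submap's points to a top-down occupancy grid and recovering the planar translation with scikit-image's phase correlation. All names, grid parameters, and the translation-only simplification (no yaw search) are assumptions for illustration.

```python
# Hypothetical sketch of BEV-based coarse matching: rasterize each submap
# into a top-down occupancy grid, then estimate the 2D translation between
# grids with phase correlation. A yaw search is omitted for brevity.
import numpy as np
from skimage.registration import phase_cross_correlation

def bev_occupancy(points, cell=0.05, size=256):
    """Project Nx3 points (x, y, z) into a size x size top-down occupancy grid."""
    grid = np.zeros((size, size), dtype=np.float32)
    ij = np.floor(points[:, :2] / cell).astype(int) + size // 2
    valid = np.all((ij >= 0) & (ij < size), axis=1)
    grid[ij[valid, 1], ij[valid, 0]] = 1.0  # row = y cell, col = x cell
    return grid

def coarse_xy_offset(points_a, points_b, cell=0.05):
    """Planar translation (meters) that best aligns submap B's BEV onto A's."""
    shift, _, _ = phase_cross_correlation(
        bev_occupancy(points_a, cell), bev_occupancy(points_b, cell))
    dy, dx = shift  # shift is returned in (row, col) order, i.e. (y, x) cells
    return np.array([dx, dy]) * cell
```

In practice a rotation search (e.g., correlating over discretized yaw angles) would precede this step, and the resulting coarse transform would then be refined with colored ICP as sketched above.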
Comparison between single-agent and multi-agent reconstructions across multiple environments.
Trajectories and starting points of each agent in selected scenes from the 7-Scenes dataset. Each color represents a distinct agent, and star markers indicate initial poses. Translational offsets between each agent and the reference agent range from 0.1 m to 2.5 m, reflecting the diversity of spatial configurations used to evaluate multi-agent scalability.
If you have any questions, feel free to email us at: TBD