Autonomous driving systems traditionally consist of specialized components for perception, mapping, prediction, and planning. However, individual modules such as perception or planning can struggle in novel and complex scenarios. End-to-end (E2E) autonomous driving has emerged as a promising approach to handling such challenging scenarios. Current E2E systems combine large language models (LLMs) for scene understanding with visual data from the camera, but they lack strong 3D spatial reasoning. Fusing LiDAR data with camera images can therefore enhance the E2E model's understanding of spatial relationships and mitigate depth errors.
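To make the fusion idea concrete, the snippet below is a minimal sketch (not our actual pipeline) of appending camera features to LiDAR points by projecting them into the image plane. It assumes a pinhole camera with known intrinsics K and LiDAR-to-camera extrinsics (R, t); all names and shapes are illustrative.

```python
# Minimal sketch of LiDAR-camera fusion by projecting points into the image.
# Assumes a pinhole camera with intrinsics K and LiDAR-to-camera extrinsics (R, t);
# names and shapes are illustrative, not taken from an actual pipeline.
import numpy as np

def fuse_lidar_with_image(points, point_feats, image_feats, K, R, t):
    """Append image features to each LiDAR point that projects into the image.

    points:      (N, 3) LiDAR points in the LiDAR frame
    point_feats: (N, C1) per-point features (e.g., intensity, learned embeddings)
    image_feats: (H, W, C2) per-pixel features from a camera backbone
    K:           (3, 3) camera intrinsics; R, t: LiDAR-to-camera rotation/translation
    """
    cam_pts = points @ R.T + t                      # transform into the camera frame
    in_front = cam_pts[:, 2] > 1e-3                 # keep points in front of the camera
    uvw = cam_pts[in_front] @ K.T
    uv = (uvw[:, :2] / uvw[:, 2:3]).astype(int)     # perspective projection to pixels

    H, W, _ = image_feats.shape
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < W) & (uv[:, 1] >= 0) & (uv[:, 1] < H)

    lidar_feats = point_feats[in_front][valid]
    pixel_feats = image_feats[uv[valid, 1], uv[valid, 0]]    # sample features at (v, u)
    return np.concatenate([lidar_feats, pixel_feats], axis=1)  # (M, C1 + C2) fused
```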
We have developed an innovative multi-task learning framework that performs depth estimation and semantic segmentation concurrently from a single camera. The approach is built on a shared encoder-decoder architecture and integrates several techniques that improve the accuracy of both the depth estimation and semantic segmentation tasks without compromising computational efficiency. It also incorporates an adversarial training component, employing a Wasserstein GAN framework with a critic network, to refine the model's predictions. A rough sketch of this shared-encoder, dual-head setup appears below.
Paper - Accepted at IROS 2024!
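The sketch below illustrates the shared-encoder, dual-head idea together with a WGAN-style critic term on the depth prediction. Layer sizes, the critic input, and the loss weights are illustrative placeholders rather than the configuration used in the paper.

```python
# Minimal sketch of a shared encoder with depth and segmentation heads, plus a
# WGAN-style critic term on the depth prediction. Layer sizes, the critic input,
# and the loss weights are illustrative placeholders, not the paper's configuration.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, num_classes=19):
        super().__init__()
        self.encoder = nn.Sequential(            # shared feature extractor
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.depth_head = nn.Sequential(         # task-specific decoder for depth
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1),
        )
        self.seg_head = nn.Sequential(           # task-specific decoder for segmentation
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, num_classes, 4, stride=2, padding=1),
        )

    def forward(self, x):
        feats = self.encoder(x)                  # both heads share the same features
        return self.depth_head(feats), self.seg_head(feats)

def generator_loss(depth_pred, depth_gt, seg_logits, seg_gt, critic, w_adv=0.01):
    """Joint loss: supervised depth and segmentation terms plus a WGAN generator term.
    The critic scores predicted depth maps; the generator tries to raise that score."""
    l_depth = nn.functional.l1_loss(depth_pred, depth_gt)
    l_seg = nn.functional.cross_entropy(seg_logits, seg_gt)
    l_adv = -critic(depth_pred).mean()           # WGAN generator objective
    return l_depth + l_seg + w_adv * l_adv
```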
We have developed a novel system for 3D mapping and visual localization based on 3D Gaussian Splatting. Our method uses LiDAR and camera data to create accurate and visually plausible representations of the environment. By initializing the training of the 3D Gaussian Splatting map with LiDAR data, our system constructs detailed and geometrically precise maps while avoiding common issues such as excessive memory usage and imprecise geometry. This initialization also makes the map well suited for visual localization: correspondences between the query image and the image rendered from the Gaussian Splatting map are found efficiently via normalized cross-correlation (NCC), and the camera pose of the query image is then refined with the Perspective-n-Point (PnP) technique. A rough sketch of this matching and refinement step appears below.
Paper -
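The sketch below illustrates the NCC matching and PnP refinement step, assuming the Gaussian Splatting renderer provides an RGB image and a depth map at an initial pose estimate. The patch size, score threshold, and helper names are illustrative rather than the exact implementation.

```python
# Rough sketch of NCC matching followed by PnP refinement. Assumes the Gaussian
# Splatting renderer provides a grayscale image and a depth map rendered at an
# initial pose; patch size, thresholds, and names are illustrative.
import cv2
import numpy as np

def refine_pose(query_gray, rendered_gray, rendered_depth, keypoints_uv, K, patch=15):
    """Match query keypoints against the rendered image with NCC, lift the matches
    to 3D using the rendered depth, and refine the camera pose with PnP."""
    half = patch // 2
    pts_2d, pts_3d = [], []
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]

    for (u, v) in keypoints_uv:
        tpl = query_gray[v - half:v + half + 1, u - half:u + half + 1]
        if tpl.shape != (patch, patch):
            continue
        # Normalized cross-correlation of the query patch against the rendered view.
        ncc = cv2.matchTemplate(rendered_gray, tpl, cv2.TM_CCOEFF_NORMED)
        _, score, _, (mu, mv) = cv2.minMaxLoc(ncc)
        if score < 0.8:                           # reject weak correspondences
            continue
        ru, rv = mu + half, mv + half             # matched pixel in the rendered image
        z = float(rendered_depth[rv, ru])
        if z <= 0:
            continue
        # Back-project the rendered pixel into the rendering camera's frame.
        pts_3d.append([(ru - cx) * z / fx, (rv - cy) * z / fy, z])
        pts_2d.append([u, v])

    if len(pts_3d) < 4:                           # PnP needs at least 4 correspondences
        return None
    ok, rvec, tvec, _ = cv2.solvePnPRansac(
        np.array(pts_3d, np.float32), np.array(pts_2d, np.float32), K, None)
    return (rvec, tvec) if ok else None
```

Because the 3D points above are expressed in the rendering camera's frame, the recovered pose is relative to the rendering view; composing it with the rendering pose gives the query camera's pose in the map frame.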
Autonomous vehicles heavily rely on LiDAR sensors for perception tasks, where accurate intensity information is essential. However, obtaining real-world LiDAR data with intensity information is challenging and expensive. As a result, simulation has emerged as a promising alternative. Existing physics-based simulation approaches often oversimplify the complex relationship between LiDAR rays and the objects they interact with, leading to a large simulation-to-real gap. Hence, bridging the gap between simulated and real-world LiDAR intensity data is crucial for developing robust LiDAR perception algorithms in autonomous vehicles.
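For context, the snippet below shows the kind of first-order (Lambertian) intensity model that simple physics-based simulators rely on; real returns also depend on material properties, moisture, beam divergence, and sensor electronics, which is precisely why such models leave a large simulation-to-real gap. The parameter values are illustrative.

```python
# Illustrative first-order (Lambertian) LiDAR intensity model of the kind used by
# simple physics-based simulators. Real returns also depend on material, wetness,
# beam divergence, and sensor electronics, which widens the sim-to-real gap.
import numpy as np

def simulate_intensity(ranges, normals, ray_dirs, reflectivity, atten=0.02):
    """Return simulated intensities for a batch of rays.

    ranges:       (N,) distance from sensor to hit point in meters
    normals:      (N, 3) unit surface normals at the hit points
    ray_dirs:     (N, 3) unit ray directions from sensor to hit points
    reflectivity: (N,) diffuse albedo of the hit surface in [0, 1]
    atten:        atmospheric attenuation coefficient (illustrative value)
    """
    cos_incidence = np.clip(-np.sum(normals * ray_dirs, axis=1), 0.0, 1.0)
    falloff = np.exp(-2.0 * atten * ranges) / np.maximum(ranges, 1e-3) ** 2
    return reflectivity * cos_incidence * falloff
```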