I have been working on autonomous vehicle technology for over a decade. Currently, I am exploring how recent advances in deep learning can solve some of the challenging problems in autonomous driving. It is not possible to pre-program an autonomous vehicle for every situation that may arise on the road, so autonomous vehicles must be intelligent enough to learn from what they observe, much as humans do. We tackle some of these problems using fused sensor data and deep neural networks: the fused multi-modal sensor data drives an unsupervised deep learning algorithm that detects and classifies the objects an autonomous vehicle observes in an urban environment. Detecting and classifying objects is essential for the safe operation of an autonomous vehicle. For example, it matters whether an object protruding from the ground plane on the pavement next to the car is a fire hydrant or a child, because the vehicle's response may differ based on this information.
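As a concrete illustration of one unsupervised ingredient such a pipeline might use, the sketch below clusters LiDAR points above the ground plane into object candidates without any labels. It is a minimal sketch, not our actual algorithm; the ground-height threshold and DBSCAN parameters are illustrative assumptions, and the random points stand in for a real scan.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Placeholder LiDAR scan: random x, y, z points over a 40 m x 40 m x 3 m volume.
points = np.random.rand(5000, 3) * np.array([40.0, 40.0, 3.0])

# Crude ground removal: drop points near z = 0 (a real pipeline would fit a plane).
above_ground = points[points[:, 2] > 0.3]

# Density-based clustering groups nearby points into object candidates
# using no labels at all, which is what makes this step unsupervised.
labels = DBSCAN(eps=0.7, min_samples=10).fit_predict(above_ground)
num_candidates = labels.max() + 1  # DBSCAN labels noise points as -1
print(f"{num_candidates} object candidates found")
```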
We have developed an innovative multi-task learning framework that performs depth estimation and semantic segmentation concurrently from a single camera. The proposed approach is based on a shared encoder-decoder architecture, which integrates several techniques to improve the accuracy of the depth estimation and semantic segmentation tasks without compromising computational efficiency. Additionally, it incorporates an adversarial training component, a Wasserstein GAN framework with a critic network, to refine the model's predictions.
Paper accepted at IROS 2024!
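To make the shared encoder-decoder idea concrete, here is a minimal PyTorch sketch of a single encoder feeding two task-specific decoder heads. This is an illustrative simplification, not the published architecture; the layer sizes, NUM_CLASSES, and the omission of the Wasserstein critic are all assumptions made for brevity.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 19  # placeholder, e.g. the Cityscapes label set

class MultiTaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared encoder: downsamples the image into a common feature map.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Depth head: upsamples back to input resolution; Softplus keeps depth positive.
        self.depth_head = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Softplus(),
        )
        # Segmentation head: produces per-pixel class logits.
        self.seg_head = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, NUM_CLASSES, 4, stride=2, padding=1),
        )

    def forward(self, x):
        features = self.encoder(x)  # computed once, shared by both tasks
        return self.depth_head(features), self.seg_head(features)

net = MultiTaskNet()
depth, seg_logits = net(torch.randn(1, 3, 256, 512))
print(depth.shape, seg_logits.shape)  # (1, 1, 256, 512) and (1, NUM_CLASSES, 256, 512)
```

Because the encoder runs once per frame and both heads reuse its features, the second task costs far less than running two separate networks, which is the efficiency argument behind the shared design.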
We have developed a novel system for 3D mapping and visual localization using 3D Gaussian Splatting. Our method uses LiDAR and camera data to create accurate and visually plausible representations of the environment. By initializing the training of the 3D Gaussian Splatting map with LiDAR data, our system constructs detailed and geometrically precise maps, avoiding common issues such as excessive memory usage and imprecise geometry. This initialization makes our method well suited to visual localization: correspondences between the query image and the image rendered from the Gaussian Splatting map are found efficiently via normalized cross-correlation (NCC), and the camera pose of the query image is then refined using the Perspective-n-Point (PnP) technique.
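The localization step can be sketched as follows: score patch matches between the query image and the rendered image with NCC, then hand the resulting 2D-3D correspondences to a PnP solver. This is a minimal sketch using OpenCV; the intrinsics and the correspondences below are random placeholders, not values from our system.

```python
import cv2
import numpy as np

def ncc(patch_a, patch_b):
    """Normalized cross-correlation between two equally sized image patches."""
    a = patch_a.astype(np.float64) - patch_a.mean()
    b = patch_b.astype(np.float64) - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum()) + 1e-12
    return float((a * b).sum() / denom)  # 1.0 means a perfect match

# Stand-ins for the output of NCC matching: pixels in the query image (pts_2d)
# paired with the 3D map points they correspond to (pts_3d).
pts_3d = np.random.rand(20, 3).astype(np.float64)
pts_2d = (np.random.rand(20, 2) * 500.0).astype(np.float64)
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])  # assumed pinhole intrinsics

# PnP with RANSAC recovers the camera rotation (rvec) and translation (tvec)
# from the 2D-3D correspondences, rejecting outlier matches along the way.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts_3d, pts_2d, K, None)
```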
Autonomous vehicles heavily rely on LiDAR sensors for perception tasks, where accurate intensity information is essential. However, obtaining real-world LiDAR data with intensity information is challenging and expensive. As a result, simulation has emerged as a promising alternative. Existing physics-based simulation approaches often oversimplify the complex relationship between LiDAR rays and the objects they interact with, leading to a large simulation-to-real gap. Hence, bridging the gap between simulated and real-world LiDAR intensity data is crucial for developing robust LiDAR perception algorithms in autonomous vehicles.
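For context, the kind of simplification at issue can be seen in the textbook intensity model many simulators use: Lambertian reflectance with inverse-square range falloff. The sketch below implements that oversimplified baseline; real returns also depend on material properties, beam divergence, atmospheric effects, and sensor internals, which is precisely the source of the simulation-to-real gap. All constants here are illustrative assumptions.

```python
import numpy as np

def lambertian_intensity(reflectivity, incidence_angle_rad, range_m, i0=1.0):
    """Idealized LiDAR return intensity under a Lambertian surface model.

    reflectivity:        surface albedo in [0, 1]
    incidence_angle_rad: angle between the laser ray and the surface normal
    range_m:             sensor-to-surface distance in meters
    i0:                  emitted-power scale factor (arbitrary units)
    """
    return i0 * reflectivity * np.cos(incidence_angle_rad) / range_m**2

# Example: a mid-reflectivity surface hit at 30 degrees from 10 m away.
print(lambertian_intensity(0.5, np.deg2rad(30.0), 10.0))
```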