In our experiments, we investigate how real-world conditions affect robot autonomy in heterogeneous environments. We used the EnvoDat dataset for four baseline applications: mapping, localisation, object detection, and classification. We designed three experiments that address the following research questions (RQs):
RQ1 - Is the performance of SOTA SLAM algorithms significantly degraded by environment-specific conditions, e.g., dynamic entities, varying illumination, opaque surfaces, and partial visibility?
RQ2 - To what extent does feature density or sparsity affect robotic autonomy and perception in heterogeneous environments?
RQ3 - How do the heterogeneity of real-world environments and the variability of object and terrain appearances, lighting conditions, non-standard objects, etc., observed in the majority of the EnvoDat scenes, affect object detector models trained on common household, urban, or controlled-environment datasets?
We addressed RQ1 by benchmarking five SLAM algorithms on EnvoDat - two visual SLAM systems (RTAB-Map and ORB-SLAM3), two graph-based LiDAR SLAM systems (HDL Graph SLAM and GLIM), and one filter-based LiDAR SLAM system (FAST-LIO2). We evaluated their performance using the following metrics (a computation sketch follows the list):
Absolute Trajectory Error (ATE): measures the global consistency of the entire trajectory.
Relative Pose Error (RPE): measures the local consistency between poses over fixed-length segments of the trajectory.
Scale Drift (SD): measures the deviation of the algorithm's estimated scale of the environment from the ground-truth scale over time.
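For reference, these metrics are commonly computed following the standard TUM RGB-D benchmark formulation; the exact alignment procedure and segment length used for EnvoDat are not restated here, so the following is a standard sketch rather than the paper's precise definitions. With estimated poses $P_i \in \mathrm{SE}(3)$, ground-truth poses $Q_i$, a rigid alignment transform $S$, and segment length $\Delta$:

\[
\mathrm{ATE}_i = \left\lVert \operatorname{trans}\!\left( Q_i^{-1}\, S\, P_i \right) \right\rVert,
\qquad
\mathrm{RPE}_i = \left\lVert \operatorname{trans}\!\left( \left( Q_i^{-1} Q_{i+\Delta} \right)^{-1} \left( P_i^{-1} P_{i+\Delta} \right) \right) \right\rVert,
\qquad
\mathrm{SD}_i = \frac{\left\lVert \operatorname{trans}\!\left( P_i^{-1} P_{i+\Delta} \right) \right\rVert}{\left\lVert \operatorname{trans}\!\left( Q_i^{-1} Q_{i+\Delta} \right) \right\rVert},
\]

where $\operatorname{trans}(\cdot)$ extracts the translational component, and per-sequence results are typically summarised as the RMSE over all $i$.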
We addressed RQ2 by evaluating the spatial distribution of feature points and correlating it with the per-point ATE and RPE. Figures (a)-(d) below show example correlations between the feature point distributions (clustered, sparse, and evenly distributed) and the per-point trajectory errors.
Correlation between feature point distributions (clustered, sparse, and evenly distributed) and per-point trajectory errors (ppATE and ppRPE). The top section shows feature density in the reconstructed map, with robot trajectories coloured by distance (blue: start, red: end). The bottom section shows the correlation between feature density and trajectory errors, with colour intensity representing error frequency. The figures show a complex, non-linear relationship between the errors and the feature densities, contrary to the common assumption that more features improve perception and SLAM accuracy.
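As an illustration of this analysis step, the following is a minimal sketch of how local feature density could be correlated with per-point trajectory error. All names (traj_xyz, feature_points, pp_error) and the neighbourhood radius are hypothetical, and a rank correlation is chosen because the relationship observed above is non-linear; this is not the paper's exact pipeline.

import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import spearmanr

def local_feature_density(traj_xyz, feature_points, radius=1.0):
    # Count map feature points within `radius` metres of each trajectory pose.
    tree = cKDTree(feature_points)
    return np.array([len(tree.query_ball_point(p, radius)) for p in traj_xyz])

def density_error_correlation(traj_xyz, feature_points, pp_error, radius=1.0):
    # Spearman rank correlation tolerates monotone non-linear relationships,
    # unlike Pearson's linear assumption.
    density = local_feature_density(traj_xyz, feature_points, radius)
    rho, p_value = spearmanr(density, pp_error)
    return rho, p_value

For example, density_error_correlation(traj_xyz, map_points, pp_ate, radius=2.0) returns the rank correlation and its p-value for a 2 m neighbourhood around each pose.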
We addressed RQ3 using the EnvoDat dataset by evaluating three object detector models: YOLOv8, Fast R-CNN, and Detectron2. We fine-tuned these pre-trained models on the annotated RGB images drawn from all the scenes in the EnvoDat dataset. Their performances are summarised below.
Qualitative training and validation results for the YOLOv8 model: class label distributions, labels correlogram, precision-recall curve, training results, and example training/validation batches with ground-truth labels and predictions.
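For concreteness, a fine-tuning run of this kind could look like the minimal sketch below, using the Ultralytics YOLOv8 API. The dataset config name (envodat.yaml), the model variant, and the hyperparameters are illustrative assumptions rather than the settings used in the paper, and the other two detectors would follow their own training pipelines.

from ultralytics import YOLO

# Start from COCO-pretrained weights; 'yolov8n.pt' is an illustrative choice.
model = YOLO("yolov8n.pt")

# 'envodat.yaml' is a hypothetical dataset config listing image paths,
# annotation locations, and class names for the EnvoDat scenes.
model.train(data="envodat.yaml", epochs=100, imgsz=640)

# Evaluate on the validation split; this reports precision, recall, and mAP,
# the quantities visualised in the figures above.
metrics = model.val()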