Terrain Traversability Prediction for Off-Road Autonomous Driving

We focus on the problem of estimating the traversability of terrain for autonomous off-road navigation. To produce dense and accurate local traversability maps, a robot must reason about both the geometric and semantic properties of its environment. To this end, we develop a novel Bird’s Eye View Network (BEVNet), a deep neural network that directly predicts dense traversability maps from sparse LiDAR inputs. BEVNet processes both geometric and semantic information in a temporally consistent fashion. More importantly, it uses learned priors and history to predict traversability in unseen space and into the future, allowing a robot to better appraise its situation. We evaluate BEVNet quantitatively on both on-road and off-road scenarios and show that it outperforms a variety of strong baselines.

Comparison with LiDAR Segmentation + Temporal Aggregation

video1_kitti.mp4

[SemanticKITTI] Our approach only

video2_kitti.mp4

[SemanticKITTI] Comparison against baseline

video1_rellis.mp4

[RELLIS-3D] Our approach only

video2_rellis.mp4

[RELLIS-3D] Comparison against baseline

Compared with the baseline (Cylinder3D + Temporal Aggregation), BEVNet-R learns to retain small dynamic objects while keeping the static regions smooth. Because BEVNet-R predicts future observations in a temporally consistent fashion, it outperforms the baseline on SemanticKITTI when evaluated on the full scene. BEVNet-R also outperforms the baseline in the off-road environment of RELLIS-3D, accurately predicting the BEV of the surrounding scene. The gap is especially large on RELLIS-3D, likely because that data lacks environmental structure.
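For reference, the baseline's temporal aggregation step can be sketched as follows. This is a naive version written for illustration only (the actual pipeline may also voxelize and downsample); the function name and array layout are our own assumptions:

```python
import numpy as np

def aggregate_scans(scans, poses):
    """Naive temporal aggregation: transform each past labeled scan into
    the current sensor frame using odometry, then stack the points.

    scans: list of (N_i, 4) arrays with columns [x, y, z, label]
    poses: list of 4x4 world-from-sensor transforms, one per scan
    """
    # Map world coordinates into the most recent sensor frame.
    current_from_world = np.linalg.inv(poses[-1])
    merged = []
    for pts, pose in zip(scans, poses):
        # Homogeneous coordinates for the rigid transform.
        xyz1 = np.c_[pts[:, :3], np.ones(len(pts))]
        xyz = (current_from_world @ pose @ xyz1.T).T[:, :3]
        merged.append(np.c_[xyz, pts[:, 3:]])
    return np.vstack(merged)

# Two single-point scans taken from the same pose simply stack.
scans = [np.array([[1.0, 0.0, 0.0, 2.0]]), np.array([[0.0, 1.0, 0.0, 3.0]])]
poses = [np.eye(4), np.eye(4)]
merged = aggregate_scans(scans, poses)  # shape (2, 4)
```

Note that this kind of aggregation trusts the odometry completely: any pose error is baked into the accumulated map, which is exactly the failure mode the learned recurrence in BEVNet-R is able to compensate for.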

Ablation study

From our ablation studies we observe that any form of recurrence, whether Temporal Aggregation or a ConvGRU, improves BEVNet's ability to predict the future. In particular, BEVNet-R outperforms all other approaches by a large margin. With noisy odometry the gap grows: unlike Temporal Aggregation, BEVNet-R uses learned recurrence to “fix” errors in the odometry and to adaptively forget history when the odometry error is too large.
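As a rough sketch of what a learned recurrence over BEV feature maps might look like, here is a convolutional GRU cell. The class name, channel count, and grid size below are illustrative assumptions, not the paper's actual implementation:

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Convolutional GRU cell over BEV feature maps (illustrative sketch).

    The hidden state is a (B, C, H, W) tensor that the network can
    adaptively update or forget at every grid cell via learned gates.
    """
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # Update (z) and reset (r) gates, computed jointly from [x, h].
        self.gates = nn.Conv2d(2 * channels, 2 * channels, kernel_size, padding=pad)
        # Candidate hidden state, computed from [x, r * h].
        self.cand = nn.Conv2d(2 * channels, channels, kernel_size, padding=pad)

    def forward(self, x, h):
        zr = torch.sigmoid(self.gates(torch.cat([x, h], dim=1)))
        z, r = zr.chunk(2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        # z interpolates between keeping the old state and the candidate.
        return (1 - z) * h + z * h_tilde

# Usage: feed per-scan BEV features and carry the hidden state across time.
cell = ConvGRUCell(channels=32)
h = torch.zeros(1, 32, 64, 64)
for _ in range(3):
    x = torch.randn(1, 32, 64, 64)
    h = cell(x, h)
```

Because the update gate is computed per grid cell, such a cell can learn to discount stale or misaligned history locally, which is one plausible mechanism behind the robustness to odometry noise described above.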

video3_kitti.mp4

[SemanticKITTI] Learned recurrence vs. basic Temporal Aggregation

video4_kitti.mp4

[SemanticKITTI] Noisy odometry comparison

video3_rellis.mp4

[RELLIS-3D] Learned recurrence vs. basic Temporal Aggregation

video4_rellis.mp4

[RELLIS-3D] Noisy odometry comparison

Effect of Odometry Noise

We vary the odometry noise level by introducing a scalar λ such that the rotation noise is drawn from 𝒩(0, (0.01λ)²) and the translation noise from 𝒩(0, (0.1λ)²). In the following videos we show the prediction results of BEVNet-R (trained at the 100% noise level) at different levels of odometry noise. BEVNet-R degrades gracefully as the noise level increases.
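The noise model above can be sampled as follows. The function name, units, and the choice to perturb yaw and 2D translation are our own assumptions for illustration; the text only specifies the two Gaussian distributions:

```python
import numpy as np

def sample_odometry_noise(lam, rng=None):
    """Sample per-step odometry perturbations at noise level lam.

    Matches the noise model above: rotation noise ~ N(0, (0.01*lam)^2)
    and translation noise ~ N(0, (0.1*lam)^2). Applying it to yaw
    (radians) and (x, y) translation (meters) is an assumption.
    """
    rng = np.random.default_rng() if rng is None else rng
    rot_noise = rng.normal(0.0, 0.01 * lam)             # e.g. yaw
    trans_noise = rng.normal(0.0, 0.1 * lam, size=2)    # e.g. (x, y)
    return rot_noise, trans_noise

# lam = 1.0 corresponds to the 100% noise level used during training;
# lam = 0.0 recovers clean odometry.
rot, trans = sample_odometry_noise(lam=1.0, rng=np.random.default_rng(0))
rot0, trans0 = sample_odometry_noise(lam=0.0)
```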

kitti.mp4

[SemanticKITTI]

rellis.mp4

[RELLIS-3D]

Real Robot Experiments

Campus

campus_autonomous_2020_11_12_19_14_34.mp4

Canal Road

canal_ouster_2_2020_08_16_17_02_53.mp4

Weeds

We highlight that the robot can traverse dense vegetation categorized as the medium-cost group. While still traversable, such terrain is naturally less preferable and potentially risky to drive over. This additional level of traversability lets the robot decide whether to cut through the medium-cost area to reach the target quickly or to take a lower-risk alternative route.

weeds2_2021_02_20_16_18_02.mp4

Using geometric features to determine traversability

The definition of traversability depends on the robot: vegetation of a certain height might be traversable for a large vehicle such as a Warthog but not for a smaller robot. With this in mind, we measure the height of bushes during the labeling process. Parts taller than a threshold H are marked as lethal obstacles, while shorter parts are labeled as medium cost. The height is measured from the closest ground level to the bush.

We train two networks with different thresholds: one keeps the medium-cost class regardless of height, and one remaps bushes taller than 0.5 m above the ground estimate to the lethal (non-traversable) class.
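The height-based remapping for the small-robot network can be sketched as follows. The label ids and function name are hypothetical; only the 0.5 m threshold and the medium-cost-to-lethal remapping come from the text:

```python
import numpy as np

# Illustrative label ids; the dataset's actual ids may differ.
MEDIUM_COST, LETHAL = 2, 3

def remap_bush_labels(labels, heights_above_ground, h_threshold=0.5):
    """Remap bush points taller than h_threshold (meters above the local
    ground estimate) from the medium-cost class to the lethal class.

    labels:               (N,) integer class id per point
    heights_above_ground: (N,) height of each point above the closest
                          ground level, as described in the text
    """
    labels = labels.copy()
    too_tall = (labels == MEDIUM_COST) & (heights_above_ground > h_threshold)
    labels[too_tall] = LETHAL
    return labels

# A 0.3 m bush stays medium cost; a 0.8 m bush becomes lethal.
labels = np.array([MEDIUM_COST, MEDIUM_COST, LETHAL])
heights = np.array([0.3, 0.8, 1.2])
remapped = remap_bush_labels(labels, heights)
print(remapped)  # [2 3 3]
```

Training one network per threshold, as described above, bakes the robot-specific notion of traversability into the labels rather than into the network architecture.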

Traversability map for a small robot. Note that high bushes are predicted as non-traversable.

Traversability map for a large robot. Most bushes are considered traversable.