SafeSteps: Learning Safer Footstep Planning Policies for Legged Robots via Model-Based Priors

Abstract

We present a footstep planning policy for quadrupedal locomotion that directly takes a-priori safety information into account in its decisions. At its core, a learning process analyzes terrain patches, classifying each landing location by its kinematic feasibility, shin collision, and terrain roughness. This information is then encoded into a small vector representation and passed as an additional state to the footstep planning policy, which furthermore proposes only safe footstep locations by applying a masked variant of the Proximal Policy Optimization (PPO) algorithm. The performance of the proposed approach is demonstrated through comparative simulations and experiments on an electric quadruped robot walking in different rough terrain scenarios. We show that violations of the above safety conditions are greatly reduced both during training and the subsequent deployment of the policy, resulting in an inherently safer footstep planner. Furthermore, we show how, as a byproduct, fewer reward terms are needed to shape the behavior of the policy, which in turn achieves both better final performance and sample efficiency.
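
In practice, such masking can be realized by zeroing out the probability of unsafe candidate cells before the action is sampled. Below is a minimal sketch of action masking for a categorical foothold policy, assuming PyTorch and a flattened heightmap of candidate cells; the function and variable names are illustrative and not the paper's code.

import torch
from torch.distributions import Categorical

def masked_foothold_distribution(logits: torch.Tensor,
                                 safety_mask: torch.Tensor) -> Categorical:
    """Categorical distribution over candidate footholds in which every
    cell flagged as unsafe receives zero probability.

    logits:      (batch, n_cells) raw policy outputs
    safety_mask: (batch, n_cells) booleans, True = safe cell
    """
    masked_logits = logits.masked_fill(~safety_mask, float("-inf"))
    return Categorical(logits=masked_logits)

# Illustrative usage: a 3x3 heightmap patch flattened to 9 candidate cells.
logits = torch.randn(1, 9)
mask = torch.tensor([[True, False, True, True, True, False, True, True, True]])
dist = masked_foothold_distribution(logits, mask)
action = dist.sample()            # only safe cells can be drawn
log_prob = dist.log_prob(action)  # enters the PPO surrogate loss as usual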

Safe Foothold Evaluation Criteria

Kinematic Reachability

We compare policies trained with naive reinforcement learning and with our approach to highlight that our footstep planning policy chooses footholds that can be reached without violating kinematic joint limits.
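
As a concrete illustration of this criterion, a foothold can be declared kinematically reachable when an inverse-kinematics solution exists with all joints inside their limits. The sketch below checks this for a simplified planar two-link leg; the link lengths and joint limits are placeholder values, not AlienGo parameters.

import numpy as np

def reachable(dx: float, dz: float,
              l1: float = 0.25, l2: float = 0.25,
              hip_limits=(-1.6, 1.6),
              knee_limits=(-2.7, -0.6)) -> bool:
    """Check whether a foothold offset (dx, dz) from the hip can be reached
    by a planar two-link leg without violating the given joint limits."""
    r2 = dx ** 2 + dz ** 2
    # Law of cosines for the knee; |c| > 1 means the target is out of reach.
    c = (r2 - l1 ** 2 - l2 ** 2) / (2 * l1 * l2)
    if abs(c) > 1.0:
        return False
    knee = -np.arccos(c)  # knee-backward configuration
    hip = np.arctan2(dz, dx) - np.arctan2(l2 * np.sin(knee),
                                          l1 + l2 * np.cos(knee))
    return (hip_limits[0] <= hip <= hip_limits[1]
            and knee_limits[0] <= knee <= knee_limits[1])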

Shin Collisions

Next, we compare policies trained with naive reinforcement learning and with our approach to highlight that our footstep planning policy chooses footholds that avoid collisions of the shin with the terrain.
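
One way to approximate this criterion is to test whether the segment from the prospective foot position up to the knee would intersect the terrain. The sketch below does this against a heightmap; the grid convention, clearance margin, and sampling density are assumptions for illustration only.

import numpy as np

def shin_collides(heightmap: np.ndarray, resolution: float,
                  foot: np.ndarray, knee: np.ndarray,
                  clearance: float = 0.02, n_samples: int = 20) -> bool:
    """Return True if the foot-to-knee segment dips below the terrain plus
    a clearance margin. foot and knee are (x, y, z) in the heightmap frame,
    where cell (i, j) covers position (i * resolution, j * resolution)."""
    # Skip the contact point itself, which lies on the terrain by definition.
    for t in np.linspace(0.15, 1.0, n_samples):
        p = foot + t * (knee - foot)
        i = int(np.clip(round(p[0] / resolution), 0, heightmap.shape[0] - 1))
        j = int(np.clip(round(p[1] / resolution), 0, heightmap.shape[1] - 1))
        if p[2] < heightmap[i, j] + clearance:
            return True
    return False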

Terrain Roughness

Next, we compare the two policies to highlight that our footstep planning policy chooses footholds that avoid edges in the terrain that could potentially lead to foot slippage.
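
A simple proxy for this criterion is a per-cell roughness score, for example the standard deviation of the heights in a small window around each cell, with high scores marking edges to be avoided. A hedged sketch follows; the window size and threshold are illustrative.

import numpy as np

def roughness_map(heightmap: np.ndarray, window: int = 3) -> np.ndarray:
    """Per-cell roughness: standard deviation of the heights in a local
    window. Large values indicate edges or uneven support surfaces."""
    pad = window // 2
    padded = np.pad(heightmap, pad, mode="edge")
    rough = np.zeros_like(heightmap, dtype=float)
    for i in range(heightmap.shape[0]):
        for j in range(heightmap.shape[1]):
            rough[i, j] = padded[i:i + window, j:j + window].std()
    return rough

# A cell passes the roughness criterion if its score stays below a threshold,
# e.g. safe = roughness_map(hm) < 0.01  (threshold in metres, illustrative).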

Reward-Driven Controller

Learning Visual Foothold Encoding and Classification

Learning Footstep Planning Policy

Adaptation to Strong Pushes/Disturbances

Flat Terrain

Both policies, the one trained with the naive approach and the one trained with our approach, are able to withstand disturbances on flat ground when compared against the baseline.

Rough Terrain

The policy trained with the naive approach fails to survive strong pushes on rough terrain, whereas our policy, with the embedded safety information, is able to carefully choose footholds that are safe.

Foothold Adaptation Visualization

Stair Climbing

Blocks

This section helps to visualize the action (footstep location) taken by the policy from the safety map generated by SaFE-Net (left) and the corresponding behaviour of the robot controlled by the MPC (right). Compared to the baseline (green), which chooses the safe footstep closest to the center of the heightmap, the learnt policy chooses farther safe footholds in the direction of the push to keep the robot's integrity intact.
The learnt policy also exhibits a new behaviour on stairs, skipping a stair while climbing in order to cope with external pushes.
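
For reference, the baseline rule mentioned above can be written compactly: among all cells marked safe, select the one closest to the centre of the heightmap, i.e. the nominal foothold. A minimal sketch, assuming the safety map is a boolean grid:

import numpy as np

def baseline_foothold(safety_map: np.ndarray):
    """Return the (row, col) of the safe cell closest to the centre of the
    heightmap, i.e. the nominal foothold; None if no cell is safe."""
    safe_cells = np.argwhere(safety_map)
    if len(safe_cells) == 0:
        return None
    centre = (np.array(safety_map.shape) - 1) / 2.0
    distances = np.linalg.norm(safe_cells - centre, axis=1)
    return tuple(safe_cells[np.argmin(distances)])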

Sim-to-Sim (RaiSim-to-Gazebo)

We deployed our policy, trained in RaiSim, on Gazebo using domain randomization techniques, and evaluated it by pushing the robot on pallets arranged in an irregular grid.
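
The domain randomization used for this transfer can be thought of as resampling a few physical parameters at every episode. The ranges below are purely illustrative placeholders, not the values used for the paper's experiments:

import numpy as np

# Illustrative per-episode randomization ranges (placeholders, not the
# values used in the paper) for bridging the RaiSim-to-Gazebo gap.
RANDOMIZATION_RANGES = {
    "ground_friction":      (0.5, 1.25),
    "base_mass_offset_kg":  (-1.0, 1.0),
    "motor_strength_scale": (0.9, 1.1),
    "push_force_N":         (0.0, 35.0),
}

def sample_episode_params(rng: np.random.Generator) -> dict:
    """Draw one value uniformly from every range at the start of an episode."""
    return {name: rng.uniform(lo, hi)
            for name, (lo, hi) in RANDOMIZATION_RANGES.items()}

params = sample_episode_params(np.random.default_rng(0))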

Experiments on AlienGo

Trotting Over Obstacles

Pulling while Trotting Over Obstacles

More Simulation Views

Resilience through adaptation by RL-VFA when pushed with a 35 N force along X and Y (indicated by blue and red arrows) on blocks.

Acknowledgments

This research is supported by and in collaboration with the Italian National Institute for Insurance against Accidents at Work (INAIL), under the project “Sistemi Cibernetici Collaborativi - Robot Teleoperativo 2”.

BibTex Citation

@inproceedings{omar23humanoids,
  author    = {Omar, Shafeef and Amatucci, Lorenzo and Turrisi, Giulio and Barasuol, Victor and Semini, Claudio},
  title     = {SafeSteps: Learning Safer Footstep Planning Policies for Legged Robots via Model-Based Priors},
  booktitle = {IEEE-RAS International Conference on Humanoid Robots},
  year      = {2023},
}

Questions?

Contact shafeef [dot] omar [at] iit [dot] it for more information