Advanced Skills by Learning Locomotion and Local Navigation End-to-End
Nikita Rudin, David Hoeller, Marko Bjelonic and Marco Hutter
Robotic Systems Lab, ETH Zurich & NVIDIA
Paper: arXiv
Local navigation and locomotion of legged robots are commonly split into separate modules.
In this work, we propose to combine them by training an end-to-end policy with deep reinforcement learning.
Training a policy in this way opens up a larger set of possible solutions, which allows the robot to learn more complex behaviors.
The locomotion controller is usually tasked with accurately tracking a commanded velocity. However, this limits the robot's capabilities.
With our approach, the robot needs to reach a target position within a provided time. The task's success is only evaluated at the end of an episode, meaning that the policy does not need to reach the target as fast as possible. It is free to select its path and the locomotion gait.
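To make the difference between the two task formulations concrete, the following is a minimal sketch and not the reward used in the paper. The function names, the exponential and inverse-distance shapes, and the parameters `sigma` and `reward_window` are all assumptions chosen for illustration. The only property taken from the text above is that the position-based task pays nothing until the end of the episode, so the policy is not pushed to move quickly or along a particular path.

```python
import math


def velocity_tracking_reward(v_cmd, v_actual, sigma=0.25):
    """Dense reward (hypothetical shape): paid at every step and highest
    when the measured base velocity matches the commanded velocity."""
    err_sq = sum((c - a) ** 2 for c, a in zip(v_cmd, v_actual))
    return math.exp(-err_sq / sigma)


def position_target_reward(t, episode_length, base_pos, target_pos,
                           reward_window=1.0):
    """Sparse-in-time reward (hypothetical shape): zero for most of the
    episode, then based only on the distance to the target during the
    final `reward_window` seconds."""
    if t < episode_length - reward_window:
        # Before the final window, any path or gait is equally acceptable.
        return 0.0
    dist_sq = sum((b - g) ** 2 for b, g in zip(base_pos, target_pos))
    return 1.0 / (1.0 + dist_sq)
```

With this sketch, a robot that stands still for the first seconds of the episode and then jumps a gap collects the same task reward as one that walks steadily, as long as both are at the target when the final window begins.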
This simple change enables the training of new behaviors:
Jumping over gaps
Climbing on boxes
Navigating obstacles
Climbing steep slopes
We transfer all policies to the real robot.
Policies trained with velocity tracking (V) usually end up learning a trotting gait.
Our policies (Pf) learn an interesting three-phased gait, which proves to be more energy efficient than trotting.