Learning Risk-Aware Quadrupedal Locomotion using Distributional Reinforcement Learning
Lukas Schneider, Jonas Frey, Takahiro Miki, Marco Hutter
International Conference on Robotics and Automation (ICRA) 2024
Our robot learns to adapt its locomotion behavior in risky situations. When commanded to walk up a large step, the risk-averse policy (◼) refuses while the risk-seeking policy (◼) complies. The risk parameter, which controls how the value distribution is distorted, can be adapted online during deployment. Right: the value distribution of each robot after distortion by the risk metric.
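To make the idea of a risk parameter distorting a value distribution concrete, the following is a minimal sketch and not the implementation behind the results shown here. It assumes the value distribution is represented by equally weighted quantile samples and uses conditional value at risk (CVaR) as one example of a distortion risk metric; the caption above does not name a specific metric. The function name `cvar_from_quantiles` and the parameter `alpha` are hypothetical.

```python
import math


def cvar_from_quantiles(quantiles, alpha):
    """Risk-distorted value: mean of the worst alpha-fraction of quantile samples.

    Assumption: `quantiles` are equally weighted samples of the return
    distribution. alpha = 1.0 gives the ordinary (risk-neutral) mean; smaller
    alpha focuses on the lowest returns, i.e. a more risk-averse estimate.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(quantiles)
    # Number of lowest-return samples kept in the tail (at least one).
    k = max(1, math.ceil(alpha * len(ordered)))
    tail = ordered[:k]
    return sum(tail) / len(tail)


# Hypothetical quantile estimates of the return for one state.
example_quantiles = [-4.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

risk_neutral = cvar_from_quantiles(example_quantiles, alpha=1.0)
risk_averse = cvar_from_quantiles(example_quantiles, alpha=0.25)
```

Because `alpha` is only an input to this function, it can be changed at any time without touching the quantile estimates, which loosely mirrors how a risk parameter can be adjusted online. CVaR covers only the risk-averse to risk-neutral range; a risk-seeking setting would need a different distortion.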
@INPROCEEDINGS{Schneider-ICRA-24,
AUTHOR = {Lukas Schneider and Jonas Frey and Takahiro Miki and Marco Hutter},
TITLE = {Learning Risk-Aware Quadrupedal Locomotion using Distributional Reinforcement Learning},
BOOKTITLE = {2024 International Conference on Robotics and Automation (ICRA)},
YEAR = {2024}
}
ANYmal was commanded forward towards boxes of varying height. Different behaviors arose depending on the risk sensitivity and step height.
We visualize the return distributions arising from different risk sensitivities in simulation.
Supplementary Video