Latent Action Priors for Locomotion with Deep Reinforcement Learning
Oliver Hausdörfer, Alexander von Rohr, Eric Lefort, Angela P. Schoellig
Code [GitHub] Paper [arXiv]
Abstract. Deep Reinforcement Learning (DRL) enables robots to learn complex behaviors through interaction with the environment. However, due to the unrestricted nature of the learning algorithms, the resulting solutions are often brittle and appear unnatural. This is especially true when learning direct torque control, where inductive biases are harder to incorporate than in position control. We propose an inductive bias for learning locomotion that is especially useful for torque control: latent actions learned from a small dataset of expert demonstrations. Combining latent action priors with a single style reward term for imitation leads to desirable locomotion patterns. Despite using only a small amount of demonstration data, the agent is not restricted to the reward levels of the expert demonstration, and we observe significant improvements in transfer tasks.
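To make the core idea concrete, here is a minimal sketch of how a latent action prior could be extracted from expert demonstrations and used to decode low-dimensional policy outputs into full torque commands. The paper does not specify this exact construction; the sketch below uses a simple linear (PCA-style) latent space fitted with SVD, and all dimensions, variable names, and the synthetic demonstration data are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a small expert demonstration dataset: 500 steps of
# a 12-DoF robot whose torques actually lie on a 3-dimensional manifold
# (e.g., legs moving in coordinated patterns). Purely illustrative.
latent_true = rng.standard_normal((500, 3))
mixing = rng.standard_normal((3, 12))
expert_actions = latent_true @ mixing  # shape (500, 12)


def fit_latent_prior(actions, latent_dim):
    """Fit a linear latent action space (mean + top principal directions)
    to the expert actions via SVD. In practice a nonlinear autoencoder
    could play the same role."""
    mean = actions.mean(axis=0)
    centered = actions - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:latent_dim]  # (latent_dim, action_dim)
    return mean, basis


def decode(z, mean, basis):
    """Map a low-dimensional latent policy output z back to a full
    torque command for the robot."""
    return mean + z @ basis


mean, basis = fit_latent_prior(expert_actions, latent_dim=3)

# Sanity check: encoding the expert actions into the latent space and
# decoding them back should reconstruct them almost exactly, since the
# data lie on a 3-dimensional subspace by construction.
z = (expert_actions - mean) @ basis.T
recon = decode(z, mean, basis)
print(np.allclose(recon, expert_actions, atol=1e-8))
```

During DRL training, the policy would then output the low-dimensional `z` instead of raw torques, with `decode` fixed, so exploration is restricted to the action manifold covered by the demonstrations.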