Env-Mani: Quadrupedal Robot Loco-Manipulation with Environment-in-the-Loop
Yixuan Li, Zan Wang, Wei Liang
Beijing Institute of Technology,
Yangtze Delta Region Academy of Beijing Institute of Technology
While dogs can retrieve objects from elevated surfaces by leveraging the external environment, such as climbing onto tables with support from their front legs, the ability of quadrupedal robots to perform similar tasks remains largely unexplored. In this work, we introduce a learning-based loco-manipulation system for quadrupedal robots, enabling them to use the external environment as support to extend their workspace and enhance their manipulation capabilities. Our unified framework allows legged robots to perform animal-like movement skills by leveraging visual input and proprioception while guiding their end-effectors to accurately track target positions. Additionally, we employ a carefully designed curriculum learning strategy to advance the system's training process. To the best of our knowledge, this is the first work that enables quadrupedal robots to climb onto elevated surfaces and manipulate objects by leveraging environmental support. We train the policy in simulation and conduct extensive experiments, demonstrating that our approach allows robots to manipulate previously inaccessible objects. Our work opens new possibilities for enhancing quadrupedal robot capabilities without requiring hardware modifications or additional costs.
Overview
We propose a data-driven reinforcement learning (RL) framework that enables quadrupedal robots to perform dog-like manipulation skills. The framework takes only depth sensing and proprioception as input and directly predicts the robot's actions. The policy outputs target joint positions, which a low-level controller converts into joint torques so the robot can execute the commands. To facilitate policy learning, we adopt a teacher-student scheme, and both policies are trained in a calibrated simulation environment with a curriculum learning strategy. Because the student policy relies only on depth images, learning complex skills directly from the raw simulation environment is hard. To assist the student, we therefore train a teacher policy with access to privileged 3D perception of the environment, such as scandots, and use it to supervise the student's learning. To encourage the robot to perform precise, animal-like movements, we design a set of reward functions that drive the high-level policy to output accurate actions. The sketches below illustrate these components.
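As a concrete illustration of the action interface, the following is a minimal sketch of converting policy-predicted target joint positions into joint torques with a PD law. The gains KP and KD and the 12-DoF layout are hypothetical placeholders, not the values used in our system.

```python
import numpy as np

# Hypothetical PD gains; real values depend on the robot's actuators.
KP = 28.0  # proportional gain (stiffness)
KD = 0.7   # derivative gain (damping)

def joint_targets_to_torques(q_target, q, q_dot, kp=KP, kd=KD):
    """Convert policy-predicted target joint positions into joint torques
    with a simple PD law: tau = kp * (q_target - q) - kd * q_dot."""
    return kp * (q_target - q) - kd * q_dot

# Example: a 12-DoF quadruped at rest tracking a small positional offset.
q = np.zeros(12)        # current joint positions
q_dot = np.zeros(12)    # current joint velocities
q_tgt = q + 0.05        # target positions predicted by the policy
tau = joint_targets_to_torques(q_tgt, q, q_dot)
```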
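The teacher-student scheme can be sketched as below: the teacher consumes proprioception plus privileged scandots, while the student consumes proprioception plus a depth image and imitates the teacher's actions. The network sizes, observation dimensions, and the MSE action-imitation loss are illustrative assumptions, not our exact architecture or distillation objective.

```python
import torch
import torch.nn as nn

class TeacherPolicy(nn.Module):
    """Teacher with privileged 3D perception (e.g., scandot heights)."""
    def __init__(self, n_prop=48, n_scan=132, n_act=12):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_prop + n_scan, 256), nn.ELU(),
            nn.Linear(256, 128), nn.ELU(),
            nn.Linear(128, n_act))

    def forward(self, prop, scan):
        return self.net(torch.cat([prop, scan], dim=-1))

class StudentPolicy(nn.Module):
    """Student that sees only proprioception and a depth image."""
    def __init__(self, n_prop=48, n_act=12):
        super().__init__()
        self.encoder = nn.Sequential(  # small depth-image encoder
            nn.Conv2d(1, 16, 5, stride=2), nn.ELU(),
            nn.Conv2d(16, 32, 3, stride=2), nn.ELU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Sequential(
            nn.Linear(32 + n_prop, 128), nn.ELU(),
            nn.Linear(128, n_act))

    def forward(self, prop, depth):
        z = self.encoder(depth)
        return self.head(torch.cat([z, prop], dim=-1))

def distill_step(teacher, student, opt, prop, scan, depth):
    """One supervision step: the student's actions imitate the teacher's."""
    with torch.no_grad():
        a_teacher = teacher(prop, scan)
    a_student = student(prop, depth)
    loss = nn.functional.mse_loss(a_student, a_teacher)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```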
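For the reward design, a common shaping choice for end-effector tracking is an exponential kernel on the position error, which stays dense far from the target and saturates near it. The sketch below shows one such term with an assumed scale sigma; it is an illustrative example, not the exact reward set used in our system.

```python
import numpy as np

def ee_tracking_reward(ee_pos, target_pos, sigma=0.25):
    """Dense end-effector tracking reward:
    exp(-||p_ee - p_target||^2 / sigma); sigma is a hypothetical scale."""
    err = np.sum((ee_pos - target_pos) ** 2)
    return np.exp(-err / sigma)
```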
Demo