Collaborative motion planning for multi-manipulator systems through
Reinforcement Learning and Dynamic Movement Primitives
Siddharth Singh*, Tian Xu* and Qing (Cindy) Chang
Intelligent Systems Lab
[Under Review]
Abstract
Robotic tasks often require multiple manipulators to enhance task efficiency and speed, but this increases complexity in terms of collaboration, collision avoidance, and the expanded state-action space. To address these challenges, we propose a multi-level approach combining Reinforcement Learning (RL) and Dynamic Movement Primitives (DMP) to generate adaptive, real-time trajectories for new tasks in dynamic environments using a demonstration library. This method ensures collision-free trajectory generation and efficient collaborative motion planning. We validate the approach through experiments in the PyBullet simulation environment with UR5e robotic manipulators.
Proposed Method
This work introduces a method that uses a single human demonstration to help multi-arm robots collaborate by generating trajectories in real time. The approach builds on Dynamic Movement Primitives (DMPs) and a two-level structure. The top level draws on a library of pre-recorded human demonstrations to generate a trajectory for each arm, while the lower level handles coordination and collision avoidance among the arms during execution. We also introduce an optimization process that tunes key parameters, and a formulation of the robot's motion in terms of the end-effector position. A simple rule-based method lets the arms cooperate in real time. We call the resulting method ONCol-DMP.
Our key contribution is combining skill learning with real-time path planning to allow robots to move efficiently and avoid obstacles while working together.
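To make the DMP building block concrete, the following is a minimal sketch of a standard one-dimensional discrete DMP: it fits forcing-term weights to one demonstrated trajectory and then replays the motion toward a new goal. It is not the ONCol-DMP formulation itself. The function names `fit_dmp` and `rollout`, the gains `alpha_z`, `beta_z`, `alpha_x`, and the number of basis functions are all illustrative assumptions, and the demonstration is a synthetic minimum-jerk curve rather than recorded human data.

```python
import numpy as np

def fit_dmp(y_demo, dt, n_basis=20, alpha_z=25.0, beta_z=6.25, alpha_x=4.0):
    """Fit a textbook 1-D discrete DMP to one demonstration (illustrative only)."""
    y_demo = np.asarray(y_demo, dtype=float)
    n_steps = len(y_demo)
    tau = (n_steps - 1) * dt                      # duration of the demonstration
    y0, g = y_demo[0], y_demo[-1]
    yd = np.gradient(y_demo, dt)                  # numerical velocity
    ydd = np.gradient(yd, dt)                     # numerical acceleration
    t = np.arange(n_steps) * dt
    x = np.exp(-alpha_x * t / tau)                # canonical phase, decays from 1
    # Forcing term that would make the spring-damper system follow the demo:
    # tau^2 * ydd = alpha_z * (beta_z * (g - y) - tau * yd) + f
    f_target = tau**2 * ydd - alpha_z * (beta_z * (g - y_demo) - tau * yd)
    # Gaussian basis functions spaced evenly in time, i.e. exponentially in phase
    centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
    widths = 1.0 / np.diff(centers) ** 2
    widths = np.append(widths, widths[-1])
    psi = np.exp(-widths * (x[:, None] - centers) ** 2)   # shape (n_steps, n_basis)
    # Locally weighted regression, one weight per basis function
    s = x * (g - y0)
    num = (psi * (s * f_target)[:, None]).sum(axis=0)
    den = (psi * (s**2)[:, None]).sum(axis=0) + 1e-10
    return {"w": num / den, "centers": centers, "widths": widths,
            "alpha_z": alpha_z, "beta_z": beta_z, "alpha_x": alpha_x}

def rollout(dmp, y0, g, dt, tau, duration=None):
    """Integrate the DMP with explicit Euler steps toward goal g."""
    duration = tau if duration is None else duration
    y, z, x = float(y0), 0.0, 1.0                 # z is the scaled velocity tau * yd
    traj = [y]
    for _ in range(int(round(duration / dt))):
        psi = np.exp(-dmp["widths"] * (x - dmp["centers"]) ** 2)
        f = (psi @ dmp["w"]) / (psi.sum() + 1e-10) * x * (g - y0)
        zd = (dmp["alpha_z"] * (dmp["beta_z"] * (g - y) - z) + f) / tau
        yd = z / tau
        z += zd * dt
        y += yd * dt
        x += -dmp["alpha_x"] * x / tau * dt       # canonical system
        traj.append(y)
    return np.array(traj)

# Synthetic minimum-jerk "demonstration" from 0 to 1 over one second
t = np.linspace(0.0, 1.0, 101)
demo = 10 * t**3 - 15 * t**4 + 6 * t**5
dmp = fit_dmp(demo, dt=0.01)
# Replay the learned shape toward a new goal, running slightly past tau to settle
new_traj = rollout(dmp, y0=0.0, g=2.0, dt=0.01, tau=1.0, duration=1.5)
```

In a multi-arm setting, one such DMP would typically be fitted per Cartesian axis of each end-effector, with the goal `g` set per task. The sketch leaves out collision handling entirely.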
Experiments
Avoiding collisions among arms
Comparison of the end-effector DMP trajectories with the improved ONCol-DMP (top row) vs. without the collaborative phase control term (bottom row).
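The idea of a phase control term can be illustrated with a small sketch, under the assumption that it acts like the well-known DMP phase-stopping mechanism: the canonical phase of one arm advances more slowly whenever the distance between the end-effectors falls below a safety threshold. The exact term used in ONCol-DMP is not given here, so the rule, the names `phase_step` and `integrate_phase`, and the parameters `d_safe` and `gain` are hypothetical.

```python
import numpy as np

def phase_step(x, dt, tau, dist, d_safe=0.15, gain=50.0, alpha_x=4.0):
    """Advance the canonical phase by one Euler step.

    Assumed rule (not taken from the paper): when the end-effector distance
    `dist` drops below `d_safe`, the phase rate is divided by a factor that
    grows with the violation, so the yielding arm slows along its path.
    """
    violation = max(0.0, d_safe - dist)
    slowdown = 1.0 + gain * violation             # equals 1 when the arms are far apart
    return x + dt * (-alpha_x * x / (tau * slowdown))

def integrate_phase(dists, dt, tau):
    """Integrate the phase over a sequence of measured inter-arm distances."""
    x = 1.0
    for d in dists:
        x = phase_step(x, dt, tau, float(d))
    return x

# Phase after one second when the arms stay far apart vs. stay close
x_far = integrate_phase(np.full(100, 1.00), dt=0.01, tau=1.0)
x_near = integrate_phase(np.full(100, 0.05), dt=0.01, tau=1.0)
```

With these hypothetical settings, `x_near` stays larger than `x_far`, meaning the yielding arm has progressed less far along its trajectory while the other arm passes. Because only the timing changes, the spatial shape of the learned motion is preserved.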
Distinct Tasks
Block Stacking
PyBullet setup showing collaborative stacking of blocks with two arms.
Trajectory evolution of the two end-effectors over time. (Top row) The (x, y, z) end-effector positions at three equally spaced time intervals. (Bottom three rows) Plots of the x, y, and z positions vs. time (s) for the two end-effectors.