Collaborative motion planning for multi-manipulator systems through Reinforcement Learning and Dynamic Movement Primitives

Siddharth Singh*, Tian Xu* and Qing (Cindy) Chang

Intelligent Systems Lab

[Under Review]

 

Abstract

Robotic tasks often require multiple manipulators to enhance task efficiency and speed, but this increases complexity in terms of collaboration, collision avoidance, and the expanded state-action space. To address these challenges, we propose a multi-level approach combining Reinforcement Learning (RL) and Dynamic Movement Primitives (DMP) to generate adaptive, real-time trajectories for new tasks in dynamic environments using a demonstration library. This method ensures collision-free trajectory generation and efficient collaborative motion planning. We validate the approach through experiments in the PyBullet simulation environment with UR5e robotic manipulators.

Proposed Method

Overview of the proposed approach. The higher level, using Q-learning, generates independent motion plans. The proposed Collab-DMP ensures collision avoidance and can also control the sequence of operations.
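To make the higher-level idea concrete, here is a minimal sketch of a tabular Q-learning update on a toy problem. The page does not specify the states, actions, or rewards used in the actual planner, so everything below (the five-cell corridor, the two actions, the reward of 1.0 at the goal, and the names `step`, `q_update`, `train`, and `greedy_policy`) is a hypothetical stand-in chosen only to show the update rule.

```python
import random

N_STATES, GOAL = 5, 4      # toy corridor: cells 0..4, goal at cell 4 (assumed)
ACTIONS = (0, 1)           # 0 = step left, 1 = step right (assumed)


def step(state, action):
    """Deterministic toy transition; reward 1.0 only when the goal is reached."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def q_update(q, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update, applied in place."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def train(episodes=500, max_steps=200, seed=0):
    """Learn Q-values from a uniformly random behaviour policy (Q-learning is off-policy)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = rng.choice(ACTIONS)
            s_next, r, done = step(s, a)
            q_update(q, s, a, r, s_next, done)
            s = s_next
            if done:
                break
    return q


def greedy_policy(q):
    """Greedy action for every non-goal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]
```

Calling `greedy_policy(train())` returns the learned action for each non-goal cell. In a multi-arm setting, one such learner per arm would correspond to the "independent motion plans" mentioned in the caption, but that mapping is an assumption, not something stated on this page.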

This work introduces a new method that uses a single human demonstration to help multi-arm robots work together by generating paths in real time. Our approach uses Dynamic Movement Primitives (DMPs) in a two-level structure. The top level uses a set of pre-recorded human movements to generate a path for each arm, while the lower level handles smooth coordination and collision avoidance during execution. We also introduce an optimization process to tune key parameters and a new way of handling robot motion based on the position of each arm's end-effector. A simple rule-based method ensures the robots can cooperate effectively in real time. We call this improved method ONCol-DMP.
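The lower-level idea can be illustrated with a small sketch: two point DMPs whose paths cross, where one arm's time constant is increased while the end-effectors are close. This assumes the common Ijspeert-style discrete DMP formulation and a simple distance-based slowdown rule; the page does not give the actual ONCol-DMP equations, so the class `DMP`, the function `rollout_pair`, and the parameters `d_safe` and `gain` are hypothetical stand-ins, not the authors' collaborative term.

```python
import numpy as np


class DMP:
    """Discrete DMP for a Cartesian point (assumed Ijspeert-style form)."""

    def __init__(self, y0, goal, weights=None, n_basis=10, alpha_z=25.0, alpha_x=4.0):
        self.y0 = np.asarray(y0, dtype=float)
        self.goal = np.asarray(goal, dtype=float)
        self.alpha_z, self.beta_z, self.alpha_x = alpha_z, alpha_z / 4.0, alpha_x
        # Gaussian basis functions spaced along the decaying phase variable x.
        self.centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
        self.widths = n_basis ** 1.5 / self.centers / alpha_x  # common heuristic
        # Zero weights give a plain point attractor; a demonstration would set them.
        self.weights = (np.zeros((self.y0.size, n_basis)) if weights is None
                        else np.asarray(weights, dtype=float))
        self.reset()

    def reset(self):
        self.y, self.v, self.x = self.y0.copy(), np.zeros_like(self.y0), 1.0

    def step(self, dt, tau=1.0):
        """Advance one Euler step; a larger tau slows the whole primitive."""
        psi = np.exp(-self.widths * (self.x - self.centers) ** 2)
        forcing = (self.weights @ psi) / (psi.sum() + 1e-10) * self.x * (self.goal - self.y0)
        dv = (self.alpha_z * (self.beta_z * (self.goal - self.y) - self.v) + forcing) / tau
        self.v += dv * dt
        self.y += self.v / tau * dt
        self.x += -self.alpha_x * self.x / tau * dt
        return self.y.copy()


def rollout_pair(dmp_a, dmp_b, dt=0.01, steps=600, d_safe=0.5, gain=50.0):
    """Roll out two DMPs; arm B is slowed while the end-effectors are closer than d_safe."""
    traj_a, traj_b, min_dist = [], [], np.inf
    for _ in range(steps):
        dist = float(np.linalg.norm(dmp_a.y - dmp_b.y))
        min_dist = min(min_dist, dist)
        tau_b = 1.0 + gain * max(0.0, d_safe - dist)  # assumed slowdown rule
        traj_a.append(dmp_a.step(dt))
        traj_b.append(dmp_b.step(dt, tau=tau_b))
    return np.array(traj_a), np.array(traj_b), min_dist
```

For example, `rollout_pair(DMP([0, 0], [1, 1]), DMP([1, 0], [0, 1]))` rolls out two crossing paths, and passing `gain=0.0` disables the coupling for comparison. Slowing only one arm gives it a fixed lower priority; a rule that can also reorder the arms would need more logic than this sketch contains.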

Our key contribution is combining skill learning with real-time path planning to allow robots to move efficiently and avoid obstacles while working together.


Experiments

Avoiding collision amongst arms

Setup in the PyBullet environment for crossing arms. The top row shows the trajectory with the proposed ONCol-DMP, whereas the bottom row shows a traditional DMP without the collaborative term.

Comparison of the end-effector DMP trajectories with the improved ONCol-DMP (top row) versus a DMP without the collaborative phase-control term (bottom row).

Distinct Tasks

Block Stacking

PyBullet setup showing collaborative stacking of blocks with two arms.

Trajectory evolution of the two end-effectors over time. (Top row) The (x, y, z) positions of the end-effectors after three equal time intervals. (Bottom three rows) Plots of the x, y, and z positions versus time (s) for the two end-effectors.