Fast Trajectory Planner with a Reinforcement Learning-Based Controller for Robotic Manipulators
Yongliang Wang, Hamidreza Kasaei
Department of Artificial Intelligence, Bernoulli Institute, University of Groningen
The ability to quickly generate obstacle avoidance trajectories in unstructured and obstructed environments remains a significant challenge for robotic manipulators. This paper highlights the strong potential of model-free reinforcement learning methods over model-based approaches for obstacle-free trajectory planning in joint space. We propose a fast trajectory planning system for manipulators that integrates vision-based path planning in task space with reinforcement learning-based obstacle avoidance in joint space. The paper introduces enhancements to the Proximal Policy Optimization (PPO) algorithm, namely Action Ensembles (AE) and Policy Feedback (PF), which significantly improve precision and stability for goal-reaching and obstacle avoidance in joint space. These enhancements make PPO more adaptable to a variety of robotic tasks, thereby boosting performance. Additionally, we integrate the Fast Segment Anything (FSA) model with B-spline-optimized kinodynamic path searching to develop a vision-based trajectory planner in task space. Experimental results demonstrate the effectiveness of the PPO enhancements, Sim-to-Sim transfer for model robustness, and planner efficiency in complex scenarios. Together, these components allow the robot to perform obstacle avoidance and real-time trajectory planning in obstructed environments.
We aim to enhance the capability of manipulators for safe and efficient motion planning in environments with both static and dynamic obstacles. The primary contribution is the development of an integrated vision-based trajectory planner coupled with an enhanced RL Joint Space Controller, enabling manipulators to achieve goal-oriented motion with obstacle avoidance.
We first describe vision-based trajectory planning. We then outline the construction of the improved PPO algorithm and its role in enhancing the performance of the reaching task with obstacle avoidance for manipulators. Finally, these elements are combined to form a fast trajectory planner.
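As a rough illustration of the B-spline step in the vision-based planner, the sketch below fits a uniform cubic B-spline to a set of path points and samples a smooth trajectory from it. The waypoints and sampling density are hypothetical; this is a minimal sketch of generic B-spline smoothing, not the paper's exact kinodynamic path-searching pipeline.

```python
import numpy as np

def cubic_bspline(ctrl, n_samples=50):
    """Sample a uniform cubic B-spline defined by control points.

    ctrl: (n, d) array of control points (n >= 4), e.g. 3D waypoints
    from a path search. Returns (n_samples, d) smoothed trajectory.
    """
    ctrl = np.asarray(ctrl, dtype=float)
    m = len(ctrl) - 3  # number of cubic segments
    samples = []
    for t in np.linspace(0.0, m, n_samples, endpoint=False):
        i = int(t)       # active segment index
        u = t - i        # local parameter in [0, 1)
        # Standard uniform cubic B-spline basis functions
        basis = np.array([
            (1 - u) ** 3,
            3 * u**3 - 6 * u**2 + 4,
            -3 * u**3 + 3 * u**2 + 3 * u + 1,
            u**3,
        ]) / 6.0
        samples.append(basis @ ctrl[i:i + 4])
    return np.array(samples)

# Hypothetical task-space waypoints from a kinodynamic search
waypoints = [[0.0, 0.0, 0.2], [0.1, 0.2, 0.3], [0.3, 0.3, 0.4],
             [0.5, 0.2, 0.4], [0.6, 0.0, 0.3]]
trajectory = cubic_bspline(waypoints)  # shape (50, 3)
```

Because the basis functions are non-negative and sum to one, every sampled point stays inside the convex hull of the active control points, which keeps the smoothed trajectory close to the searched path.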
Vision-based Trajectory Planning in Task Space
RL-based Joint Space Controller for Obstacle Avoidance
PPO with various AE methods
PPO with PF method
PPO performance with both AE and PF
PPO_PF_AEL with different alpha
PPO_PF_AEP with different alpha
PPO_PF_AEB with different alpha
PPO_PF_AEE with different alpha
PPO_PF_AEW with different alpha
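The figures above compare action-ensemble variants under different values of the mixing coefficient alpha. As an illustrative guess at how an action ensemble with such a coefficient might operate (the exact AE formulations are not reproduced here), the sketch below blends the deterministic mean of a Gaussian policy with the average of several sampled actions.

```python
import numpy as np

rng = np.random.default_rng(0)

def ensemble_action(mean, std, k=5, alpha=0.5):
    """Blend the policy mean with an ensemble of k sampled actions.

    alpha weights the ensemble average against the deterministic
    mean. This is a hypothetical interpretation of an action
    ensemble, not the paper's exact AE variants (AEL/AEP/AEB/AEE/AEW).
    """
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    # Draw k actions from the Gaussian policy N(mean, std^2)
    samples = rng.normal(mean, std, size=(k, mean.shape[0]))
    return alpha * samples.mean(axis=0) + (1 - alpha) * mean

# Example: a 7-DoF joint-velocity action
action = ensemble_action(np.zeros(7), 0.1 * np.ones(7), k=5, alpha=0.3)
```

Averaging several samples reduces the variance of the executed action while alpha preserves a tunable amount of exploration relative to the deterministic policy output.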
Comparison of accumulated reward over 5 random seeds for our method and the baselines.
Comparison of success rates on the reaching task with obstacles
Comparison of training time across methods
(*The videos show a single random trial, while the statistics are derived from averages across 100 trials.)
Task 1: No obstacles other than the robot's head and body
Task 2: Two obstacles in addition to the head and body
Task 3: An environment cluttered with obstacles
Task 4: A moving obstacle near the goal
Task 5: The goal is changed