Although RL is a learning strategy rather than a data model, it strongly influences the data requirement and the control performance; therefore, papers involving RL are summarized here. In general, RL copes with high-level tasks by exploring the environment and exploiting the data collected during exploration. This strategy can train an agent for complicated tasks but typically requires a long learning time.
Before NNs became the common agent in RL, Q-tables and statistical models were viable choices. With the help of NNs, RL not only accomplishes simple tasks such as position reaching and trajectory following but also addresses more complex problems such as gait design. Various RL-specific strategies, such as actuation- and state-space discretization, are applied to exploit the properties of soft robots and to cope with modeling and control difficulties. One of the greatest challenges of RL is exploration in the real world, which is time-consuming and may damage the robot; therefore, simulation is often utilized in the training process.
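As a concrete illustration of the tabular agents mentioned above, the sketch below shows a standard one-step Q-learning update over discretized states and actions. It is a generic example rather than the method of any specific paper in Table 5; the state and action counts, the hyperparameters, and the environment interface (reset/step) are hypothetical placeholders.

```python
# Minimal sketch of a tabular Q-learning agent on discretized state and action
# spaces, illustrating what "Q-table as the RL agent" means in practice.
# All sizes, hyperparameters, and the env interface are assumptions.
import numpy as np

n_states = 100      # e.g. a discretized workspace
n_actions = 4       # e.g. a discretized actuation set
alpha, gamma, epsilon = 0.1, 0.95, 0.1

Q = np.zeros((n_states, n_actions))  # the Q-table itself


def choose_action(state, rng):
    """Epsilon-greedy action selection over the current Q-table."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[state]))


def q_update(state, action, reward, next_state):
    """Standard one-step temporal-difference Q-learning update."""
    td_target = reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (td_target - Q[state, action])


# Training loop (env is assumed to expose reset() -> state and
# step(action) -> (next_state, reward, done)):
# rng = np.random.default_rng(0)
# for episode in range(n_episodes):
#     state = env.reset()
#     done = False
#     while not done:
#         action = choose_action(state, rng)
#         next_state, reward, done = env.step(action)
#         q_update(state, action, reward, next_state)
#         state = next_state
```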
Compared with other methods, RL requires the largest amount of data. More critically, a predefined agent and interaction with the environment are necessary. At this high cost, however, RL is able to fulfill complex and high-level tasks. With adaptation strategies such as discretization and simulation-to-real transfer learning, the time and resource costs can be reduced to some extent.
Table 5. Reinforcement Learning Paper Comparison.
Paper list:
A GPR-based method, the Gaussian process temporal difference method, is employed to control a soft arm.
As the RL agent, a GMM is trained to estimate robot shape and contact.
For robotic catheter control inside a narrow tube, a joint probability distribution is learned considering various variables like tip and entrance points, touch state, and action.
This paper exploits Q-learning for sophisticated control tasks such as turning a handwheel, unscrewing a bottle cap, and drawing a line with a ruler.
A soft snake robot is controlled to move on the ground, arrive at target positions, and avoid obstacles with an NN as the agent.
The authors fuse visual and shape information with an NN in RL and control a flexible endoscope during navigation.
In RL, the workspace is discretized into a 3D grid with a resolution of 0.01 m (see the grid-mapping sketch after this list).
With the help of RL, the soft robot in this work is able to keep its end position fixed while changing its orientation.
Constant curvature (CC), a soft-robot modeling method, first provides a simulation environment, and the NN agent then continues to learn in the real world using the deep deterministic policy gradient (DDPG) method.
LSTM is utilized for the forward modeling of segmented pneumatic robots; the RL agents are then trained and validated in reality.
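To make the workspace discretization mentioned in the paper list more concrete (the work using a 0.01 m grid), the following minimal sketch maps continuous end-effector positions to discrete grid cells and back. Only the 0.01 m resolution comes from that description; the workspace bounds and the function names are hypothetical assumptions.

```python
# Minimal sketch of discretizing a continuous workspace into a 3D grid for a
# tabular or discrete-state RL agent. Bounds and names are illustrative only.
import numpy as np

RESOLUTION = 0.01                        # grid cell size in metres (from the cited work)
LOWER = np.array([0.0, 0.0, 0.0])        # assumed workspace lower corner (m)
UPPER = np.array([0.3, 0.3, 0.3])        # assumed workspace upper corner (m)

n_cells = np.ceil((UPPER - LOWER) / RESOLUTION).astype(int)


def position_to_cell(p):
    """Map a continuous end-effector position (m) to a discrete grid index."""
    idx = np.floor((np.asarray(p) - LOWER) / RESOLUTION).astype(int)
    return tuple(np.clip(idx, 0, n_cells - 1))


def cell_to_position(idx):
    """Map a grid index back to the centre of its cell (m)."""
    return LOWER + (np.asarray(idx) + 0.5) * RESOLUTION


# Example: the discrete state seen by the RL agent for a given tip position.
# position_to_cell([0.12, 0.07, 0.21])  -> (12, 7, 21)
```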