DexPBT: Scaling up Dexterous Manipulation for Hand-Arm Systems with Population Based Training
Aleksei Petrenko, Arthur Allshire, Gavriel State, Ankur Handa, Viktor Makoviychuk
Abstract
In this work we propose algorithms and methods that enable learning dexterous object manipulation using simulated one- or two-armed robots equipped with multi-fingered hand end-effectors. Using a parallel GPU-accelerated physics simulator (Isaac Gym), we implement challenging tasks for these robots including regrasping, grasp-and-throw, and object reorientation. To solve these problems we introduce a decentralized Population-Based Training (PBT) algorithm that allows us to massively amplify the exploration capabilities of deep reinforcement learning. We find that this method significantly outperforms regular end-to-end learning and is able to discover robust control policies even for the most challenging tasks.
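For readers unfamiliar with Population-Based Training, the sketch below illustrates the generic exploit/explore idea behind PBT: low-scoring population members copy the weights of high-scoring ones and then perturb their hyperparameters. It is a minimal illustration under stated assumptions, not the decentralized algorithm proposed in the paper. The `Agent` class, the `pbt_step` function, the 30% cutoffs, and the 1.2 perturbation factor are all hypothetical choices made for this example.

```python
import copy
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical population member: hyperparameters, weights, and a score."""
    hparams: dict
    weights: list = field(default_factory=list)
    score: float = 0.0


def pbt_step(population, rng, bottom_frac=0.3, top_frac=0.3, perturb=1.2):
    """Run one generic exploit/explore round, modifying agents in place.

    The fractions and the perturbation factor are illustrative assumptions,
    not values taken from the paper.
    """
    ranked = sorted(population, key=lambda a: a.score, reverse=True)
    n_top = max(1, int(len(ranked) * top_frac))
    n_bottom = max(1, int(len(ranked) * bottom_frac))
    top, bottom = ranked[:n_top], ranked[-n_bottom:]

    for agent in bottom:
        donor = rng.choice(top)
        if donor is agent:
            # Only possible for a population of one; nothing to copy.
            continue
        # Exploit: take over the donor's weights and score.
        agent.weights = copy.deepcopy(donor.weights)
        agent.score = donor.score
        # Explore: randomly scale each of the donor's hyperparameters up or down.
        agent.hparams = {
            name: value * rng.choice([perturb, 1.0 / perturb])
            for name, value in donor.hparams.items()
        }
    return population


# Toy usage: 16 agents whose score equals their index.
rng = random.Random(0)
population = [
    Agent(hparams={"lr": 3e-4}, weights=[i], score=float(i)) for i in range(16)
]
pbt_step(population, rng)
```

In a real training setup the score would come from periodic policy evaluation, and the weights would be network checkpoints rather than a toy list.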
Citation
@article{petrenko2023dexpbt,
  author     = {Aleksei Petrenko and
                Arthur Allshire and
                Gavriel State and
                Ankur Handa and
                Viktor Makoviychuk},
  title      = {DexPBT: Scaling up Dexterous Manipulation for Hand-Arm Systems with Population Based Training},
  journal    = {CoRR},
  volume     = {abs/2305.12127},
  year       = {2023},
  url        = {https://doi.org/10.48550/arXiv.2305.12127},
  eprinttype = {arXiv},
  eprint     = {2305.12127},
}
Supplementary Video
Additional Agent Demonstrations
Single-Arm Tasks
Dual-Arm Tasks
Reorientation - Alternative Behavior
This policy, trained using Population-Based Training with a population of 16 agents, discovered an alternative strategy: it avoids some in-hand rotations and instead places the object back on the table to rotate it there, with no risk of dropping it.