AsymDex: Asymmetry and Relative Coordinates for
RL-based Bimanual Dexterity
Anonymous authors
We present Asymmetric Dexterity (AsymDex), a novel and simple reinforcement learning (RL) framework that can efficiently learn a large class of bimanual skills in multi-fingered hands without relying on demonstrations. Two crucial insights enable AsymDex to reduce the observation and action space dimensions and improve sample efficiency. First, true ambidexterity is rare in humans and most of us exhibit strong “handedness”. Inspired by this observation, we assign complementary roles to each hand: the facilitating hand repositions and reorients one object, while the dominant hand performs complex manipulations to achieve the desired result (e.g., opening a bottle cap, or pouring liquids). Second, controlling the relative motion between the hands is crucial for coordination and synchronization of the two hands. As such, we design relative observation and action spaces and leverage a relative-pose tracking controller. Further, we propose a two-phase decomposition in which AsymDex can be readily integrated with recent advances in grasp learning to facilitate both the acquisition and manipulation of objects using two hands. Unlike existing RL-based methods for bimanual dexterity with multi-fingered hands, which are either sample inefficient or tailored to a specific task, AsymDex can efficiently learn a wide variety of bimanual skills that exhibit asymmetry. Detailed experiments on seven asymmetric bimanual dexterous manipulation tasks (four simulated and three real-world) reveal that AsymDex consistently outperforms strong baselines that challenge our design choices.
AsymDex has two crucial ingredients, asymmetry and relative coordinates:
Asymmetry: we assign complementary roles to the two hands. The facilitating hand repositions and reorients one object, while the dominant hand performs complex manipulations to achieve the desired result, for example opening a bottle cap or pouring liquids.
Relative Coordinates: we design relative observation and action spaces and leverage a relative-pose tracking controller for better coordination and synchronization of the two hands.
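To make the idea of relative coordinates concrete, the following is a minimal sketch of how the pose of one hand can be expressed in the frame of the other. It is an illustration, not the AsymDex implementation: the function names, the (w, x, y, z) unit-quaternion convention, and the choice of the facilitating hand as the reference frame are all assumptions made here.

```python
# Illustrative sketch only: names, quaternion convention (w, x, y, z),
# and the choice of reference frame are assumptions, not taken from AsymDex.

def quat_conj(q):
    """Conjugate of a quaternion; equals the inverse for unit quaternions."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def quat_mul(a, b):
    """Hamilton product a * b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def rotate(q, v):
    """Rotate a 3D vector v by the unit quaternion q."""
    p = (0.0, v[0], v[1], v[2])
    _, x, y, z = quat_mul(quat_mul(q, p), quat_conj(q))
    return (x, y, z)

def relative_pose(base_pos, base_quat, other_pos, other_quat):
    """Pose of `other` expressed in the frame of `base`."""
    inv = quat_conj(base_quat)
    diff = tuple(o - b for o, b in zip(other_pos, base_pos))
    rel_pos = rotate(inv, diff)            # position in the base frame
    rel_quat = quat_mul(inv, other_quat)   # orientation in the base frame
    return rel_pos, rel_quat
```

One property worth noting is that this relative pose does not change when both hands are translated together, which is one reason a relative representation can be more compact than two independent world-frame poses.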
We also leverage the observation that bimanual manipulation in practice is composed of two distinct phases:
The acquisition phase in which objects are grasped from surfaces.
The interaction phase where the two hands coordinate to perform the bimanual task.
Unlike many existing methods that ignore the acquisition phase, we show that this decomposition allows AsymDex to be seamlessly integrated with learned grasping policies for fluent execution.
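The two-phase decomposition can be pictured as a small controller that hands over from a grasping policy to an interaction policy. The sketch below is hypothetical: the class name, the `is_grasped` predicate, and the one-way switch from acquisition to interaction are assumptions made for illustration and are not described in the source.

```python
# Illustrative sketch only: TwoPhaseController, is_grasped, and the
# one-way phase switch are hypothetical, not the AsymDex implementation.

class TwoPhaseController:
    """Runs a grasping policy first, then an interaction policy."""

    def __init__(self, grasp_policy, interaction_policy, is_grasped):
        self.grasp_policy = grasp_policy              # acquisition phase
        self.interaction_policy = interaction_policy  # interaction phase
        self.is_grasped = is_grasped                  # obs -> bool
        self.phase = "acquisition"

    def act(self, obs):
        # Switch once the objects are reported as grasped; assumed here
        # to be irreversible within an episode.
        if self.phase == "acquisition" and self.is_grasped(obs):
            self.phase = "interaction"
        if self.phase == "acquisition":
            return self.grasp_policy(obs)
        return self.interaction_policy(obs)
```

Because the two policies are only coupled through the switching condition, a separately trained grasping policy could in principle be swapped in without retraining the interaction policy.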
We evaluate AsymDex on 7 complex bimanual dexterous manipulation tasks: 4 simulation tasks (block in cup, stack, bottle cap, and switch) from the BiDexHands benchmark, and 3 real-world tasks (block in cup, pour, and twist lid) inspired by recent work on bimanual dexterous manipulation. We also compare training results against strong baselines that challenge our key design choices.
Real-world tasks: Block in cup, Pour, Twist lid
Simulation tasks: Block in cup, Stack, Bottle cap, Switch
When used in isolation, neither asymmetry nor relative motion is sufficient. Both asymmetry and relative spaces are necessary for consistent performance.
Sym: A baseline that utilizes neither relative pose nor asymmetry.
Asym-w/o-rel: A baseline that only utilizes asymmetry without relative pose.
Rel-w/o-Asym: A baseline that only utilizes relative pose without asymmetry.
AsymDex: Our framework, which utilizes both asymmetry and relative pose.
This baseline uses a single policy to learn both the grasping and interaction phases for both hands.
This baseline utilizes phase decomposition, but leverages neither asymmetry nor relative motion.
Our method utilizes phase decomposition, and leverages both asymmetry and relative motion.
We deployed only the AsymDex policies on hardware since the baseline policies either resulted in negligible success rates or produced aggressive behaviors that could have damaged our hardware.