(All videos at 1x speed)
Performing in-hand, contact-rich, and long-horizon dexterous manipulation remains an unsolved challenge in robotics. Prior work on hand dexterity has considered each of these three challenges in isolation, but has not combined these skills into a single, complex task. To further test the capabilities of dexterous hands, we propose drumming as a testbed for dexterous manipulation. Drumming naturally integrates all three challenges: it involves in-hand control for stabilizing and adjusting the drumstick with the fingers, contact-rich interaction through repeated striking of the drum surface, and long-horizon coordination when switching between drums and sustaining rhythmic play. Our key insight is to use contact-targeted rewards to address both in-hand contacts (finger-stick) and external contacts (stick-drum). We instantiate this idea with DexDrummer, a dexterous drumming policy learned via reinforcement learning, which leverages minimal hand priors to balance stable in-hand control, and a contact curriculum to mitigate the early limitations imposed by external forces. In simulation, we show that our policy can play two styles of music: multi-drum, bimanual songs and challenging, technical exercises that require increased dexterity. In real-world tasks, we show song performance across a multi-drum setup.
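To make the idea of a contact-targeted reward concrete, here is a minimal sketch that combines an in-hand term (finger-stick) with an external term (stick-drum). It is an illustration under stated assumptions, not the DexDrummer reward: the function name, the inputs, the weights, and the Gaussian shaping are all hypothetical.

```python
import math


def contact_reward(finger_contacts, stick_drum_dist, hit_due,
                   w_grasp=1.0, w_hit=1.0, sigma=0.05):
    """Toy contact-targeted reward (hypothetical, not the paper's formulation).

    finger_contacts: list of booleans, one per finger, True if touching the stick.
    stick_drum_dist: distance in meters from the stick tip to the drum surface.
    hit_due:         True if the song calls for a strike at this timestep.
    """
    # In-hand term: fraction of fingers currently in contact with the stick.
    grasp_term = sum(finger_contacts) / len(finger_contacts)

    # External term: only active when a strike is due; it peaks at 1.0
    # when the stick tip reaches the drum surface.
    hit_term = math.exp(-(stick_drum_dist / sigma) ** 2) if hit_due else 0.0

    return w_grasp * grasp_term + w_hit * hit_term
```

With these assumed defaults, two of four fingers on the stick and the tip touching the drum on a scheduled beat gives 0.5 + 1.0 = 1.5, while a full grasp with no strike due gives 1.0.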
Arm-driven motion struggles with fine-grained control, often resulting in unnatural movements and compounding errors.
Finger-driven control achieves fine-grained control of the stick.
Without contact curriculum, the policy can easily learn to rest on the drum and rely on it to play.
With contact curriculum, the policy learns to control the stick with fingers without interference from external forces, and later becomes able to handle external contact while completing the task.
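One way such a curriculum could be scheduled is sketched below: external contact is switched off during a warm-up phase and then ramped in linearly. The schedule shape, the step counts, and the function name are assumptions made for illustration; the text above does not specify how the curriculum is implemented.

```python
def external_contact_scale(step, warmup_steps=1000, ramp_steps=4000):
    """Hypothetical curriculum schedule for external (stick-drum) contact.

    Returns a factor in [0, 1] that a simulator could use to scale
    stick-drum contact forces: 0 means the drum exerts no force on the
    stick, 1 means full contact.
    """
    # Phase 1: no external contact, so the fingers learn to control the stick.
    if step < warmup_steps:
        return 0.0

    # Phase 2: ramp external contact in linearly, then hold at full strength.
    progress = (step - warmup_steps) / ramp_steps
    return min(1.0, progress)
```

Under these assumed defaults, the factor is 0.0 for the first 1000 steps, 0.5 at step 3000, and 1.0 from step 5000 onward.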
Finger-driven control at 120 beats per minute, where the last three fingers are used to drag the stick downward for the upstroke.
Finger-driven control at 240 beats per minute, where the last three fingers are used to press the stick for a faster downstroke.
The fixed-grasp policy fails to adjust the stick after contact, requiring the arm to compensate with unnatural motions to hit the drum successfully.
The reactive-grasp policy is able to adjust the stick after contact, showing the necessity of reactive, closed-loop dexterous control.