Scaffolder
Scaffolder on pre-existing benchmark
Sensory Scaffolding Suite
Blind Pick
Blind Locomotion
Blind and Deaf Piano
Blind and Numb Pen Rotation
Blind and Numb Cube Rotation
Noisy Monkey Bars
Wrist Camera Pick-and-Place
Occluded Pick-and-Place
Visual Pen Rotation
Visual Cube Rotation
Blind Pick
A two-fingered robot arm must pick up a randomly initialized block using only proprioception (gripper position) and touch sensing. During training, it can use multiple privileged RGB camera inputs (two static cameras and one wrist camera).
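The split described above, between observations available at test time (target) and those available only during training (privileged), can be sketched as a key filter over an observation dictionary. This is a minimal illustration and not the suite's actual interface: the key names, the image sizes, and the `split_observation` helper are all hypothetical.

```python
import numpy as np

# Hypothetical key names; the real environment's observation keys may differ.
TARGET_KEYS = ("gripper_pos", "touch")                            # available at test time
PRIVILEGED_KEYS = ("static_cam_0", "static_cam_1", "wrist_cam")   # training only


def split_observation(obs):
    """Return (target, privileged) views of a single observation dict."""
    target = {k: obs[k] for k in TARGET_KEYS}
    privileged = {k: obs[k] for k in PRIVILEGED_KEYS}
    return target, privileged


# Dummy observation with assumed shapes, just to exercise the helper.
obs = {
    "gripper_pos": np.zeros(3),
    "touch": np.zeros(2),
    "static_cam_0": np.zeros((64, 64, 3), dtype=np.uint8),
    "static_cam_1": np.zeros((64, 64, 3), dtype=np.uint8),
    "wrist_cam": np.zeros((64, 64, 3), dtype=np.uint8),
}
target_obs, privileged_obs = split_observation(obs)
```

Only `target_obs` would be passed to the deployed policy; `privileged_obs` would be consumed by training-time components.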
Scaffolder
Scaffolder learns a spiral search strategy and robustly picks up the block even with perturbations (shown to the left and above).
DreamerV3+BC & Guided Observability
Baselines move erratically and frequently push the block far from the workspace or search in infeasible areas.
Blind Locomotion
A proprioceptive policy must run and jump over randomly sized and positioned hurdles. During training, privileged RGB images of the agent and its nearby hurdles are given, allowing the agent to see obstacles before bumping into them.
Scaffolder
Scaffolder bounds to minimize the risk of hitting obstacles and quickly recovers when it does.
Guided Observability
Guided Observability moves quickly in unobstructed areas but is frequently stopped or slowed by hurdles.
DreamerV3
DreamerV3 moves slowly and is easily caught by obstacles.
Blind and Deaf Piano
A pair of 30-DoF Shadowhands must learn to play “Twinkle Twinkle Little Star” using only proprioception in the RoboPianist simulator. At training time, the policy has access to future notes, piano key presses, and suggested fingerings, emulating vision (to read sheet music) and hearing (to determine which keys were pressed).
Scaffolder
Scaffolder plays an imperfect but recognizable rendition of “Twinkle Twinkle Little Star”.
DreamerV3
DreamerV3's rendition is unrecognizable.
Blind and Numb Pen Rotation
A policy must control a 30-DoF Shadowhand to rotate a pen from a randomized initial orientation to a randomized desired orientation using only proprioception and the initial and target pose of the pen. During training, the policy has access to the object pose and touch sensors.
Scaffolder
Scaffolder quickly achieves the desired pen orientation and maintains stability.
DreamerV3
DreamerV3 briefly achieves the desired pen orientation but struggles to maintain stability.
Blind and Numb Cube Rotation
Similar to Blind and Numb Pen Rotation, a proprioceptive policy must control a 30-DoF Shadowhand to rotate a cube from a randomized initial orientation to a randomized desired orientation.
Scaffolder
Scaffolder quickly rotates the cube to the desired orientation and maintains stability once it reaches it.
Informed Dreamer
Informed Dreamer rotates the cube fairly quickly but struggles to maintain stability.
Noisy Monkey Bars
A 13-link gibbon must swing between a fixed set of handholds in a 2D environment using the brachiation simulator. To simulate imperfect sensors on a robotic platform, noise is added to the target observations, while privileged observations represent true simulator states.
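The observation setup described above, where the target policy sees a noisy reading while the privileged stream carries the true simulator state, can be sketched as follows. The noise distribution is not specified here, so zero-mean Gaussian noise is an assumption for illustration, as are the `make_observations` helper, the `noise_std` value, and the example state vector.

```python
import numpy as np


def make_observations(true_state, noise_std, rng):
    """Return (target_obs, privileged_obs) for one simulator state.

    Assumption: additive zero-mean Gaussian noise stands in for the
    imperfect sensors; the actual noise model may differ.
    """
    true_state = np.asarray(true_state, dtype=np.float64)
    noise = rng.normal(loc=0.0, scale=noise_std, size=true_state.shape)
    target_obs = true_state + noise      # what the deployed policy sees
    privileged_obs = true_state.copy()   # true state, training time only
    return target_obs, privileged_obs


rng = np.random.default_rng(0)
state = np.array([0.5, -1.0, 2.0])  # placeholder state vector
target_obs, privileged_obs = make_observations(state, noise_std=0.1, rng=rng)
```

Passing the random generator in explicitly keeps the corruption reproducible across runs, which makes comparisons between methods easier to control.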
Scaffolder
Guided Observability
DreamerV3
Informed Dreamer
Both Scaffolder and Guided Observability learn efficient, swinging brachiating motions that closely mimic those of a gibbon.
DreamerV3 and Informed Dreamer can swing between the handholds but use inefficient motions that incur large energy costs.
Wrist Camera Pick-and-Place
A proprioceptive policy with an active-perception wrist camera must pick and place a randomly positioned block into a randomly positioned bin. During training, the policy has access to multiple informative, statically placed cameras.
Scaffolder (Wrist Camera View)
Scaffolder (Privileged Camera View)
DreamerV3 (Wrist Camera View)
DreamerV3 (Behind Camera View)
Scaffolder quickly locates the block and goal location with its active-perception wrist camera. DreamerV3 locates the block relatively quickly but struggles to find the goal location.
Occluded Pick-and-Place
Rather than employing active visual perception at test time, this task examines the impact of active perception as privileged sensing at training time. The target policy must use an RGB camera from an occluded viewpoint, alongside proprioception and touch sensing, to pick up a block behind a shelf and place it into a bin. Both block and bin locations are randomly initialized.
Scaffolder
Scaffolder quickly locates the block with touch sensing and places it on the goal.
DreamerV3+BC
DreamerV3+BC locates the block but fails to fully pick it up, and it still attempts to move to the goal location long after dropping the block.
Guided Observability
Guided Observability locates the block but fails to fully place it on the goal on its first attempt, as its picking is less robust.
Visual Pen Rotation
Similar to Blind and Numb Pen Rotation, a 30-DoF Shadowhand must rotate a pen from a randomized initial orientation to a randomized desired orientation. The privileged observations are identical to those of the Blind and Numb rotation tasks, but the target policy additionally has access to a top-down RGB camera.
Scaffolder
As in Blind and Numb Pen Rotation, Scaffolder quickly achieves the desired orientation and maintains stability.
Informed Dreamer
Informed Dreamer initially achieves the desired orientation but struggles to maintain stability.
Visual Cube Rotation
Similar to Visual Pen Rotation, a visual target policy conditioned on a top-down RGB image and proprioception must rotate a cube to a desired orientation.
Scaffolder
DreamerV3+BC
Guided Observability
Both Scaffolder and DreamerV3+BC quickly rotate the cube, while Guided Observability fails to rotate it at all.