Visual imitation learning has achieved impressive progress in learning unimanual manipulation tasks from a small set of visual observations, thanks to the latest advances in computer vision. However, learning bimanual coordination strategies and complex object relations from bimanual visual demonstrations, as well as generalizing them to categorical objects in novel cluttered scenes, remain unsolved challenges. In this paper, we extend our previous work on keypoints-based visual imitation learning (K-VIL) to bimanual manipulation tasks. The proposed Bi-KVIL jointly extracts so-called Hybrid Master-Slave Relationships (HMSR) among objects and hands, bimanual coordination strategies, and sub-symbolic task representations. Our bimanual task representation is object-centric, embodiment-independent, and viewpoint-invariant, and thus generalizes well to categorical objects in novel scenes. We evaluate our approach in various real-world applications, showcasing its ability to learn fine-grained bimanual manipulation tasks from a small number of human demonstration videos.
Geometric Constraints
Comparison to K-VIL
K-VIL (unimanual, single master-slave pair): human demonstration videos of unimanual pouring; extracted task representation including local frame, p2p, and p2c constraints; single MSR; reproduction with KAC.
Bi-KVIL (bimanual, hybrid master-slave graph): human demonstration videos of bimanual pouring; extracted task representation including local frame, p2p, p2c, and pose constraints; hybrid MSR; reproduction with Bi-KAC.
Plsp_1 (6): the plate moves to the initial position of the spoon head while keeping the spoon head right on top of the center of the plate.
Loosely-coupled/Asymmetric, left-dominant
Human demonstration videos
HMSR
Plsp_2 (6): similar to Plsp_1 (6), except that the plates may start from different positions above the table.
Loosely-coupled/Asymmetric, left-dominant
Task reproduction with categorical objects in novel cluttered scenes and with ARMAR-6 robot.
Plsp_3 (6): similar to Plsp_1 (6), except that the spoon may be placed anywhere on the plate. Note that the plate still moves to the initial position of the spoon head in this task.
Loosely-coupled/Asymmetric, left-dominant
Plsp_4 (6): the plate moves to anywhere on the table while keeping the spoon head right on top of the center of the plate.
Loosely-coupled/Asymmetric, left-dominant
Plsp_5 (6): place the spoon head on the center of the plate with only one arm.
Uncoordinated unimanual
Pow (8) with an upright cup
Loosely-coupled/Asymmetric, right-dominant
Human Demonstration Videos
HMSR
Task reproduction and details
The RGB image and correspondence detection of DON
The perceived point cloud and TCP poses
The p2p constraint
The p2c constraint
The execution status of both hands
Pow (8) with a tilted cup
Loosely-coupled/Asymmetric, right-dominant
Pow (8) with a tilted cup
Loosely-coupled/Asymmetric, right-dominant
Pow (8) with the cup taken from a far position
Loosely-coupled/Asymmetric, right-dominant
Pow (8) with multiple cups
Loosely-coupled/Asymmetric, right-dominant
Plsp,pt (6): place the plate right above the center of the tablemat while placing the spoon right above the center of the plate, with the plates and spoons starting from arbitrary positions.
Loosely-coupled/Asymmetric, left-dominant
Plcb,pa (6): Transport the cutting board to the center of the pan while placing the pan at the center of a potmat.
Loosely-coupled/Asymmetric, left-dominant
Plcb,pa (8)
Loosely-coupled/Asymmetric, left-dominant
Plst (6): the serving trays start from an arbitrary position above the table and end at the center of the tablemat.
Tightly-coupled/Symmetric
Plst (6): the serving trays start from an arbitrary position above the table and end at an arbitrary position on the tablemat.
Tightly-coupled/Symmetric
Plsp,ba (6): the left hand places the spoon on the plate while the right hand places the banana on the tablemat. The two arms are not coordinated.
Uncoordinated bimanual
CT (6): The right arm moves the brush, which is constrained by two p2l constraints.
Loosely-coupled/Asymmetric, right-dominant
Demonstrations
HMSR
The 1st p2l constraint (green)
The 2nd p2l constraint (blue)