EC-Diffuser: Multi-Object Manipulation via Entity-Centric Behavior Generation

FrankaKitchen

The agent must complete 4 of 7 possible tasks in a kitchen environment. The official documentation can be found here. We use the goal-conditioned, image-based variant, in which the environment is observed from a single view and the goal is specified by the last image of the demonstration trajectory.

Below, we visualize trajectory rollouts of our method completing all 4 subtasks specified by the goal image.

Goal: {top burner, bottom burner, slide cabinet, hinge cabinet}

Goal: {microwave, bottom burner, kettle, hinge cabinet}

Goal: {microwave, stove light, slide cabinet, bottom burner}

We also visualize the top-16 particles (selected based on transparency) alongside the execution.
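As a rough illustration of what selecting the top-16 particles could look like, here is a minimal sketch. It assumes each particle carries a scalar transparency value and that higher values are ranked first; the `Particle` type and the `top_k_by_transparency` helper are hypothetical names used only for this example, not part of the released code.

```python
from typing import List, NamedTuple


class Particle(NamedTuple):
    """Hypothetical particle record: 2D position plus a transparency score."""
    x: float
    y: float
    transparency: float


def top_k_by_transparency(particles: List[Particle], k: int = 16) -> List[Particle]:
    """Return the k particles with the highest transparency value.

    Assumption: a higher transparency value means the particle should be
    ranked first. Flip `reverse` if the opposite convention applies.
    """
    ranked = sorted(particles, key=lambda p: p.transparency, reverse=True)
    return ranked[:k]


# Usage: 20 dummy particles with transparency 0.00, 0.05, ..., 0.95.
particles = [Particle(x=float(i), y=0.0, transparency=i / 20) for i in range(20)]
selected = top_k_by_transparency(particles, k=16)
```

If fewer than `k` particles are available, the helper simply returns all of them in ranked order.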
