Agents are trained with 3 objects and evaluated on increasing numbers of objects.
Figure panels: Current Observation (Bottom) | Generated Subgoal (Middle) | Goal (Top)
Figure panels: Front View (Left) | Side View (Right)
Figure panels: Current Observation (Bottom) | Goal (Top)