SMORL

Self-supervised Visual Reinforcement Learning with Object-centric Representations

Andrii Zadaianchuk, Maximilian Seitzer, Georg Martius

Max Planck Institute for Intelligent Systems and ETH Zürich

In ICLR 2021 (Spotlight)

Abstract

Autonomous agents need large repertoires of skills to act reasonably on new tasks that they have not seen before. However, acquiring these skills using only a stream of high-dimensional, unstructured, and unlabeled observations is a tricky challenge for any autonomous agent. Previous methods have used variational autoencoders to encode a scene into a low-dimensional vector that can be used as a goal for an agent to discover new skills. Nevertheless, in compositional/multi-object environments it is difficult to disentangle all the factors of variation into such a fixed-length representation of the whole scene. We propose to use object-centric representations as a modular and structured observation space, which is learned with a compositional generative world model. We show that the structure in the representations in combination with goal-conditioned attention policies helps the autonomous agent to discover and learn useful skills. These skills can be further combined to address compositional tasks like the manipulation of several different objects.

Qualitative results on Rearrange and Visual Rearrange environments

Here, we present qualitative results for the Self-supervised Multi-Object RL (SMORL) model. SMORL uses object-centric representations for the structured sampling of new goals in the latent space and learns to achieve these goals with goal-conditioned RL.
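To make the idea of structured goal sampling more concrete, here is a minimal sketch, not the authors' implementation. It assumes each object is described by a hypothetical pair of vectors: "what" for identity and "where" for position. It also assumes that a goal is a single object's latent, and that the reward is the negative distance between the goal position and the position of the scene object whose identity best matches the goal. All names (sample_goal, match_object, goal_reward) and all numbers are illustrative.

```python
import math
import random

# Hypothetical layout: each object is a dict with an identity vector "what"
# and a position vector "where". These field names are assumptions.


def sample_goal(scene, rng):
    """Pick one object from a scene and use a copy of its latent as the goal."""
    return dict(rng.choice(scene))


def match_object(scene, goal):
    """Return the scene object whose identity vector is closest to the goal's."""
    return min(scene, key=lambda obj: math.dist(obj["what"], goal["what"]))


def goal_reward(scene, goal):
    """Negative distance between the matched object's position and the goal position."""
    matched = match_object(scene, goal)
    return -math.dist(matched["where"], goal["where"])


# Toy data: two objects in a goal scene and in the current scene.
goal_scene = [
    {"what": [1.0, 0.0], "where": [0.2, 0.8]},
    {"what": [0.0, 1.0], "where": [0.5, 0.5]},
]
current_scene = [
    {"what": [0.9, 0.1], "where": [0.2, 0.5]},
    {"what": [0.1, 0.9], "where": [0.5, 0.5]},
]

rng = random.Random(0)
goal = sample_goal(goal_scene, rng)
print("sampled goal:", goal)
print("reward:", goal_reward(current_scene, goal))
```

Because the goal refers to one object at a time, a multi-object task can be approached in this sketch by attempting the single-object goals one after another.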

Each video shows several episodes of the learned policy's behaviour (upper part of the video) together with the corresponding test goal images (lower part of the video).

SMORL with ground-truth (GT) representations

smorl_gt_2_objects.avi

2 objects

smorl_gt_3_objects.avi

3 objects

smorl_gt_4_objects.avi

4 objects

SMORL with SCALOR representations

smorl_scalor_1_object.avi

1 object

smorl_scalor_2_objects.mp4

2 objects

