NeurIPS19 paper summary

Symmetry-Based Disentangled Representation Learning requires Interaction with Environments

H. Caselles-Dupré, M. Garcia-Ortiz, D. Filliat.

Flowers Laboratory (INRIA & ENSTA Paris), Softbank Robotics Europe.

Context

Recent papers [1,2] aim at finding a generally accepted formal definition of a disentangled representation for an agent behaving in an environment.

Higgins et al. proposed Symmetry-Based Disentangled Representation Learning [1], a definition based on a characterization of symmetries in the environment using group theory. They focus on transformations that change some properties of the underlying world state, while leaving all other properties invariant.

We study how to learn Symmetry-Based disentangled representations in practice.

Contributions:

- Theoretical and empirical arguments showing that SBDRL cannot be based on fixed data samples alone.

- Guidelines on how to learn SB-disentangled representations in practice.

Definition of Symmetry-Based Disentangled (SBD) representations

SBD-representations focus on transformations that change some properties of the underlying world state while leaving all others invariant.

An SB-representation satisfies the following key identity:

f(g ·_W w) = g ·_Z f(w)

Here g is a transformation, f maps world states w to state representations z, and ·_W and ·_Z are the group actions, i.e. the operators that apply g to w and to z respectively.

This identity states that the learned representation should be equivariant: applying a transformation to the world state and then encoding it gives the same result as encoding the world state and then applying the transformation to the representation.
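The equivariance condition can be checked numerically on a toy world. The sketch below is only an illustration under stated assumptions: the cyclic 1D world with N positions, the encoders `encode_circle` and `encode_raw`, and the helper names are all hypothetical and are not taken from the paper.

```python
import cmath

N = 5  # hypothetical number of positions in a cyclic 1D world


def act_world(k, w):
    """Group action on world states: shift position w by k steps, wrapping around."""
    return (w + k) % N


def act_latent(k, z):
    """Group action on representations: rotate z by an angle of k * 2*pi/N."""
    return cmath.exp(2j * cmath.pi * k / N) * z


def encode_circle(w):
    """Hypothetical encoder that places position w on the unit circle."""
    return cmath.exp(2j * cmath.pi * w / N)


def encode_raw(w):
    """Hypothetical encoder that uses the raw position as the representation."""
    return complex(w, 0.0)


def is_equivariant(encoder, tol=1e-9):
    """Check that encoding after acting equals acting after encoding, for all g and w."""
    return all(
        abs(encoder(act_world(k, w)) - act_latent(k, encoder(w))) < tol
        for k in range(N)
        for w in range(N)
    )


print(is_equivariant(encode_circle))
print(is_equivariant(encode_raw))
```

Under these assumptions the check passes for `encode_circle` and fails for `encode_raw`: the raw position is a perfectly informative representation, but a shift of the world does not act on it as a rotation.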

Using this, Higgins et al. define disentanglement as a decomposition of the latent space Z = Z_1 × ... × Z_n such that each Z_i is left fixed by all transformations except those of a single subgroup, which characterizes that subspace.

Hence, an SBD-representation is a representation that is both SB and disentangled.
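The decomposition can also be written compactly. The block below restates the definition of [1] in the notation of this summary; the subgroups G_i, the component g_i of g in G_i, and the per-subspace actions ·_{Z_i} are symbols introduced here for illustration and do not appear in the text above.

```latex
\begin{align}
  G &= G_1 \times \dots \times G_n, \qquad Z = Z_1 \times \dots \times Z_n, \\
  g \cdot_Z (z_1, \dots, z_n)
    &= (g_1 \cdot_{Z_1} z_1, \; \dots, \; g_n \cdot_{Z_n} z_n),
    \qquad g = (g_1, \dots, g_n) \in G .
\end{align}
```

In words, a transformation that belongs to the subgroup G_i only moves the coordinate z_i and leaves every other z_j unchanged.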

Main result

Our main result, Theorem 1, proves that interaction with environments is necessary to learn SBD-representations (see paper for formal theorems).

Theorem 1: Worlds with different physics can produce the same training set of still observations. It is thus impossible to reliably learn the effect of the symmetries on the world using only still observations.
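The intuition behind Theorem 1 can be illustrated with a toy example. This is not the construction used in the paper's proof: the two cyclic worlds, the one-hot `render` function and every other name below are assumptions made for illustration. World A moves one position per action and world B moves two, yet both visit the same states.

```python
N = 5  # hypothetical number of positions in a cyclic 1D world
ACTIONS = ("left", "right")


def render(w):
    """Still observation of world state w, here a one-hot tuple."""
    return tuple(1 if i == w else 0 for i in range(N))


def make_world(stride):
    """Build a transition function whose actions move by `stride` positions."""
    def step(w, action):
        delta = stride if action == "right" else -stride
        return (w + delta) % N
    return step


def reachable_states(step, start=0):
    """All world states the agent can visit from `start` under `step`."""
    seen, frontier = {start}, [start]
    while frontier:
        w = frontier.pop()
        for action in ACTIONS:
            w_next = step(w, action)
            if w_next not in seen:
                seen.add(w_next)
                frontier.append(w_next)
    return seen


def still_dataset(step):
    """Set of still observations, with no information about actions."""
    return {render(w) for w in reachable_states(step)}


def transition_dataset(step):
    """Set of (observation, action, next observation) triples."""
    return {
        (render(w), action, render(step(w, action)))
        for w in reachable_states(step)
        for action in ACTIONS
    }


world_a = make_world(stride=1)  # physics A
world_b = make_world(stride=2)  # physics B

print(still_dataset(world_a) == still_dataset(world_b))
print(transition_dataset(world_a) == transition_dataset(world_b))
```

In this sketch the two still datasets are identical while the transition datasets differ, so only the transitions reveal which physics generated the data.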

Experiments

Using transitions, how can one learn an SBD-representation in practice?

We provide practical guidelines for learning SBD-representations on a simple environment. There are two options: decoupled or joint learning of the group action and the state representation. Both options use transitions instead of still images, as Theorem 1 requires.
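As a rough illustration of the decoupled option, the sketch below keeps a representation fixed and then fits the group action from transitions alone. It is a deliberately simplified stand-in, not the architecture or training procedure of the paper: the encoder is a hand-written placeholder for a learned one, the world is a hypothetical 1D cyclic grid, and the group action of each action is assumed to be multiplication by a single complex number, fitted in closed form by least squares.

```python
import cmath

N = 5  # hypothetical number of positions in a cyclic 1D world
ACTIONS = {"right": 1, "left": -1}  # assumed effect of each action on the position


def encode(w):
    """Fixed representation (stand-in for a pre-trained encoder)."""
    return cmath.exp(2j * cmath.pi * w / N)


def collect_transitions():
    """All (state, action, next state) triples of the toy world."""
    return [(w, a, (w + d) % N) for w in range(N) for a, d in ACTIONS.items()]


def fit_group_action(transitions):
    """Least-squares fit of one complex multiplier per action.

    For each action a, minimises sum |m_a * z - z_next|^2 over the transitions,
    whose closed-form solution is sum(conj(z) * z_next) / sum(|z|^2).
    """
    num = {a: 0j for a in ACTIONS}
    den = {a: 0.0 for a in ACTIONS}
    for w, a, w_next in transitions:
        z, z_next = encode(w), encode(w_next)
        num[a] += z.conjugate() * z_next
        den[a] += abs(z) ** 2
    return {a: num[a] / den[a] for a in ACTIONS}


learned = fit_group_action(collect_transitions())
for action, multiplier in learned.items():
    # The phase of each multiplier is the rotation angle associated with the action.
    print(action, round(cmath.phase(multiplier), 4))
```

In the joint option the encoder and the multipliers would be optimised together rather than one after the other. The fit above cannot be carried out on still observations, since it needs pairs of consecutive states labelled with the action.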

Left: Environment studied in this paper. Right: Proposed architecture for joint learning of the group action and state representation.

Results

Both approaches are viable for learning SBD-representations.

Left: First option: decoupled learning of SB-disentanglement. Latent traversal spanning from -2 to 2 over each of the representation's dimensions, followed by the predicted effect of the group action associated with each action (left, right, down, up). Right: Second option: joint learning of SB-disentanglement. The representation is complex: latent traversal over the phase of each of the representation's dimensions, followed by the predicted effect of the group action associated with each action (down, left, up, right).

Usefulness of learned representations

Are SBD-representations useful in practice?

We provide an example of how SB-representations can be useful for learning a downstream task.

In the case of learning an inverse model, SBD-representations are more efficient for solving the task.

Mean 10-fold cross-validation accuracy as a function of dataset size and classifier capacity (max depth parameter of the Random Forest).
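One possible intuition for this result: in an SBD-representation each action changes a single subspace, so an inverse model only has to detect which subspace moved and in which direction. The sketch below hard-codes that rule on a hypothetical 2D cyclic grid with a hand-written complex representation. It is not the Random Forest experiment reported here, and all names are illustrative assumptions.

```python
import cmath

N = 5  # hypothetical grid size, cyclic in both directions
ACTIONS = {"right": (1, 0), "left": (-1, 0), "up": (0, 1), "down": (0, -1)}


def encode(state):
    """Hand-written SBD-style representation: one unit complex number per axis."""
    x, y = state
    return (cmath.exp(2j * cmath.pi * x / N), cmath.exp(2j * cmath.pi * y / N))


def step(state, action):
    """Toy world dynamics: move one cell along one axis, wrapping around."""
    dx, dy = ACTIONS[action]
    return ((state[0] + dx) % N, (state[1] + dy) % N)


def inverse_model(z, z_next):
    """Recover the action from two consecutive representations.

    The phase difference is non-zero in exactly one subspace, and its sign
    gives the direction of the move along that axis.
    """
    dphi = [cmath.phase(b * a.conjugate()) for a, b in zip(z, z_next)]
    axis = 0 if abs(dphi[0]) > abs(dphi[1]) else 1
    sign = 1 if dphi[axis] > 0 else -1
    for name, delta in ACTIONS.items():
        if delta[axis] == sign:
            return name


states = [(x, y) for x in range(N) for y in range(N)]
correct = sum(
    inverse_model(encode(s), encode(step(s, a))) == a
    for s in states
    for a in ACTIONS
)
print(correct, "of", len(states) * len(ACTIONS), "transitions recovered")
```

A learned classifier has to discover a rule of this kind from data, which is plausibly easier when the representation already separates the effect of each action.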

Conclusion

SBD-representations are a promising state representation learning paradigm, and should be learned using transitions rather than still samples.

References

[1] : Towards a Definition of Disentangled Representations (I. Higgins et al.)

[2] : Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations (F. Locatello et al.)