g is a transformation, f maps world states w to state representations z, and ·w and ·z are the operators that apply g to w and to z, respectively (also called group actions).
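The relation described in this caption, that applying g in the world and then encoding should match encoding and then applying g in representation space, can be checked numerically. The sketch below is only an illustration under stated assumptions: the cyclic grid of size `N`, the phase encoding used for `f`, and the helper names `act_on_world`, `act_on_repr`, and `is_equivariant` are hypothetical and are not taken from the paper.

```python
import numpy as np

N = 5  # assumed size of a cyclic grid world (hypothetical)

def act_on_world(g, w):
    """Apply translation g = (dx, dy) to world state w = (x, y), wrapping around."""
    dx, dy = g
    x, y = w
    return ((x + dx) % N, (y + dy) % N)

def f(w):
    """Assumed encoder: map each coordinate to a unit complex number (a phase)."""
    return np.exp(2j * np.pi * np.array(w) / N)

def act_on_repr(g, z):
    """Apply g to representation z by rotating the phase of each dimension."""
    return np.exp(2j * np.pi * np.array(g) / N) * z

def is_equivariant(g, w):
    """Check that f(g ·w w) equals g ·z f(w) up to floating-point error."""
    return np.allclose(f(act_on_world(g, w)), act_on_repr(g, f(w)))

# The four elementary moves: right, left, up, down (naming is an assumption).
actions = [(1, 0), (-1, 0), (0, 1), (0, -1)]
all_ok = all(is_equivariant(g, (x, y))
             for g in actions for x in range(N) for y in range(N))
```

In this toy setting each action changes only one dimension of z, which is the kind of structure the diagram is meant to convey.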
Left: Environment studied in this paper. Right: Proposed architecture for joint learning of the group action and state representation.
Left: First option: decoupled learning of SB-disentanglement. Latent traversal spanning from -2 to 2 over each of the representation’s dimensions, followed by the predicted effect of the group action associated with each action (left, right, down, up). Right: Second option: joint learning of SB-disentanglement. The representation is complex-valued: latent traversal over the phase of each of the representation’s dimensions, followed by the predicted effect of the group action associated with each action (down, left, up, right).
Mean 10-fold cross-validation accuracy as a function of dataset size and classifier capacity (the max depth parameter of the Random Forest).
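A sweep of this kind could be sketched with scikit-learn as follows. This is a minimal illustration, not the paper's experimental code: the synthetic data from `make_classification`, the helper name `accuracy_grid`, the number of trees, and the particular dataset sizes and depths are all assumptions chosen to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def accuracy_grid(X, y, dataset_sizes, max_depths, seed=0):
    """Mean 10-fold CV accuracy for each (dataset size, max depth) pair."""
    results = {}
    for n in dataset_sizes:
        for depth in max_depths:
            clf = RandomForestClassifier(
                n_estimators=20, max_depth=depth, random_state=seed
            )
            # cv=10 gives stratified 10-fold cross-validation for classifiers;
            # the default score for a classifier is accuracy.
            scores = cross_val_score(clf, X[:n], y[:n], cv=10)
            results[(n, depth)] = scores.mean()
    return results

# Synthetic stand-in data (assumption); the paper's inputs are not reproduced here.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
grid = accuracy_grid(X, y, dataset_sizes=[100, 400], max_depths=[1, 4])
```

Each entry of `grid` corresponds to one point in a plot of accuracy against dataset size, with one curve per `max_depth` value.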