Regularity as Intrinsic Reward for Free Play

Cansu Sancaktar, Justus Piater and Georg Martius 

RaIR: Regularity as Intrinsic Reward

Optimizing for RaIR with ground-truth (GT) models yields regular and symmetric patterns and constellations!
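To give a feel for what a regularity score might look like, here is a minimal sketch. It assumes regularity is measured as the negative entropy of discretized pairwise offsets between objects; this page does not spell out the definition, so the helper name `rair_score`, the `bin_size` parameter, and the example layouts are illustrative assumptions rather than the authors' implementation.

```python
import math
from collections import Counter
from itertools import permutations


def rair_score(positions, bin_size=1.0):
    """Hypothetical regularity score: negative entropy of binned pairwise offsets.

    positions: list of equal-length coordinate tuples, one per object.
    bin_size:  assumed discretization width for the offsets.
    """
    # Turn every ordered pair of objects into a discrete "symbol":
    # the element-wise offset between them, floored to a grid.
    symbols = [
        tuple(math.floor((a_k - b_k) / bin_size) for a_k, b_k in zip(a, b))
        for a, b in permutations(positions, 2)
    ]
    counts = Counter(symbols)
    total = len(symbols)
    # Shannon entropy of the symbol distribution (natural log).
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Fewer distinct, more frequently repeated offsets -> lower entropy -> higher score.
    return -entropy


# Evenly spaced objects on a line share many identical offsets.
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
# An irregular layout in which every pairwise offset is distinct.
scattered = [(0.0, 0.0), (1.3, 2.1), (4.2, 0.7), (2.6, 5.4)]

print(rair_score(line) > rair_score(scattered))
```

Under this assumed definition, the evenly spaced line scores higher than the scattered layout because its offsets repeat, which is the kind of redundancy that shows up in the symmetric patterns here.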

RaIR + CEE-US: During free play, we combine the RaIR reward with ensemble disagreement.
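One way such a combination could look is sketched below. It assumes ensemble disagreement is the per-dimension variance of next-state predictions across ensemble members, and that the two terms are added with a scalar weight; the function names, the `regularity_weight` parameter, and its default value are hypothetical and not taken from this page.

```python
from statistics import pvariance


def ensemble_disagreement(predictions):
    """Sum over state dimensions of the variance across ensemble members.

    predictions: list of predicted next states, one list of floats per member.
    """
    # zip(*predictions) groups the predictions for each state dimension together.
    return sum(pvariance(dim_values) for dim_values in zip(*predictions))


def free_play_reward(predictions, regularity, regularity_weight=0.1):
    """Assumed combined reward: disagreement plus a weighted regularity term."""
    return ensemble_disagreement(predictions) + regularity_weight * regularity


# Three ensemble members that agree exactly: no disagreement bonus.
agreeing = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
# Two members that differ in the first state dimension.
disagreeing = [[0.0, 0.0], [2.0, 0.0]]

print(ensemble_disagreement(agreeing))
print(ensemble_disagreement(disagreeing))
```

In a sketch like this, the weight trades off seeking states the model is uncertain about against steering toward regular configurations.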

Free Play with RaIR + CEE-US in Construction Environment

Some snapshots from free play

Iteration 271

Iteration 285

Iteration 292

Iteration 295

Patterns generated in ShapeGridworld by optimizing for RaIR with color included as a symbol

Poses generated by optimizing for RaIR in locomotion environments with GT models

Quadruped

Walker

Free Play in Quadruped with RaIR + CEE-US

Free play iteration 110

Free play iteration 150

Free play iteration 165

Free play iteration 172

Examples for Zero-shot Downstream Task Performance 

(Videos show RaIR + CEE-US solving the tasks)

Balance Front

Stand Rotated

Attack

Balance Back

Free Play in Custom Construction (diverse shapes) with RaIR + CEE-US

Free play iteration 150

Free play iteration 175

Free play iteration 180

Free play iteration 190

Examples for Zero-shot Downstream Task Performance 

(Videos show RaIR + CEE-US solving the tasks)

Stack Cube + Ball

Stack Cube + Flat Block