Two agents are coupled within the robot and control it jointly: the red and blue agents control the three joints of the back and front legs, respectively.
Unlike in the mixed-incentive and competitive settings, influencing peer learning does not help much in cooperative settings, and Meta-MAPG performs similarly to Meta-PG.
Second, Meta-PG and Meta-MAPG outperform the other approaches, LOLA-DiCE and REINFORCE, achieving higher rewards when interacting with a new teammate.