Cartpole Balance Task Evaluation Environment Env2
The non-robust agent is unable to balance the pole.
Cartpole Balance Task Evaluation Environment Env2
The robust agent successfully balances a pole of a length it has never seen before.
Walker Walk Task Evaluation Environment Env2
This agent is unstable: although it succeeds in achieving a walking gait, it very quickly falls to the ground.
Walker Walk Task Evaluation Environment Env2
Note that the agent learns to drag its leg due to the change in quadriceps length. This behaviour prevents the agent from falling, is very stable, and results in improved performance compared to the non-robust agent.
Here, the agent struggles to stand up and displays behaviour similar to that of E-MPO, albeit slightly more robust. However, it learns a significantly different policy from that of RE-MPO.
Cheetah Task Evaluation Environment Env2
The Cheetah learns an aggressive and unstable running policy, which causes it to fall to the ground.
Cheetah Task Evaluation Environment Env2
The Cheetah learns a running policy that prevents it from falling over.
Shadowhand Orientation Task Evaluation Environment Env2
The Shadowhand attempts to orient the cube, but does not know how to manipulate a cube that is smaller than the one on which it was trained.
Shadowhand Orientation Task Evaluation Environment Env2
The Shadowhand manages to orient the cube into the correct position.