Videos

The following videos show the trained MB-MPO policies in action, together with a side-by-side comparison against converged TRPO policies. The embedded clips of the running creatures are slowed down by a factor of 5 relative to real time to give viewers more time to analyze the locomotion behavior. For the simulated PR2 robot, the clips are slowed down by a factor of 2.