Ron Vainshtein, Zohar Rimon, Shie Mannor, Chen Tessler
Technion - Israel Institute of Technology
NVIDIA, Israel
Below, we show the behavior of Masked Mimic adapted to different downstream tasks using Task Tokens
The agent walks in a randomly chosen direction
The agent walks in a randomly chosen direction while facing a separately chosen random direction
The agent reaches for a randomly placed goal with the right hand
The agent strikes a target at a random location
The agent maximizes the jump distance from a jump location
Below, we show the behavior of directly optimizing PPO for the downstream tasks
From left to right, gravity is set to: -14, -16, -18, -20
From left to right, ground friction is set to: -0.5, -0.4, -0.3, -0.2