Variance Reduction for Reinforcement Learning in Input-Driven Environments
[Figure: three panels (Walker2d with random wind; HalfCheetah on floating tiles; 7-DoF arm tracking moving object), each comparing TRPO with meta baseline, TRPO with 10 value networks, and TRPO with standard value network.]