RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning