Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control