The Mirage of Action-Dependent Baselines in Reinforcement Learning

George Tucker (gjt@google.com), Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine