Shixiang (Shane) Gu, 顾世翔

I am a Research Scientist at Google Brain, where I mainly work on problems in deep learning, reinforcement learning, robotics, and probabilistic machine learning. My recent research focuses on scalable RL methods that could solve difficult continuous control problems in the real world; this work has been covered by the Google Research blog and MIT Technology Review.

I completed my PhD in Machine Learning at the University of Cambridge and the Max Planck Institute for Intelligent Systems in Tübingen, where I was co-supervised by Richard E. Turner, Zoubin Ghahramani, and Bernhard Schölkopf. During my PhD, I also interned and collaborated closely with Sergey Levine/Ilya Sutskever at UC Berkeley/Google Brain and Timothy Lillicrap at DeepMind. I hold a B.A.Sc. in Engineering Science from the University of Toronto, where I did my thesis with Geoffrey Hinton on distributed training of neural networks using evolutionary algorithms. I also had a great time working with Steve Mann, developing real-time HDR capture for wearable cameras/displays. I volunteer as a Lab Scientist at Creative Destruction Lab, one of the leading tech-startup incubators in Canada. My PhD was funded by the Cambridge-Tübingen PhD Fellowship, NSERC, and a Google Focused Research Award.

I am a Japan-born Chinese Canadian, and I speak, read, and write in three languages. Having lived in Japan, China, Canada, the US, the UK, and Germany, I go under multiple names: Shane Gu, Shixiang Gu, 顾世翔, 顧世翔(ぐう せいしょう).


  1. Sept 8, 2019: 2 papers accepted at CoRL 2019 in Osaka, Japan!
  2. Sept 3, 2019: 2 papers accepted at NeurIPS 2019 in Vancouver, Canada!

Research Interests

Deep Reinforcement Learning & Robotics

  • Sample efficiency, stability, scalability
  • Unsupervised, self-supervised, continual learning
  • Dexterity and empowerment

Machine Learning & Deep Learning

  • Latent variable models
  • Sequence prediction


  1. Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman. "Dynamics-Aware Unsupervised Discovery of Skills". [Arxiv] [Videos]
  2. Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard. "Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog". [Arxiv]

Conference Publications

  1. Seyed Kamyar Seyed Ghasemipour, Richard Zemel, Shixiang Gu. "A Divergence Minimization Perspective on Imitation Learning Methods". CoRL 2019. [Arxiv soon]
  2. Ofir Nachum, Michael Ahn, Hugo Ponte, Shixiang Gu, Vikash Kumar. "Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real". CoRL 2019. [Arxiv soon]
  3. Seyed Kamyar Seyed Ghasemipour, Shixiang Gu, Richard Zemel. "SMILe: Scalable Meta Inverse Reinforcement Learning through Context-Conditional Policies". NeurIPS 2019. [Arxiv soon]
  4. Yiding Jiang, Shixiang Gu, Kevin Murphy, Chelsea Finn. "Language as an Abstraction for Hierarchical Deep Reinforcement Learning". NeurIPS 2019. [Arxiv] [Videos]
  5. Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine. "Near-Optimal Representation Learning for Hierarchical Reinforcement Learning". ICLR 2019. [Arxiv] [Videos]
  6. George Tucker, Dieterich Lawson, Shixiang Gu, Chris J. Maddison. "Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives". ICLR 2019. [Arxiv]
  7. Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine. "Data-Efficient Hierarchical Reinforcement Learning". NeurIPS 2018. [Arxiv] [Videos]
  8. George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine. "The Mirage of Action-Dependent Baselines in Reinforcement Learning". ICML 2018. [Arxiv]
  9. Vitchyr Pong*, Shixiang Gu*, Murtaza Dalal, Sergey Levine. "Temporal Difference Models: Model-Free Deep RL for Model-Based Control". ICLR 2018. [Paper] [Arxiv] *equal contribution
  10. Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine. "Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning". ICLR 2018. [Paper] [Arxiv] [Videos]
  11. Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine. "Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning". NIPS 2017. [Paper]
  12. Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck. "Sequence Tutor: Conservative fine-tuning of sequence generation models with KL-control". ICML 2017. [Paper]
  13. Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. "Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic". ICLR 2017 [Oral, ~3%]. [Paper]
  14. Eric Jang, Shixiang Gu, Ben Poole. "Categorical Reparameterization with Gumbel-Softmax". ICLR 2017. [Paper]
  15. Shixiang Gu*, Ethan Holly*, Timothy Lillicrap, Sergey Levine. "Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates". ICRA 2017. [Paper] [Google Blogpost] [MIT Technology Review] [ZDNet] [Video] *equal contribution
  16. Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine. "Continuous Deep Q-Learning with Model-based Acceleration". ICML 2016. [Paper] [Arxiv]
  17. Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih. "MuProp: Unbiased Backpropagation for Stochastic Neural Networks". ICLR 2016. [Paper]
  18. Shixiang Gu, Zoubin Ghahramani, Richard E. Turner. "Neural Adaptive Sequential Monte Carlo". NIPS 2015. [Paper] [Arxiv] [Supplementary]
  19. Nilesh Tripuraneni*, Shixiang Gu*, Hong Ge, Zoubin Ghahramani. "Particle Gibbs for Infinite Hidden Markov Models". NIPS 2015. [Paper] [Arxiv] *equal contribution
  20. Steve Mann, Raymond Chun Hing Lo, Kalin Ovtcharov, Shixiang Gu, David Dai, Calvin Ngan, Tao Ai. "Realtime HDR (High Dynamic Range) Video for EyeTap Wearable Computers, FPGA-Based Seeing Aids, and GlassEyes", IEEE CCECE 2012, Montreal, 2012 April 29 to May 2. 6 pages, to be indexed in IEEE Xplore. ACM SIGGRAPH 2012, Emerging Technologies Exhibition. [Paper] [BibTex] [Video]

Workshop Papers

  1. Abhishek Bhatia, Jaan Altosaar, Shixiang Gu. "Proximity-Constrained Reinforcement Learning". Advances in Approximate Bayesian Inference. NIPS 2019. [Paper]
  2. Brandon Amos, Shixiang Gu, J. Zico Kolter. "End-to-End Model-Predictive Control". Deep Reinforcement Learning Symposium, NIPS 2017. [Paper]
  3. Chen Wang, Xiangyu Chen, Zelin Ye, Jialu Wang, Ziruo Cai, Shixiang Gu, Cewu Lu. "TRL: Discriminative Hints for Scalable Reverse Curriculum Learning". Deep Reinforcement Learning Symposium, NIPS 2017. [Paper] [Videos]
  4. Vitchyr Pong, Shixiang Gu, Sergey Levine. "Learning Long-term Dependencies with Deep Memory States". Lifelong Learning: A Reinforcement Learning Approach Workshop, ICML 2017. [Paper]
  5. Shixiang Gu, Luca Rigazio. "Toward Deep Neural Network Architectures Robust to Adversarial Examples". ICLR 2015 Workshop. [Paper]

Invited Talks

  1. Shixiang Gu. "Sample-Efficient Deep RL for Robotics". Vector Institute, Canada, 2017.
  2. Shixiang Gu. "Sample-Efficient Deep RL for Robotics". Hosted by Xiaoou Tang and Xiaogang Wang. CUHK, Hong Kong, China, 2017.
  3. Shixiang Gu. "Sample-Efficient Deep RL for Robotics". Hosted by Masashi Sugiyama and Matsuo Yutaka. University of Tokyo, Japan, 2016.
  4. Timothy Lillicrap, Shixiang Gu. "Deep RL methods in Robotics". Reinforcement Learning Forum. Google, USA, 2016.
  5. Shixiang Gu. "Generalized Backprop, Neural Particle Filter, and Guided Q-Learning". Hosted by Pieter Abbeel. UC Berkeley, USA, 2015.
  6. Shixiang Gu. "Algorithms for Training Deep Stochastic Neural Networks". Hosted by Noah Goodman. Stanford University, USA, 2015.
  7. Shixiang Gu, Andrey Malinin. "Long Short-Term Memory Networks". Machine Learning RCC. Cambridge University, UK, 2015.

Academic Activities

  1. Program Committee for NIPS Deep Reinforcement Learning Symposium, 2017.
  2. Reviewer for International Conference on Robotics and Automation (ICRA), 2018.
  3. Reviewer for Neural Computation (NECO), 2018.
  4. Reviewer for Conference on Robot Learning (CoRL), 2017.
  5. Reviewer for AISTATS 2018.
  6. Reviewer for Neural Information Processing Systems (NIPS), 2017.
  7. Reviewer for International Conference on Machine Learning (ICML), 2017.
  8. Reviewer for International Conference on Learning Representations (ICLR), 2016, 2017.
  9. Reviewer for IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.
  10. Reviewer for IEEE Conference on Decision and Control (CDC), 2017.


Email: <my-english-first-name-><my-last-name> at <the-company-i-work-at> dot com