Shixiang (Shane) Gu, 顾世翔

I am a Research Scientist at Google Brain, where I work on problems in deep learning, reinforcement learning, robotics, and probabilistic machine learning. My recent research focuses on sample-efficient RL methods that can scale to difficult real-world continuous control problems; this work has been covered by the Google Research Blog and MIT Technology Review.

I completed my PhD in Machine Learning at the University of Cambridge and the Max Planck Institute for Intelligent Systems in Tübingen, co-supervised by Richard E. Turner, Zoubin Ghahramani, and Bernhard Schölkopf. During my PhD, I also collaborated closely with Sergey Levine at UC Berkeley/Google Brain and Timothy Lillicrap at DeepMind. I hold a B.A.Sc. in Engineering Science from the University of Toronto, where I did my thesis with Geoffrey Hinton on distributed training of neural networks using evolutionary algorithms. I also had a great time working with Steve Mann, developing real-time HDR capture for wearable cameras/displays. I interned at Google Brain, hosted by Ilya Sutskever and Vincent Vanhoucke, and volunteered as a Lab Scientist at Creative Destruction Lab, one of the leading tech-startup incubators in Canada. My PhD was funded by the Cambridge-Tübingen PhD Fellowship, NSERC, and a Google Focused Research Award.

I am a Japan-born Chinese Canadian, and I speak, read, and write three languages. Having lived in Japan, China, Canada, the US, the UK, and Germany, I go by multiple names: Shane Gu, Shixiang Gu, 顾世翔, 顧世翔 (ぐう せいしょう). My Chinese given name combines “world” (世) and “flying” (翔), so, fittingly, I have flown around the world.

Research Interests

Deep Reinforcement Learning & Robotics

  • Sample efficiency
  • Stability
  • Scalability

Deep Probabilistic Models

  • Learning in discrete variable models
  • Scalable amortized inference for probabilistic programs
  • Feature learning in sequence models

Publications

Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine. "Near-Optimal Representation Learning for Hierarchical Reinforcement Learning". [Arxiv] [Videos]

George Tucker, Dieterich Lawson, Shixiang Gu, Chris J. Maddison. "Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives". [Arxiv]



Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine. "Data-Efficient Hierarchical Reinforcement Learning". NIPS 2018. [Arxiv] [Videos]

George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine. "The Mirage of Action-Dependent Baselines in Reinforcement Learning". ICML 2018. [Arxiv]

Vitchyr Pong*, Shixiang Gu*, Murtaza Dalal, Sergey Levine. "Temporal Difference Models: Model-Free Deep RL for Model-Based Control". ICLR 2018. *equal contribution [Paper] [Arxiv]

Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine. "Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning". ICLR 2018. [Paper] [Arxiv] [Videos]

Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine. “Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning”. NIPS 2017. [Paper] [Arxiv] [Code]

Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck. “Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control”. ICML 2017. [Paper] [Arxiv] [MIT Technology Review] [Video]

Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. “Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic”. ICLR 2017 [Oral, ~3%]. [Paper] [Code]

Eric Jang, Shixiang Gu, Ben Poole. “Categorical Reparameterization with Gumbel-Softmax”. ICLR 2017. [Paper]

Shixiang Gu*, Ethan Holly*, Timothy Lillicrap, Sergey Levine. “Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates”. ICRA 2017. *equal contribution [Paper] [Arxiv] [Google Blogpost] [MIT Technology Review] [ZDNet] [Video]

Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine. “Continuous Deep Q-Learning with Model-based Acceleration”. ICML 2016. [Paper] [Arxiv]

Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih. “MuProp: Unbiased Backpropagation for Stochastic Neural Networks”. ICLR 2016. [Paper] [Arxiv]

Shixiang Gu, Zoubin Ghahramani, Richard E. Turner. “Neural Adaptive Sequential Monte Carlo”. NIPS 2015. [Paper] [Arxiv][Supplementary]

Nilesh Tripuraneni*, Shixiang Gu*, Hong Ge, Zoubin Ghahramani. “Particle Gibbs for Infinite Hidden Markov Models”. NIPS 2015. *equal contribution [Paper] [Arxiv]

Steve Mann, Raymond Chun Hing Lo, Kalin Ovtcharov, Shixiang Gu, David Dai, Calvin Ngan, Tao Ai. “Realtime HDR (High Dynamic Range) Video for EyeTap Wearable Computers, FPGA-Based Seeing Aids, and GlassEyes”. IEEE CCECE 2012, Montreal, April 29–May 2, 2012; indexed in IEEE Xplore. Also presented at the ACM SIGGRAPH 2012 Emerging Technologies Exhibition. [Paper] [BibTex] [Video]

Workshop Papers

Chen Wang, Xiangyu Chen, Zelin Ye, Jialu Wang, Ziruo Cai, Shixiang Gu, Cewu Lu. "TRL: Discriminative Hints for Scalable Reverse Curriculum Learning". NIPS 2017 Deep Reinforcement Learning Symposium. [Paper] [Videos]

Shixiang Gu, Luca Rigazio. “Toward Deep Neural Network Architectures Robust to Adversarial Examples”. ICLR 2015 Workshop. [Paper]

Invited Talks

  1. Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Vector Institute, Canada, 2017.
  2. Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Hosted by Xiaoou Tang and Xiaogang Wang. CUHK, Hong Kong, China, 2017.
  3. Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Hosted by Masashi Sugiyama and Yutaka Matsuo. University of Tokyo, Japan, 2016.
  4. Timothy Lillicrap, Shixiang Gu. “Deep RL methods in Robotics”. Reinforcement Learning Forum. Google, USA, 2016.
  5. Shixiang Gu. “Generalized Backprop, Neural Particle Filter, and Guided Q-Learning”. Hosted by Pieter Abbeel. UC Berkeley, USA, 2015.
  6. Shixiang Gu. “Algorithms for Training Deep Stochastic Neural Networks”. Hosted by Noah Goodman. Stanford University, USA, 2015.
  7. Shixiang Gu, Andrey Malinin. “Long Short-Term Memory Networks”. Machine Learning RCC. Cambridge University, UK, 2015.

Academic Activities

  1. Program Committee for NIPS Deep Reinforcement Learning Symposium, 2017.
  2. Reviewer for International Conference on Robotics and Automation (ICRA), 2018.
  3. Reviewer for Neural Computation (NECO), 2018.
  4. Reviewer for Conference on Robot Learning (CoRL), 2017.
  5. Reviewer for International Conference on Artificial Intelligence and Statistics (AISTATS), 2018.
  6. Reviewer for Neural Information Processing Systems (NIPS), 2017.
  7. Reviewer for International Conference on Machine Learning (ICML), 2017.
  8. Reviewer for International Conference on Learning Representations (ICLR), 2016, 2017.
  9. Reviewer for IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.
  10. Reviewer for IEEE Conference on Decision and Control (CDC), 2017.


Email: <my-initials>717 at <first-three-char-of-cambridge> dot ac dot uk

Mail: Office BE4-40, Cambridge University Engineering Department, Trumpington Street, Cambridge CB2 1PZ, UK