Shixiang Shane Gu
Google Scholar Twitter Profile
Shixiang Shane Gu is a Research Scientist at Google DeepMind. Previously, he was a researcher on the ChatGPT team at OpenAI, a Research Scientist at Google Research, Brain Team, and a Visiting Associate Professor (Adjunct Professor) at the University of Tokyo, researching deep learning, reinforcement learning, probabilistic machine learning, and robotics. Shane holds a PhD in Machine Learning from the University of Cambridge and the Max Planck Institute for Intelligent Systems, supervised by Richard E. Turner, Zoubin Ghahramani, and Bernhard Schölkopf. He holds a B.ASc. in Engineering Science from the University of Toronto, where his thesis advisor was Geoffrey E. Hinton. Shane was also a visiting scholar at the Department of Computer Science at Stanford University, hosted by Emma Brunskill. His academic work received the Best Paper Award at CoRL 2019, a Google Focused Research Award, the Cambridge-Tübingen PhD Fellowship, and an NSERC Scholarship, and was featured on the Google Research Blog and in MIT Technology Review. Shane is a Japan-born Chinese Canadian, and he speaks, reads, and writes in three languages.
News
Sept 16, 2022: 2 Papers accepted at NeurIPS 2022!
May 14, 2022: 2 Papers accepted at ICML 2022!
Jan 30, 2022: 1 Paper accepted as a Spotlight (top ~5% of 3391 submissions) at ICLR 2022!
Nov 30, 2021: 3 computer vision papers (VaxNeRF, Tool-As-Embodiment, Amortized Prompt) posted on arXiv!
Oct 2, 2021: 2 Papers accepted at NeurIPS 2021! (1 Spotlight paper, top 3% of 9122 submissions)
May 2021: 3 Papers accepted at ICML 2021!
Jan 13, 2021: 1 Paper accepted at ICLR 2021!
Nov 1, 2020: 1 Paper accepted at NeurIPS 2020!
Sept 15, 2020: 1 Paper accepted at EMNLP 2020!
May 1, 2020: 1 Paper accepted at RSS 2020!
Dec 19, 2019: 1 Paper accepted as Oral at ICLR 2020!
Dec 3, 2019: My talk listed as one of "30 Influential AI Presentations from 2019" by Re:Work
Nov 1, 2019: Our paper on analyzing imitation learning algorithms received Best Paper Award at CoRL 2019!
Sept 8, 2019: 2 Papers accepted as Orals at CoRL 2019 in Osaka, Japan!
Sept 3, 2019: 2 Papers accepted at NeurIPS 2019 in Vancouver, Canada!
Pre-prints
Machel Reid, Yutaro Yamada, Shixiang Shane Gu. "Can Wikipedia Help Offline Reinforcement Learning?". [Arxiv] [Github]
Naruya Kondo, Yuya Ikeda, Andrea Tagliasacchi, Yutaka Matsuo, Yoichi Ochiai, Shixiang Shane Gu. "VaxNeRF: Revisiting the Classic for Voxel-Accelerated Neural Radiance Field". [Arxiv] [Github]
Yuki Noguchi, Tatsuya Matsushima, Yutaka Matsuo, Shixiang Shane Gu. "Tool as Embodiment for Recursive Manipulation". [Arxiv] [Project Page] [Github]
Xin Zhang, Yusuke Iwasawa, Yutaka Matsuo, Shixiang Shane Gu. "Amortized Prompt: Lightweight Fine-Tuning for CLIP in Domain Generalization". [Arxiv]
Shixiang Shane Gu, Manfred Diaz, C. Daniel Freeman, Hiroki Furuta, Seyed Kamyar Seyed Ghasemipour, Anton Raichuk, Byron David, Erik Frey, Erwin Coumans, Olivier Bachem. "Braxlines: Fast and Interactive Toolkit for RL-driven Behavior Generation Beyond Reward Maximization". [Arxiv] [Github]
Conference Publications
Ruibo Liu, Jason Wei, Shixiang Shane Gu, Te-Yen Wu, Soroush Vosoughi, Claire Cui, Denny Zhou, Andrew M. Dai. "Mind's Eye: Grounded Language Model Reasoning through Simulation". ICLR 2023. [Openreview]
Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa. "Large Language Models are Zero-Shot Reasoners". NeurIPS 2022. [Arxiv]
Seyed Kamyar Seyed Ghasemipour, Shixiang Shane Gu, Ofir Nachum. "Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters." NeurIPS 2022. [Arxiv]
Scott Fujimoto, David Meger, Doina Precup, Ofir Nachum, Shixiang Shane Gu. "Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error". ICML 2022. [Arxiv]
Seyed Kamyar Seyed Ghasemipour, Daniel Freeman, Byron David, Shixiang Shane Gu, Satoshi Kataoka, Igor Mordatch. "Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning". ICML 2022. [Arxiv]
Hiroki Furuta, Yutaka Matsuo, Shixiang Shane Gu. "Generalized Decision Transformer for Offline Hindsight Information Matching". ICLR 2022 [Spotlight, 5.2% of 3391 submissions]. [Arxiv] [Project Page] [Github]
Scott Fujimoto, Shixiang Shane Gu. "A Minimalist Approach to Offline Reinforcement Learning". NeurIPS 2021 [Spotlight, 3% of 9122 submissions]. [Arxiv] [Code]
Hiroki Furuta, Tadashi Kozuno, Tatsuya Matsushima, Yutaka Matsuo, Shixiang Shane Gu. "Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning". NeurIPS 2021. [Arxiv]
Hiroki Furuta, Tatsuya Matsushima, Tadashi Kozuno, Yutaka Matsuo, Sergey Levine, Ofir Nachum, Shixiang Shane Gu. "Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning". ICML 2021 (also a contributed talk at ICLR 2021 Workshop on Never-Ending RL). [Arxiv]
Jongwook Choi, Archit Sharma, Honglak Lee, Sergey Levine, Shixiang Shane Gu. "Variational Empowerment as Representation Learning for Goal-Based Reinforcement Learning". ICML 2021. [Arxiv]
Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Shane Gu. "EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL". ICML 2021. [Arxiv]
Tatsuya Matsushima*, Hiroki Furuta*, Yutaka Matsuo, Ofir Nachum, Shixiang Gu. "Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization". ICLR 2021. *equal contribution [Arxiv]
Lisa Lee, Benjamin Eysenbach, Ruslan Salakhutdinov, Shixiang Shane Gu, Chelsea Finn. "Weakly-Supervised Reinforcement Learning for Controllable Behavior". NeurIPS 2020. [Arxiv]
Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Shane Gu, Rosalind Picard. "Human-centric dialog training via offline reinforcement learning". EMNLP 2020. [Arxiv]
Archit Sharma, Michael Ahn, Sergey Levine, Vikash Kumar, Karol Hausman, Shixiang Gu. "Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning". RSS 2020. [Arxiv] [Videos]
Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman. "Dynamics-Aware Unsupervised Discovery of Skills". ICLR 2020 [Oral, 1.8% of 2594 submissions]. [Arxiv] [Videos]
Seyed Kamyar Seyed Ghasemipour, Richard Zemel, Shixiang Gu. "A Divergence Minimization Perspective on Imitation Learning Methods". CoRL 2019 [Best Paper Award, 0.25% of 398 submissions]. [Arxiv] [Videos]
Ofir Nachum, Michael Ahn, Hugo Ponte, Shixiang Gu, Vikash Kumar. "Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real". CoRL 2019 [Oral]. [Arxiv]
Seyed Kamyar Seyed Ghasemipour, Shixiang Gu, Richard Zemel. "SMILe: Scalable Meta Inverse Reinforcement Learning through Context-Conditional Policies". NeurIPS 2019. [PDF]
Yiding Jiang, Shixiang Gu, Kevin Murphy, Chelsea Finn. "Language as an Abstraction for Hierarchical Deep Reinforcement Learning". NeurIPS 2019. [Arxiv] [Videos]
Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine. "Near-Optimal Representation Learning for Hierarchical Reinforcement Learning". ICLR 2019. [Arxiv] [Videos]
George Tucker, Dieterich Lawson, Shixiang Gu, Chris J. Maddison. "Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives". ICLR 2019. [Arxiv]
Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine. "Data-Efficient Hierarchical Reinforcement Learning". NeurIPS 2018. [Arxiv] [Videos]
George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine. "The Mirage of Action-Dependent Baselines in Reinforcement Learning". ICML 2018. [Arxiv]
Vitchyr Pong*, Shixiang Gu*, Murtaza Dalal, Sergey Levine. "Temporal Difference Models: Model-Free Deep RL for Model-Based Control". ICLR 2018. *equal contribution [Paper] [Arxiv] [Blogpost]
Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine. "Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning". ICLR 2018. [Paper] [Arxiv] [Videos]
Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine. "Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning". NIPS 2017. [Paper]
Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck. "Sequence Tutor: Conservative fine-tuning of sequence generation models with KL-control". ICML 2017. [Paper]
Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. "Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic". ICLR 2017 [Oral, ~3%]. [Paper]
Eric Jang, Shixiang Gu, Ben Poole. "Categorical Reparameterization with Gumbel-Softmax". ICLR 2017. [Paper]
Shixiang Gu*, Ethan Holly*, Timothy Lillicrap, Sergey Levine. "Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates". ICRA 2017. [Paper] [Google Blogpost] [MIT Technology Review] [ZDNet] [Video] *equal contribution
Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine. "Continuous Deep Q-Learning with Model-based Acceleration". ICML 2016. [Paper] [Arxiv]
Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih. "MuProp: Unbiased Backpropagation for Stochastic Neural Networks". ICLR 2016. [Paper]
Shixiang Gu, Zoubin Ghahramani, Richard E. Turner. "Neural Adaptive Sequential Monte Carlo". NIPS 2015. [Paper] [Arxiv] [Supplementary]
Nilesh Tripuraneni*, Shixiang Gu*, Hong Ge, Zoubin Ghahramani. "Particle Gibbs for Infinite Hidden Markov Models". NIPS 2015. [Paper] [Arxiv] *equal contribution
Steve Mann, Raymond Chun Hing Lo, Kalin Ovtcharov, Shixiang Gu, David Dai, Calvin Ngan, Tao Ai. "Realtime HDR (High Dynamic Range) Video for EyeTap Wearable Computers, FPGA-Based Seeing Aids, and GlassEyes". IEEE CCECE 2012, Montreal, April 29 – May 2, 2012; indexed in IEEE Xplore. Also presented at the ACM SIGGRAPH 2012 Emerging Technologies Exhibition. [Paper] [BibTex] [Video]
Workshop Papers
Ofir Nachum, Haoran Tang, Xingyu Lu, Shixiang Gu, Honglak Lee, Sergey Levine. "Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?". [Arxiv]
Abhishek Bhatia, Jaan Altosaar, Shixiang Gu. "Proximity-Constrained Reinforcement Learning". Advances in Approximate Bayesian Inference. NIPS 2017. [Paper]
Brandon Amos, Shixiang Gu, J. Zico Kolter. "End-to-End Model-Predictive Control". Deep Reinforcement Learning Symposium, NIPS 2017. [Paper]
Chen Wang, Xiangyu Chen, Zelin Ye, Jialu Wang, Ziruo Cai, Shixiang Gu, Cewu Lu. "TRL: Discriminative Hints for Scalable Reverse Curriculum Learning". Deep Reinforcement Learning Symposium, NIPS 2017. [Paper] [Videos]
Vitchyr Pong, Shixiang Gu, Sergey Levine. "Learning Long-term Dependencies with Deep Memory States". Lifelong Learning: A Reinforcement Learning Approach Workshop, ICML 2017. [Paper]
Shixiang Gu, Luca Rigazio. “Toward Deep Neural Network Architectures Robust to Adversarial Examples”. ICLR 2015 Workshop. [Paper]
Invited Talks
Shixiang Shane Gu. "Continual Learning Machine". SEMICON Japan 2021 (26,626 in-person attendees).
Shixiang Gu. "Mutual Information in Deep RL: Toward a Single Tractable Reward Function for General Intelligence". Qingyuan Salon (青源沙龙) by Beijing Academy of Artificial Intelligence (BAAI), 2021.
Shixiang Gu. "Mutual Information in Deep RL: Toward a Single Tractable Reward Function for General Intelligence". Computational Sensorimotor Learning (CSL) Seminar at MIT, 2021.
Shixiang Gu. "The Future of Robotics". TKS (non-technical talk for 2000+ high school students), 2021. [Video, Website]
Shixiang Gu. "Empowerment as an Intelligence Measure". Re:Work Deep Learning Summit 2020. [Video]
Shixiang Gu. "Intelligence, Robotics, and Predictability Maximization". University of Tokyo, Japan, 2019. [Slides]
Shixiang Gu. "Model-based RL with Predictability Maximization". RIKEN AIP, Japan, 2019; University of Kyoto, Japan, 2019. [Slides]
Shixiang Gu. "Rewards, Resets, Exploration: Bottlenecks in Scaling Deep RL for Robotics". Re:Work Deep RL Summit, San Francisco, USA, 2019. [Slides] "30 Influential AI Presentations from 2019" by Re:Work
Shixiang Gu. “Sample-Efficient Deep RL for Robotics”. Vector Institute, Canada, 2017; CUHK, Hong Kong, China, 2017 (hosted by Xiaoou Tang and Xiaogang Wang); University of Tokyo, Japan, 2016 (hosted by Masashi Sugiyama and Yutaka Matsuo).
Timothy Lillicrap, Shixiang Gu. “Deep RL methods in Robotics”. Reinforcement Learning Forum. Google, USA, 2016.
Shixiang Gu. “Generalized Backprop, Neural Particle Filter, and Guided Q-Learning”. Hosted by Pieter Abbeel. UC Berkeley, USA, 2015.
Shixiang Gu. “Algorithms for Training Deep Stochastic Neural Networks”. Hosted by Noah Goodman. Stanford University, USA, 2015.
Shixiang Gu, Andrey Malinin. “Long Short-Term Memory Networks”. Machine Learning RCC. Cambridge University, UK, 2015.
Academic Activities
Action Editor for Transactions on Machine Learning Research (TMLR).
Area Chair for ICML 2023.
Area Chair for NeurIPS 2021, 2022.
Co-organizer for Ecological Theory of RL Workshop at NeurIPS 2021.
Co-organizer for Exploration in RL Workshop at ICML 2018 and 2019.
Program Committee for ICLR A Roadmap To Never-ending RL workshop 2021.
Senior Program Committee for IJCAI 2021.
Expert Reviewer (ER) for International Conference on Machine Learning (ICML) 2021.
Reviewer for JMLR, 2022.
Program Committee for Learning for Dynamics and Control (L4DC), 2020, 2021.
Program Committee for NeurIPS Deep Reinforcement Learning Symposium, 2017, 2018, 2019.
Reviewer for Nature, 2020.
Reviewer for International Conference on Robotics and Automation (ICRA), 2018, 2022.
Reviewer for Neural Computation (NECO), 2018.
Reviewer for Conference on Robot Learning (CoRL), 2017, 2018, 2019.
Reviewer for AISTATS 2018.
Reviewer for Neural Information Processing Systems (NeurIPS), 2017, 2018, 2019, 2020.
Reviewer for International Conference on Machine Learning (ICML), 2017, 2018, 2019, 2020.
Reviewer for International Conference on Learning Representations (ICLR), 2016, 2017, 2018, 2019.
Reviewer for IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.
Reviewer for IEEE Conference on Decision and Control (CDC), 2017.
Contact
Email: <my-english-first-name-><my-last-name> at <the-company-i-work-at> dot com