Deep Reinforcement Learning Symposium, NIPS 2017
Organizers
Contact: deeprl.symposium.nips2017@gmail.com
Important Dates
Friday November 3, 2017 : Paper submission deadline
Monday November 20, 2017: Paper acceptance notification
Thursday December 7, 2017: Symposium
Call for Papers
We invite submissions of papers that combine neural networks with reinforcement learning; accepted papers will be presented as contributed talks or posters. The deadline is November 3rd (midnight PST), and decisions will be sent out on November 20th. Please submit papers through the submission site. Submissions should use the NIPS 2017 format, with a maximum of eight pages not including references, plus an appendix of any length (all in a single PDF); reading the appendix is at the reviewers' discretion. The review process is double-blind.
Abstract
Although the theory of reinforcement learning addresses an extremely general class of learning problems with a common mathematical formulation, its power has been limited by the need to develop task-specific feature representations. A paradigm shift is occurring as researchers figure out how to use deep neural networks as function approximators in reinforcement learning algorithms; this line of work has yielded remarkable empirical results in recent years. This workshop will bring together researchers working at the intersection of deep learning and reinforcement learning, and it will help researchers with expertise in one of these fields to learn about the other.
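To make the idea concrete, here is a minimal sketch of the combination the abstract describes: a small neural network standing in for the tabular Q-function in one-step Q-learning. This is an illustrative example only, not taken from any symposium paper; the toy ChainEnv environment and every name in it are invented for this sketch.

```python
# Minimal sketch (illustrative only): a neural network as the Q-function
# approximator in one-step Q-learning, on a toy chain MDP.
import random
import torch
import torch.nn as nn

class ChainEnv:
    """Toy 5-state chain MDP: actions move left/right; reward 1 at the last state."""
    def __init__(self, n=5):
        self.n, self.s = n, 0
    def reset(self):
        self.s = 0
        return self.s
    def step(self, a):  # a: 0 = left, 1 = right
        self.s = max(0, min(self.n - 1, self.s + (1 if a == 1 else -1)))
        done = self.s == self.n - 1
        return self.s, (1.0 if done else 0.0), done

def one_hot(s, n):
    x = torch.zeros(n)
    x[s] = 1.0
    return x

env = ChainEnv()
# The network replaces the Q-table: state in, one Q-value per action out.
q_net = nn.Sequential(nn.Linear(env.n, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(q_net.parameters(), lr=1e-2)
gamma, eps = 0.9, 0.1  # discount factor and exploration rate

for episode in range(200):
    s, done = env.reset(), False
    for t in range(100):  # cap episode length
        # epsilon-greedy action from the current Q estimates
        if random.random() < eps:
            a = random.randrange(2)
        else:
            a = q_net(one_hot(s, env.n)).argmax().item()
        s2, r, done = env.step(a)
        # one-step TD target: r + gamma * max_a' Q(s', a'), with no bootstrap at terminal states
        with torch.no_grad():
            target = r if done else r + gamma * q_net(one_hot(s2, env.n)).max().item()
        pred = q_net(one_hot(s, env.n))[a]
        loss = (pred - target) ** 2  # squared TD error
        opt.zero_grad()
        loss.backward()
        opt.step()
        s = s2
        if done:
            break
```

The same pattern, extended with experience replay and a separate target network, underlies the DQN-style agents discussed in several of the talks below.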
Schedule
Tentative symposium schedule:
2:00 - 2:40 Invited Talk: David Silver (Google DeepMind), Mastering Games with Deep Reinforcement Learning
2:40 - 3:10 Contributed talks (3 x 10 minutes)
- Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine
- Active Neural Localization Devendra Singh Chaplot, Emilio Parisotto, Ruslan Salakhutdinov
- Natural Language Policy Search Jacob Andreas, Dan Klein, Sergey Levine
3:10 - 3:50 Invited Talk: Joelle Pineau (McGill), Reproducibility in Deep Reinforcement Learning and Beyond
3:50 - 4:10 Break
4:10 - 4:40 Contributed talks (3 x 10 minutes)
- Backpropagation through the Void: Optimizing control variates for black-box gradient estimation Will Grathwohl, Dami Choi, Yuhuai Wu, Geoff Roeder, David Duvenaud
- Parameter Space Noise for Exploration Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, Marcin Andrychowicz
- Time-Contrastive Networks: Self-Supervised Learning from Pixels Pierre Sermanet, Corey Lynch, Yevgen Chebotar, Jasmine Hsu, Eric Jang, Stefan Schaal, Sergey Levine
4:40 - 5:20 Invited Talk: Ruslan Salakhutdinov (CMU), Neural Map: Structured Memory for Deep RL
5:20 - 5:50 Contributed talks (3 x 10 minutes)
- Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines Cathy Wu*, Aravind Rajeswaran*, Yan Duan, Vikash Kumar, Alexandre M Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel
- Sample-efficient Policy Optimization with Stein Control Variate Hao Liu, Yihao Feng, Yi Mao, Dengyong Zhou, Jian Peng, Qiang Liu
- One-Shot Visual Imitation Learning via Meta-Learning Chelsea Finn*, Tianhe Yu*, Tianhao Zhang, Pieter Abbeel, Sergey Levine
5:50 - 7:10 Poster session / snacks
7:10 - 7:50 Invited Talk: Ben Van Roy (Stanford), Deep Exploration Via Randomized Value Functions
7:50 - 8:20 Contributed talks (3 x 10 minutes)
- StarCraft II: A New Challenge for Reinforcement Learning Oriol Vinyals, Timo Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha Vezhnevets, Michelle Yeo, Alireza Makhzani, Heinrich Kuttler, John Agapiou, Julian Schrittwieser, John Quan, Stephen Gaffney, Stig Petersen, Karen Simonyan, Tom Schaul, Hado van Hasselt, David Silver, Timothy Lillicrap, Kevin Calderone, Paul Keet, Anthony Brunasso, David Lawrence, Anders Ekermo, Jacob Repp, Rodney Tsing
- Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning Anusha Nagabandi, Gregory Kahn, Ronald Fearing, Sergey Levine
- Overcoming Exploration in Reinforcement Learning with Demonstrations Ashvin Nair, Bob McGrew, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel
8:20 - 9:00 Invited Talk: Michael Bowling (Alberta), Artificial Intelligence Goes All-In
Contributed Papers
Accepted papers will be presented either as contributed talks or in the poster session.
- Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine
- Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning Anusha Nagabandi, Gregory Kahn, Ronald Fearing, Sergey Levine
- Natural Language Policy Search Jacob Andreas, Dan Klein, Sergey Levine
- Sample-efficient Policy Optimization with Stein Control Variate Hao Liu, Yihao Feng, Yi Mao, Dengyong Zhou, Jian Peng, Qiang Liu
- Active Neural Localization Devendra Singh Chaplot, Emilio Parisotto, Ruslan Salakhutdinov
- Overcoming Exploration in Reinforcement Learning with Demonstrations Ashvin Nair, Bob McGrew, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel
- Backpropagation through the Void: Optimizing control variates for black-box gradient estimation Will Grathwohl, Dami Choi, Yuhuai Wu, Geoff Roeder, David Duvenaud
- Parameter Space Noise for Exploration Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, Marcin Andrychowicz
- StarCraft II: A New Challenge for Reinforcement Learning Oriol Vinyals, Timo Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha Vezhnevets, Michelle Yeo, Alireza Makhzani, Heinrich Kuttler, John Agapiou, Julian Schrittwieser, John Quan, Stephen Gaffney, Stig Petersen, Karen Simonyan, Tom Schaul, Hado van Hasselt, David Silver, Timothy Lillicrap, Kevin Calderone, Paul Keet, Anthony Brunasso, David Lawrence, Anders Ekermo, Jacob Repp, Rodney Tsing
- Time-Contrastive Networks: Self-Supervised Learning from Pixels Pierre Sermanet, Corey Lynch, Yevgen Chebotar, Jasmine Hsu, Eric Jang, Stefan Schaal, Sergey Levine
- Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines Cathy Wu*, Aravind Rajeswaran*, Yan Duan, Vikash Kumar, Alexandre M Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel
- One-Shot Visual Imitation Learning via Meta-Learning Chelsea Finn*, Tianhe Yu*, Tianhao Zhang, Pieter Abbeel, Sergey Levine
- Learning Robust Rewards with Adversarial Inverse Reinforcement Learning Justin Fu, Katie Luo, Sergey Levine
- Do deep reinforcement learning algorithms really learn to navigate? Shurjo Banerjee, Vikas Dhiman, Brent Griffin, Jason J. Corso
- Regret Minimization for Partially Observable Deep Reinforcement Learning Peter H. Jin, Sergey Levine, Kurt Keutzer
- Planning and Learning with Stochastic Action Sets Craig Boutilier, Alon Cohen, Amit Daniely, Avinatan Hassidim, Yishay Mansour, Ofer Meshi, Dale Schuurmans
- Simple Nearest Neighbor Policy Method for Continuous Control Tasks Elman Mansimov, Kyunghyun Cho
- Temporal Difference Models: Model-Free Deep RL for Model-Based Control Vitchyr Pong*, Shixiang Gu*, Murtaza Dalal, Sergey Levine
- Teacher-Student Curriculum Learning Tambet Matiisen, Avital Oliver, Taco Cohen, John Schulman
- Learning Skill Embeddings for Transferable Robot Skills Karol Hausman, Jost Tobias Springenberg, Ziyu Wang, Nicolas Heess, Martin Riedmiller
- Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations Aravind Rajeswaran, Vikash Kumar, Abhishek Gupta, John Schulman, Emanuel Todorov, Sergey Levine
- TD Learning with Constrained Gradients Ishan Durugkar, Peter Stone
- Model-Ensemble Trust-Region Policy Optimization Thanard Kurutach, Ignasi Clavera, Yan Duan, Aviv Tamar, Pieter Abbeel
- UCB Exploration via Q-Ensembles Richard Y. Chen, Szymon Sidor, Pieter Abbeel, John Schulman
- Efficient Exploration through Bayesian Deep Q-Networks Kamyar Azizzadenesheli, Emma Brunskill, Animashree Anandkumar
- What game are we playing? Differentiably learning games from incomplete observations Chun Kai Ling, J. Zico Kolter, Fei Fang
- Curiosity Driven Exploration by Self-Supervised Prediction Deepak Pathak, Pulkit Agrawal, Alexei A. Efros, Trevor Darrell
- Off-Policy Option Extraction from Demonstrations Karan Goel, Emma Brunskill
- Ray RLlib: A Composable and Scalable Reinforcement Learning Library Eric Liang*, Richard Liaw*, Robert Nishihara, Philipp Moritz, Roy Fox, Joseph Gonzalez, Ken Goldberg, Ion Stoica
- Zero-Shot Visual Imitation Deepak Pathak*, Michael Luo*, Parsa M.*, Dian Chen, Fred Shentu, Evan Shelhamer, Alexei Efros, Trevor Darrell, Pulkit Agrawal
- SILC: Smoother Imitation with Lipschitz Costs Akshat Dave, Sapana Chaudhary, Balaraman Ravindran
- Automatic Goal Generation for Reinforcement Learning Agents David Held, Xinyang Geng, Carlos Florensa, Pieter Abbeel
- Divide and Conquer Reinforcement Learning Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, Larry Yang, Sergey Levine
- TRL: Discriminative Hints for Scalable Reverse Curriculum Learning Chen Wang, Xiangyu Chen, Zelin Ye, Jialu Wang, Ziruo Cai, Shixiang Gu, Cewu Lu
- Neural Map: Structured Memory for Deep Reinforcement Learning Emilio Parisotto, Ruslan Salakhutdinov
- Learning To Route with Deep RL Asaf Valadarsky, Michael Schapira, Dafna Shahaf, Aviv Tamar
- Leave No Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine
- Self-Supervised Visual Planning with Temporal Skip Connections Frederik Ebert, Chelsea Finn, Alex X. Lee, Sergey Levine
- Recurrent Temporal Difference Pierre Thodoroff, Joelle Pineau, Doina Precup
- Learning Deep Neural Network Control Policies for Agile Off-Road Autonomous Driving Yunpeng Pan, Ching-An Cheng, Kamil Saigol, Keuntaek Lee, Xinyan Yan, Evangelos Theodorou, Byron Boots
- Convergence of Value Aggregation for Imitation Learning Ching-An Cheng, Byron Boots
- Investigating Human Priors for Playing Video Games Rachit Dubey, Pulkit Agrawal, Deepak Pathak, Alyosha Efros, Thomas L. Griffiths
- Building Generalizable Agents with a Realistic and Rich 3D Environment Yi Wu, Yuxin Wu, Georgia Gkioxari, Yuandong Tian
- Investigating Deep Reinforcement Learning for Grasping Objects with an Anthropomorphic Hand Mayur Mudigonda, Pulkit Agrawal, Michael Deweese, Jitendra Malik
- Hierarchical Actor-Critic Andrew Levy, Robert Platt, Kate Saenko
- Accelerated Methods in On-Policy Deep Reinforcement Learning Adam Stooke, Pieter Abbeel
- RAIL: Risk-Averse Imitation Learning Anirban Santara, Abhishek Naik, Balaraman Ravindran, Dipankar Das, Dheevatsa Mudigere, Sasikanth Avancha, Bharat Kaul
- Harnessing Adversarial Attacks to Improve Robustness of Deep Reinforcement Learning Anay Pattanaik, Shuijing Liu, Zhenyi Tang, Girish Chowdhary
- DeepLoco: Dynamic Locomotion Skills Using Hierarchical Deep Reinforcement Learning Xue Bin Peng, Glen Berseth, Cheng Xie, KangKang Yin, Michiel van de Panne
- Genetic Policy Optimization Tanmay Gangwani, Jian Peng
- Worm-level Control through Search-based Reinforcement Learning Mathias Lechner, Radu Grosu, Ramin M. Hasani
- Reinforcement Learning from Imperfect Demonstrations Huazhe (Harry) Xu, Yang Gao, Ji Lin, Fisher Yu, Sergey Levine, Trevor Darrell
- A Unified View of Entropy-Regularized Markov Decision Processes Gergely Neu, Vicenç Gómez, Anders Jonsson
- Efficient exploration with Double Uncertain Value Networks Thomas M. Moerland, Joost Broekens, Catholijn M. Jonker
- A Deep Q-Network for the Beer Game, an Approach to Solve Inventory Optimization Problems Afshin Oroojlooyjadid, MohammadReza Nazari, Lawrence Snyder, Martin Takac
- Predictive-State Decoders: Augmenting RNNs for Better Filtering, Imitation, and Reinforcement Learning Arun Venkatraman, Nicholas Rhinehart, Wen Sun, Martial Hebert, Byron Boots, Kris M. Kitani, J. Andrew Bagnell
- Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control Glen Berseth, Cheng Xie, Paul Cernek, Michiel Van de Panne
- Transferring Agent Behaviors from Videos via Motion GANs Ashley D. Edwards, Charles L. Isbell, Jr.
- Action Branching Architectures for Deep Reinforcement Learning Arash Tavakoli, Fabio Pardo, Petar Kormushev
- One-Shot Reinforcement Learning for Robot Navigation with Interactive Replay Jake Bruce, Niko Suenderhauf, Piotr Mirowski, Raia Hadsell, Michael Milford
- Improved Learning in Evolution Strategies via Sparser Inter-Agent Network Topologies Dhaval Adjodah, Dan Calacci, Yan Leng, Peter Krafft, Esteban Moro, Alex Pentland
- End-to-End Model Predictive Control Brandon Amos, Shixiang Gu, J. Zico Kolter
- Bayesian Deep Q-Learning via Continuous-Time Flows Ruiyi Zhang, Changyou Chen
- Smoothed Dual Embedding Control Bo Dai, Albert Shaw, Lihong Li, Lin Xiao, Niao He, Jianshu Chen, Le Song
- Variational Autoencoding Learning of Options by Reinforcement Joshua Achiam, Harrison Edwards, Dario Amodei, Pieter Abbeel
- Importance Sampled Option-Critic for More Sample Efficient Reinforcement Learning Karan Goel, Emma Brunskill
- A Modular Differentiable Rigid Body Physics Engine Filipe de Avila Belbute-Peres, J Zico Kolter
- Model-Free Shared Autonomy through Deep Human-in-the-Loop Reinforcement Learning Siddharth Reddy, Anca Dragan, Sergey Levine
- Comparing Deep Reinforcement Learning and Evolutionary Methods in Continuous Control Shangtong Zhang, Osmar R. Zaiane
- Deterministic Policy Optimization by Combining Pathwise and Score Function Estimators for Discrete Action Spaces Daniel Levy, Stefano Ermon
- Determinantal SARSA: Toward deep reinforcement learning for collaborative agents Takayuki Osogami, Rudy Raymond
- Prediction Error-based Transfer in Q-Ensembles Rakesh R Menon, Balaraman Ravindran
- Online Reinforcement Learning with Applications in Customer Journey Optimization Mohammadreza Nazari, Afshin Oroojlooy, Mustafa Onur Kabul
- Learning to Compose Skills Himanshu Sahni, Saurabh Kumar, Farhan Tejani, Charles Isbell
- Information Maximizing Exploration with a Latent Dynamics Model Trevor Barron, Heni Ben Amor, Oliver Obst
- Time Limits in Reinforcement Learning Fabio Pardo, Arash Tavakoli, Vitaly Levdik, Petar Kormushev
- A Deeper Look at Experience Replay Shangtong Zhang, Richard S. Sutton
- Structured Exploration via Deep Hierarchical Coordination Stephan Zheng, Yisong Yue
- A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management Inigo Casanueva, Pawel Budzianowski, Pei-Hao Su, Nikola Mrksic, Tsung-Hsien Wen, Stefan Ultes, Lina Rojas-Barahona, Steve Young, Milica Gasic
- Latent forward model for Real-time Strategy game planning with incomplete information Yuandong Tian, Qucheng Gong
- Crossmodal Attentive Skill Learner Shayegan Omidshafiei, Dong-Ki Kim, Jason Pazis, Jonathan P. How
- Learning to Multi-Task by Active Sampling Sahil Sharma*, Ashutosh Kumar Jha*, Parikshit S Hegde, Balaraman Ravindran
- A linear time algorithm to compute the natural policy gradient Aravind Rajeswaran, Rahul Kidambi, Sham Kakade
Program Committee
- Rocky Duan
- Rein Houthooft
- Junhyuk Oh
- Marcin Andrychowicz
- Richard Y. Chen
- Thomas Degris
- Prafulla Dhariwal
- Chelsea Finn
- Jakob Foerster
- Arthur Guez
- Shixiang Gu
- Xiaoxiao Guo
- Jean Harb
- Matt Hausknecht
- Max Jaderberg
- Alex Lee
- Ryan Lowe
- Vlad Mnih
- Emilio Parisotto
- Janarthanan Rajendran
- Aravind Rajeswaran
- Tim Salimans
- Sainbayar Sukhbaatar
- Arthur Szlam
- Aviv Tamar
- Haoran Tang
- Yuandong Tian
- Alex Vezhnevets
- Markus Wulfmeier