Deep Learning for Action and Interaction, NIPS 2016, Area 3

Images (left to right): Pinto & Gupta ICRA '16, Blundell et al. '16, Chen et al. ICCV '15, Levine et al. ISER '16, Wang et al. '16

In conjunction with NIPS 2016, Barcelona.

Organizers: Chelsea Finn, Raia Hadsell, Dave Held, Sergey Levine, Percy Liang

Videos of the workshop are now available here.

This workshop is located in Area 3 of the Centre de Convencions Internacional de Barcelona.

Abstract

Deep learning systems that act in and interact with an environment must reason about how actions will change the world around them. The natural regime for such real-world decision problems involves supervision that is weak, delayed, or entirely absent, and the outputs are typically in the context of sequential decision processes, where each decision affects the next input. This regime poses a challenge for deep learning algorithms, which typically excel with: (1) large amounts of strongly supervised data and (2) a stationary distribution of independently observed inputs. The algorithmic tools for tackling these challenges have traditionally come from reinforcement learning, optimal control, and planning, and indeed the intersection of reinforcement learning and deep learning is currently an exciting and active research area. At the same time, deep learning methods for interactive decision-making domains have also been proposed in computer vision, robotics, and natural language processing, often using different tools and algorithmic formalisms from classical reinforcement learning, such as direct supervised learning, imitation learning, and model-based control. The aim of this workshop will be to bring together researchers across these disparate fields. The workshop program will focus on both the algorithmic and theoretical foundations of decision making and interaction with deep learning, and the practical challenges associated with bringing to bear deep learning methods in interactive settings, such as robotics, autonomous vehicles, and interactive agents.

Schedule

Saturday December 10

Morning session 1:

9:00 - 9:15: Introductions

9:15 - 9:40: Joelle Pineau: Deep learning models for natural language interaction

9:40 - 10:05: Honglak Lee: Learning Disentangled Representations with Action-Conditional Future Prediction

10:05 - 10:30: Chris Summerfield: How artificial and biological agents ride the subway

10:30 - 11:00: morning coffee break

Morning session 2:

11:00 - 11:25: Jianxiong Xiao: Bridging the gap between vision and robotics: Where are my labels?

11:25 - 11:35: Spotlight: Fereshteh Sadeghi, Collision Avoidance via Deep RL: Real Vision-Based Flight without a Single Real Image

11:35 - 11:45: Spotlight: Piotr Mirowski, Learning to Navigate in Complex Environments

11:45 - 12:00: morning poster session

12:00 - 14:00: lunch break

Afternoon session 1:

14:00 - 14:25: Abhinav Gupta: Scaling Self-supervision: From one task, one robot to multiple tasks and robots

14:25 - 14:35: Spotlight: Sebastian Höfer, Unsupervised Learning of State Representations for Multiple Tasks

14:35 - 14:45: Spotlight: Jacob Andreas, Modular Multitask Reinforcement Learning with Policy Sketches

14:45 - 15:00: afternoon poster session

15:00 - 15:30: afternoon coffee break (continuation of poster session)

Afternoon session 2:

15:30 - 15:55: Tim Lillicrap: Data-efficient deep reinforcement learning for continuous control

15:55 - 16:20: Raquel Urtasun: The role of perception for action

16:20 - 16:45: Jason Weston: Learning through Dialogue Interactions

16:45 - 16:55: Spotlight: Pararth Shah, Interactive reinforcement learning for task-oriented dialogue management

16:55 - 17:30: contributor “pitch” session

17:30 - 18:15: panel and audience discussion

Invited Talks

Joelle Pineau: Deep learning models for natural language interaction

This talk will review recent contributions by my team towards the problem of building neural models for dialogue agents. I will focus on generative models of dialogue, based on recurrent neural network architectures, and will present results from user studies using open vocabulary task-independent conversations. I will also present a scoring model trained to automatically evaluate dialogue agents, which can alleviate the need for expensive user studies, and show that this trained model can outperform other standard evaluation metrics for dialogue scoring.

Honglak Lee: Learning Disentangled Representations with Action-Conditional Future Prediction

Chris Summerfield: How artificial and biological agents ride the subway

Recent work in artificial intelligence and machine learning has made great strides towards building agents that behave intelligently in complex environments. For example, the Differentiable Neural Computer (DNC, Graves et al., 2016) is a neural network with content-addressable external memory that can plan novel shortest-path trajectories on random graphs, such as the London Underground system. In my talk, I will discuss this work in the context of studies of planning in humans. I will show evidence that humans plan by searching through hierarchically nested representations of the environment, describing behaviour and brain activity recorded as humans navigated a virtual subway environment.

Jianxiong Xiao: Bridging the gap between vision and robotics: Where are my labels?

Abhinav Gupta: Scaling Self-supervision: From one task, one robot to multiple tasks and robots

Tim Lillicrap: Data-efficient deep reinforcement learning for continuous control

Deep neural networks have recently been combined with reinforcement learning to solve problems such as playing Atari video games from just the raw pixels and rewards. Can the same basic approaches be applied in the context of robotics? One difference between these cases is that Atari games have only a small finite set of possible actions (e.g. up, down, jump, shoot). In robotics, the action selected by an agent at any given moment can be any of an infinite set of commands to move the joints in a large continuous space. I will describe work with model-free, off-policy algorithms that adapt insights from the discrete case and show successful learning for a variety of reaching, manipulation, and locomotion tasks in simulation. Further, I will demonstrate that these off-policy algorithms are data-efficient enough that they can learn a simple manipulation task from scratch with a 7 degree-of-freedom robot in the real world.

Raquel Urtasun: The role of perception for action

Jason Weston: Learning through Dialogue Interactions

A good dialogue agent should have the ability to interact with users. In this work, we explore this direction by designing a simulator and a set of synthetic tasks in the movie domain that allow the learner to interact with a teacher by both asking and answering questions. We investigate how a learner can benefit from asking questions in both an offline and online reinforcement learning setting. We demonstrate that the learner improves when asking questions. Our work represents a first step in developing end-to-end learned interactive dialogue agents.

This is joint work with Jiwei Li, Alexander H. Miller, Sumit Chopra and Marc'Aurelio Ranzato.

Call for Papers

We invite the submission of extended abstracts related to machine learning methods for domains involving taking actions and interacting with other agents, including, but not limited to, the following application areas:

robotics
autonomous driving
interactive language and dialog systems
active perception
navigation
game playing

Most accepted papers will be presented as posters, but a few selected contributions will be given oral presentations. Accepted papers will be posted in a non-archival format on the workshop website.

Abstracts should be 4 pages long (not including references) in NIPS format. Submissions may include a supplement, but reviewers are not required to read any supplementary material. Abstracts should be submitted by November 8th, 2016 by sending an email to nips2016interaction@gmail.com. Submissions may be anonymized or not, at the authors' discretion. Work that has already appeared in a journal, workshop, or conference (including NIPS 2016) must be significantly extended to be eligible for workshop submission. Work that is currently under review at another venue or has not yet been published in an archival format as of the date of the deadline (Nov 8th) may be submitted. This includes submissions to ICLR, which are welcome.

Important Dates

Submission Deadline: ~~Tuesday, November 8, 2016, any timezone~~

Acceptance Notification: ~~Tuesday, November 22, 2016~~

Workshop: Saturday, December 10th, 2016

Registration

Please refer to the NIPS 2016 website for registration details.

Accepted papers

Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning

Abhishek Gupta, Coline Devin, YuXuan Liu, Pieter Abbeel, Sergey Levine

Modular Multitask Reinforcement Learning with Policy Sketches

Jacob Andreas, Dan Klein, Sergey Levine

Collision Avoidance via Deep RL: Real Vision-Based Flight without a Single Real Image

Fereshteh Sadeghi, Sergey Levine

Reinforcement Learning with Few Expert Demonstrations

Aravind S. Lakshminarayanan, Sherjil Ozair, Yoshua Bengio

Interactive reinforcement learning for task-oriented dialogue management

Pararth Shah, Dilek Hakkani-Tür, Larry Heck

Learning Visual Servoing with Deep Features and Trust Region Fitted Q-Iteration

Alex X. Lee, Sergey Levine, Pieter Abbeel

Unsupervised Perceptual Rewards for Imitation Learning

Pierre Sermanet, Kelvin Xu, Sergey Levine

Decayed Markov Chain Monte Carlo for Interactive POMDP

Yanlin Han, Piotr Gmytrasiewicz

Learning to Query, Reason, and Answer Questions on Ambiguous Texts

Xiaoxiao Guo, Tim Klinger, Clemens Rosenbaum, Joseph P. Bigus, Murray Campbell, Ban Kawas, Kartik Talamadupula, Gerald Tesauro, Satinder Singh

Towards an end to end Dynamic Dialogue System

Vishal Bhalla

3D Simulation for Robot Arm Control with Deep Q-Learning

Stephen James, Edward Johns

In-Hand Robotic Manipulation via Deep Reinforcement Learning

Kapil D. Katyal, Edward W. Staley, Matthew S. Johannes, I-Jeng Wang, Austin Reiter, Philippe Burlina

Learning to Drive using Inverse Reinforcement Learning and Deep Q-Networks

Sahand Sharifzadeh, Ioannis Chiotellis, Rudolph Triebel, Daniel Cremers

SAD-GAN: Synthetic Autonomous Driving using Generative Adversarial Networks

Arna Ghosh, Biswarup Bhattacharya, Somnath Basu Roy Chowdhury

Uncertainty-Aware Reinforcement Learning for Collision Avoidance

Gregory Kahn, Vitchyr Pong, Pieter Abbeel, Sergey Levine

Real-time Model-based Reinforcement Learning with Deep Function Approximation

Hussain Kazmi, Johan Driesen

Path Integral Guided Policy Search

Yevgen Chebotar, Mrinal Kalakrishnan, Ali Yahya, Adrian Li, Stefan Schaal, Sergey Levine

Shape-independent Hardness Estimation Using a GelSight Tactile Sensor

Wenzhen Yuan, Chenzhuo Zhu, Andrew Owens, Mandayam Srinivasan, Edward Adelson

Learning to Navigate in Complex Environments

Piotr Mirowski, Razvan Pascanu, Fabio Viola, Hubert Soyer, Andy Ballard, Andrea Banino, Misha Denil, Ross Goroshin, Laurent Sifre, Koray Kavukcuoglu, Dharshan Kumaran, Raia Hadsell

Unsupervised Learning of State Representations for Multiple Tasks

Antonin Raffin, Sebastian Höfer, Rico Jonschkowski, Oliver Brock, Freek Stulp

End-to-End Learnable Histogram Filters

Rico Jonschkowski, Oliver Brock

End to end active perception

Ilya Kostrikov, Dumitru Erhan, Sergey Levine

Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation

Ashvin Nair, Pulkit Agrawal, Dian Chen, Phillip Isola, Pieter Abbeel, Jitendra Malik, Sergey Levine

GuessWhat?! Visual Object Discovery Through Multi-Modal Dialogue

Harm de Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, Aaron Courville

Contributor Pitch Session

During our workshop, attendees were invited to sign up for a 3-minute slot in the "pitch session," during which time they could present an interesting idea, a discussion point, late-breaking work, or some other point they wished to share with the group. The pitches that were presented are listed below:

Rico Jonschkowski - Combining Algorithms and Deep Learning

Denis Steckelmacher - Hierarchical RL in POMDPs with Options

Eric Danziger - Conditioning policies on tasks

Grady Williams (gradyrw@gmail.com) - Benchmarking Deep Control and Perception Algorithms with Aggressive Driving

Jay McClelland - [No title]