Human in the Loop Dialogue Systems

Note: since this year's NeurIPS workshops are virtual, we will host only live Q&A sessions (and a panel session) on the workshop day. To get the most out of the workshop, we recommend that you watch all the prerecorded talks before the workshop day. We have reserved blocks of time as an opportunity to watch the prerecorded talks before each Q&A. All times below are PST.

If you have a ticket to NeurIPS, you can access all the talks here.


Conversational interaction systems such as Amazon Alexa, Google Assistant, Apple Siri, and Microsoft Cortana have become very popular in recent years. Such systems allow users to interact with a wide variety of content on the web through a conversational interface. Research challenges such as the Dialogue System Technology Challenges, Dialogue Dodecathlon, Amazon Alexa Prize, and the Vision and Language Navigation task have continued to inspire research in conversational AI. These challenges have brought together researchers from different communities such as speech recognition, spoken language understanding, reinforcement learning, language generation, and multi-modal question answering.

Unlike other popular NLP tasks, dialogue frequently has humans in the loop, whether for evaluation, active learning, or online reward estimation. Through this workshop we aim to bring together researchers from academia and industry to discuss the challenges and opportunities in such human in the loop setups. We hope that this sparks interesting discussions about conversational agents, interactive systems, and how we can use humans most effectively when building such setups. We will highlight areas such as human evaluation setups, reliability in human evaluation, human in the loop training, interactive learning, and user modeling. We also strongly encourage work on non-English dialogue systems in these areas.

The one-day workshop will include talks from senior technical leaders and researchers, who will share insights on evaluating dialogue systems. We also plan to have oral presentations and poster sessions on work related to the topic of the workshop. Finally, we will end the workshop with an interactive panel of speakers. As an outcome, we expect participants from the NeurIPS community to walk away with a better understanding of human in the loop dialogue modeling as well as key areas of research in this field. Additionally, we would like to see discussions on how human evaluation setups could be unified.

Invited Speakers

Milica Gašić

Heinrich Heine University Düsseldorf

Larry Heck

Viv Labs

Zhou Yu

Columbia University

Important Dates

October 14, 2020 Submissions due (Anywhere on Earth) - Extended Deadline

October 30, 2020 Paper notification

November 15, 2020 Camera-ready papers due

December 11, 2020 Workshop Day at NeurIPS 2020

Accepted Papers

Is the User Enjoying the Conversation? A Case Study on the Impact on the Reward Function. [PDF]

Lina M Rojas

Improving Dialogue Breakdown Detection with Semi-Supervised Learning [PDF]

Nathan H Ng, Marzyeh Ghassemi, Narendran Thangarajan, Jason Pan, Qi Guo

Dialog Simulation with Realistic Variations for Training Goal-Oriented Conversational Systems [PDF]

Chien-Wei Lin, Vincent Auvray, Daniel Elkind, Arijit Biswas, Maryam Fazel-Zarandi, Nehal Belgamwar, Shubhra Chandra, Matt Zhao, Angeliki Metallinou, Tagyoung Chung, Charlie Shucheng Zhu, Suranjit Adhikari, Dilek Hakkani-tur

An Application-Independent Approach to Building Task-Oriented Chatbots with Interactive Continual Learning [PDF] [Supplementary]

Sahisnu Mazumder, Bing Liu, Shuai Wang, Sepideh Esmaeilpour

CheerBots: Chatbots toward Empathy and Emotion using Reinforcement Learning [PDF] [Supplementary]

Jiun-Hao Jhan, Chao-Peng Liu, Shyh-Kang Jeng, Hung-yi Lee

Active Hybrid Classification [PDF]

Evgeny Krivosheev, Fabio Casati, Alessandro Bozzon

Diluted Near-Optimal Expert Demonstrations for Guiding Dialogue Stochastic Policy Optimisation [PDF]

Thibault Cordier, Tanguy Urvoy, Lina M Rojas, Fabrice Lefevre

The Lab vs The Crowd: An Investigation into Data Quality for Neural Dialogue Models [PDF]

José David Lopes, Francisco Javier Chiyah Garcia, Helen Hastie

Efficient Evaluation of Task Oriented Dialogue Systems [PDF]

Weiyi Lu, Yi Xu, Erran Li

Automatic Feedback Generation for Dialog-Based Language Tutors Using Transformer Models and Active Learning [PDF] [Supplementary]

Katherine Stasaski, Vikram Ramanarayanan

Large-scale Hybrid Approach for Predicting User Satisfaction with Conversational Agents [PDF]

Dookun Park, Hao Yuan, Dongmin Kim, Yinglei Zhang, Spyros Matsoukas, Young-Bum Kim, Ruhi Sarikaya, Chenlei Guo, Yuan Ling, Kevin Quinn, Pham Hung, Benjamin Yao, Sungjin Lee

Evaluate on-the-job learning dialogue systems and a case study for natural language understanding [PDF]

Mathilde Veron, Sophie Rosset, Olivier Galibert, Guillaume Bernard

Open-domain Topic Identification of Out-of-domain Utterances using Wikipedia [PDF]

Alexandry Augustin, Alexandros Papangelis, Margarita Kotti, Pavlos Vougiouklis, Jonathon Hare, Norbert Braunschweiler

Towards Teachable Conversational Agents [PDF]

Nalin Chhibber, Edith Law

NICE: Neural Image Commenting Evaluation with an Emphasis on Emotion and Empathy [PDF] [Supplementary]

Kezhen Chen, Qiuyuan Huang, Daniel McDuff, Jianfeng Wang, Hamid Palangi, Xiang Gao, Kevin Li, Ken Forbus, Jianfeng Gao

Interactive Teaching for Conversational AI [PDF]

Qing Ping, Feiyang Niu, Joel Chengottusseriyil, Aishwarya Reganti, Qiaozi Gao, Prashanth Rajagopal, Govindarajan S Thattai, Gokhan Tur, Dilek Hakkani-tur, Prem Natarajan