Deep Learning for Dialogue Systems

COLING 2018 Tutorial


Goal-oriented spoken dialogue systems (SDS) have been the most prominent component in today’s virtual personal assistants (VPAs). VPAs such as Microsoft’s Cortana, Apple’s Siri, Amazon Alexa, Google Assistant, and Facebook’s M have incorporated SDS modules in various devices, allowing users to speak naturally in order to finish tasks more efficiently. Traditional conversational systems have rather complex and/or modular pipelines. Recent advances in deep learning have led to a rise in applications of neural models to dialogue modeling. Nevertheless, applying deep learning to build robust and scalable dialogue systems is still a challenging task and an open research area, as it requires a deeper understanding of the classic pipelines as well as detailed knowledge of the benchmark performance of models from prior work and recent state-of-the-art work. This tutorial therefore gives an overview of dialogue system development, describes the most recent research on building dialogue systems, and summarizes the challenges. We target an audience of students and practitioners who have some deep learning background and want to become more familiar with conversational dialogue systems. The goal of this tutorial is to present the developing trends in dialogue systems and a roadmap for getting started with the related work.


Yun-Nung (Vivian) Chen is currently an assistant professor at the Department of Computer Science & Information Engineering, National Taiwan University. She earned her Ph.D. degree from Carnegie Mellon University. Her research interests focus on spoken dialogue systems, language understanding, natural language processing, and multi-modal speech applications. She received the Google Faculty Award 2016, two Student Best Paper Awards from IEEE SLT 2010 and IEEE ASRU 2013, a Student Best Paper nomination from Interspeech 2012, and the Distinguished Master Thesis Award from ACLCLP. Prior to joining National Taiwan University, she worked in the Deep Learning Technology Center at Microsoft Research Redmond.

Asli Celikyilmaz is currently a researcher at the Deep Learning Technology Center at Microsoft Research. Previously, she was a Research Scientist at Microsoft Bing from 2010 until 2016, focusing on deep learning models for scaling natural user interfaces to multiple domains. From 2008 until 2010 she was a postdoctoral researcher at the EECS Department of UC Berkeley, where she also worked with researchers at ICSI @ Berkeley. She earned her Ph.D. from the University of Toronto, Canada, in 2008. Asli’s research interests are mainly machine learning and its applications to conversational dialogue systems, in particular natural language understanding and dialogue modeling. In the past she has also worked on machine intelligence, semantic tagging of natural user utterances in human-to-machine conversations, text analysis, document summarization, question answering, and co-reference resolution, among other areas. Currently she is focusing on reasoning, attention, and memory networks, as well as multi-task and transfer learning for conversational dialogue systems. She has served as area chair and co-organizer of numerous NLP and speech conferences, such as ACL, NAACL, Interspeech, and IEEE Spoken Language Technologies (SLT). Last year, she co-organized a NIPS workshop on Machine Learning for Spoken Language Understanding and Interactions.

Dilek Hakkani-Tur is a research scientist at Google. Prior to joining Google, she was a researcher at Microsoft Research (2010-2016), the International Computer Science Institute (ICSI, 2006-2010), and AT&T Labs-Research (2001-2005). She received her BSc degree from Middle East Technical University in 1994, and her MSc and PhD degrees from Bilkent University, Department of Computer Engineering, in 1996 and 2000, respectively. Her research interests include natural language and speech processing, spoken dialogue systems, and machine learning for language processing. She holds over 50 granted patents and has co-authored more than 200 papers in natural language and speech processing. She is the recipient of three best paper awards for her work on active learning for dialogue systems, from the IEEE Signal Processing Society, ISCA, and EURASIP. She was an associate editor of IEEE Transactions on Audio, Speech and Language Processing (2005-2008), a member of the IEEE Speech and Language Technical Committee (2009-2014), and an area editor for speech and language processing for Elsevier’s Digital Signal Processing Journal and IEEE Signal Processing Letters (2011-2013), and she currently serves on the ISCA Advisory Council (2015-2018). She is a fellow of IEEE and ISCA.