Deep Learning for Dialogue Systems
Goal-oriented spoken dialogue systems (SDS) have become the most prominent component in today's virtual personal assistants (VPAs). VPAs such as Microsoft's Cortana, Apple's Siri, Amazon Alexa, Google Assistant, and Facebook's M incorporate SDS modules in various devices, allowing users to speak naturally in order to finish tasks more efficiently. Traditional conversational systems have rather complex and/or modular pipelines. The advance of deep learning technologies has recently given rise to applications of neural models to dialogue modeling. Nevertheless, applying deep learning technologies to building robust and scalable dialogue systems remains a challenging task and an open research area, as it requires a deep understanding of the classic pipelines as well as detailed knowledge of the benchmarks for models from prior work and the recent state of the art. This tutorial is therefore designed to provide an overview of dialogue system development, describe the most recent research on building dialogue systems, and summarize the remaining challenges. We target an audience of students and practitioners who have some deep learning background and want to become more familiar with conversational dialogue systems. The goal of this tutorial is to give the audience the developing trends of dialogue systems and a roadmap to get started with the related work.
Yun-Nung (Vivian) Chen is currently an assistant professor in the Department of Computer Science & Information Engineering at National Taiwan University. She earned her Ph.D. from Carnegie Mellon University; her research interests focus on spoken dialogue systems, language understanding, natural language processing, and multi-modal speech applications. She received a Google Faculty Award in 2016, two Student Best Paper Awards from IEEE SLT 2010 and IEEE ASRU 2013, a Student Best Paper Nomination from Interspeech 2012, and the Distinguished Master Thesis Award from ACLCLP. Prior to joining National Taiwan University, she worked in the Deep Learning Technology Center at Microsoft Research, Redmond.
Asli Celikyilmaz is currently a Senior Researcher on the Deep Learning Research team at Microsoft Research. She received her Ph.D. from the University of Toronto, Canada, on uncertainty modeling in evolutionary functions. She continued with postdoctoral research on natural language processing at UC Berkeley and received a best paper award at the Semantic Computing Conference in 2009. She later joined Microsoft Research and was among the first researchers to build the AI system behind Cortana's conversational understanding. Her current research focuses on understanding deep neural network models for language understanding, long-text generation, grounded summarization, and language understanding in conversational systems. She has served as an area chair and co-organizer for numerous NLP and ML conferences, including ACL, NAACL, TACL, AAAI, NIPS, ICML, Interspeech, and IEEE Spoken Language Technologies (SLT).
Dilek Hakkani-Tur is a research scientist at Amazon. Prior to joining Amazon, she was a researcher at Google Research (2016-2018), Microsoft Research (2010-2016), the International Computer Science Institute (ICSI, 2006-2010), and AT&T Labs-Research (2001-2005). She received her B.Sc. degree from Middle East Technical University in 1994, and her M.Sc. and Ph.D. degrees from the Department of Computer Engineering at Bilkent University in 1996 and 2000, respectively. Her research interests include natural language and speech processing, spoken dialogue systems, and machine learning for language processing. She holds over 50 granted patents and has co-authored more than 200 papers in natural language and speech processing. She is the recipient of three best paper awards for her work on active learning for dialogue systems, from the IEEE Signal Processing Society, ISCA, and EURASIP. She was an associate editor of IEEE Transactions on Audio, Speech and Language Processing (2005-2008), a member of the IEEE Speech and Language Technical Committee (2009-2014), and an area editor for speech and language processing for Elsevier's Digital Signal Processing Journal and IEEE Signal Processing Letters (2011-2013), and she currently serves on the ISCA Advisory Council (2015-2018). She is a fellow of IEEE and ISCA.