Keynotes

Dr. Jason Weston, Research Scientist, Facebook Research, NY

Putting together the threads of conversational AI?

Abstract: Maybe we don't have enough threads yet to knit together the whole, but let's try anyway! We present our view of what is necessary for conversational AI, and the pieces we have worked on so far to get there. In particular: software (ParlAI, a unified platform for dialogue research); neural architectures for memory, reasoning, retrieval, generation, and interactive learning; tasks employing personality (PersonaChat), knowledge (Wizard of Wikipedia), and perception (Image-Chat); evaluation studies and techniques (Dialogue NLI); and a discussion of how far we still have to go.

Speaker bio: Jason Weston is a research scientist at Facebook, NY and a Visiting Research Professor at NYU. He earned his PhD in machine learning at Royal Holloway, University of London and at AT&T Research in Red Bank, NJ (advisors: Alex Gammerman, Volodya Vovk and Vladimir Vapnik) in 2000. From 2000 to 2001, he was a researcher at Biowulf Technologies. From 2002 to 2003 he was a research scientist at the Max Planck Institute for Biological Cybernetics, Tuebingen, Germany. From 2003 to 2009 he was a research staff member at NEC Labs America, Princeton. From 2009 to 2014 he was a research scientist at Google, NY. His interests lie in statistical machine learning, with a focus on reasoning, memory, perception, interaction and communication. Jason has published over 100 papers, including best paper awards at ICML and ECML, and a Test of Time Award for his work "A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning", ICML 2008 (with Ronan Collobert). He was part of the YouTube team that won a National Academy of Television Arts & Sciences Emmy Award for Technology and Engineering for Personalized Recommendation Engines for Video Discovery. He was listed as the 16th most influential machine learning scholar by AMiner and as one of the top 50 authors in Computer Science in Science.

Roberto Pieraccini, Director of Engineering, Google Assistant

Talking to Machines: the Era of the Virtual Assistant

Abstract:

"Ok Google, wake me up tomorrow morning at 7", "What will the weather be like?", "How high is the Eiffel tower?", "When was it built?", "Play Hello", “No, the one by Adele”, "Stop!"

During the past decade, and mostly thanks to the latest advances in machine learning, we have made great strides in machine conversational technology. Speech recognition, natural language understanding and generation, text-to-speech, and dialog management have reached levels of performance and sophistication that enable and justify what we call "personal virtual assistants," the Google Assistant being one example. Virtual assistants are not new, but today, for the first time, we have the potential to enhance and simplify the way we interact with the digital world and with our automated homes, and the way we communicate with the people in our family and work environments. If you have watched the 1987 vision clip "The Knowledge Navigator," you may ask whether we are there yet. But another question is whether that is the vision we are striving for. Is that the natural evolution of the current virtual assistants? What are the technological gaps, if any, that hold us back from that vision?

In this talk I will give a brief historical perspective on the evolution of virtual assistants, show live examples of the current capabilities of the Google Assistant, and introduce some of the open research questions.

Speaker bio: Roberto Pieraccini is a technology expert in the fields of automatic speech recognition, natural language understanding, and dialog. He graduated in Electrical Engineering from the University of Pisa, Italy, in 1980 with a dissertation in the field of digital data transmission. He then worked as a research scientist at CSELT (Torino, Italy), Bell Laboratories (Murray Hill, NJ), AT&T Laboratories (Florham Park, NJ), SpeechWorks International (now Nuance), and IBM T.J. Watson Research (Yorktown Heights, NY). In 2005 he became Chief Technology Officer of SpeechCycle, a company specializing in complex technical support systems based on voice recognition technology. In 2012-2013 he was the Director and CEO of the International Computer Science Institute (ICSI) in Berkeley, CA. In 2014 he joined Jibo, the pioneer of consumer social robots for the home, as its Director of Advanced Conversational Technology. In 2018 he joined Google in Zurich, where he is Director of Engineering for the Google Assistant. He is the author of "The Voice in the Machine: Building Computers that Understand Speech," published by MIT Press in 2012, and of about 150 publications and patents. He is a Fellow of the IEEE and of ISCA (International Speech Communication Association), and the recipient of several industry awards. In 2016 he received the Primi Dieci (First Ten) award from the Italian-American Chamber of Commerce, which honors the ten most prominent Italian-Americans in science, art, and technology. He is best known for his pioneering work on statistical natural language understanding and reinforcement learning for dialog systems.

Tin Kam Ho, Senior AI Scientist at IBM Watson

How can I help you? - Modeling intents in human-machine conversation

Abstract:

Chatbots are increasingly used to handle many common requests in business customer care, both to save human agent labor and to provide faster responses. To design such a chatbot, business clients need to identify some typical customer requests, describe each as an intent, and associate it with some representative utterances. These are used to train an intent classifier hosted by the conversation service. The bot is most useful if the intents encapsulate a major portion of the incoming requests, and if the examples cover the usual variability in how the intents are expressed. How a client should design these intents and create their training data is thus a major issue in building business chatbots.
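
To make this workflow concrete, here is a minimal sketch of training an intent classifier from client-supplied example utterances. The intent names, the example utterances, and the scikit-learn pipeline are illustrative assumptions for this sketch only, not the actual implementation behind Watson Assistant or any particular conversation service.

```python
# Minimal sketch: intent classification from a handful of example utterances.
# Intent names and examples are hypothetical; real services use richer models.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each intent is described by a few representative utterances supplied by the client.
training_examples = {
    "reset_password": [
        "I forgot my password",
        "how do I reset my password",
        "can't log into my account",
    ],
    "billing_question": [
        "why was I charged twice",
        "I have a question about my bill",
        "my invoice looks wrong",
    ],
    "cancel_service": [
        "I want to cancel my subscription",
        "please close my account",
        "stop my service",
    ],
}

texts = [u for examples in training_examples.values() for u in examples]
labels = [intent for intent, examples in training_examples.items() for _ in examples]

# Word and bigram TF-IDF features feeding a linear classifier.
classifier = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
classifier.fit(texts, labels)

# At run time, map an incoming request to the most likely intent,
# falling back to a human agent when confidence is low.
request = "I think I was billed two times this month"
probabilities = classifier.predict_proba([request])[0]
best = probabilities.argmax()
intent, confidence = classifier.classes_[best], probabilities[best]
print(intent if confidence > 0.5 else "escalate_to_agent", confidence)
```

A linear model over TF-IDF features is only a stand-in here; whatever classifier is used, the client-facing questions are the same: how the intents partition the space of requests, whether a few utterances per intent cover the real variability, and what to do when confidence is low.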

How should the requests be divided into intents? What conditions should the groupings satisfy? How should the classifier's behavior be accounted for? How much training data is enough? Which variations should the training data cover? These are non-trivial questions, and investigating them touches on many fundamental issues in statistical classification. I will relate these questions to research topics in modeling natural language semantics, machine learning methods, and classification data complexity.

Further progress in adopting chatbots for business needs depends critically on how easy it is for clients to design the bots. This aligns with a grander challenge in AI: how can we best help humans transfer their knowledge to machines? I will describe some observations from our experience and review several tools in Watson Assistant that are designed to ease the process of intent building. Open challenges in this process call for more research; I will sketch some near-term and long-term goals for the scientific community to consider.

Speaker bio: Tin Kam Ho is a Senior AI Scientist at IBM Watson, where she leads projects in semantic modeling of natural languages, question answering, and conversational systems. From 1992 to 2014 she was at Bell Labs in Murray Hill, first as a research scientist and later as Head of the Statistics and Learning Research Department. There she contributed to a wide range of basic and applied research topics in pattern recognition, machine learning, data analysis, and computational modeling. She pioneered research in multiple classifier systems and ensemble learning, random decision forests, and data complexity analysis, and led a large effort on ultra-long-haul optical network modeling. She served as Editor-in-Chief of Pattern Recognition Letters from 2004 to 2010, and as Editor or Associate Editor for other journals, including IEEE Transactions on Pattern Analysis and Machine Intelligence, Pattern Recognition, and the International Journal on Document Analysis and Recognition. Her work has been honored with the Pierre Devijver Award in statistical pattern recognition, several IBM and Bell Labs awards, and a Young Scientist Award from the International Conference on Document Analysis and Recognition. She received her PhD in Computer Science from SUNY at Buffalo in 1992. She is a Fellow of the IAPR and the IEEE.