Keynote Speakers

Keynote Speaker 1

Nobuaki MINEMATSU

Full professor, Department of Electrical Engineering and Information Systems

Graduate School of Engineering, The University of Tokyo (UTokyo)

Biography

Nobuaki MINEMATSU earned his Doctor of Engineering degree from The University of Tokyo (UTokyo) in 1995, and since 2012 he has been a full professor there. From 2002 to 2003, he was a visiting researcher at the Royal Institute of Technology (KTH), Sweden. He has wide interests in speech communication, covering both speech science and speech engineering, with particular expertise and practical knowledge in Computer-Aided Language Learning (CALL). He developed the web-based Online Japanese Accent Dictionary (OJAD), the first and currently only teaching material for prosody control of Japanese word accent changes and sentence intonation generation. It has gained huge popularity and has been localized into 14 languages, and he has organized more than 140 OJAD tutorial workshops in 38 countries. He has published more than 450 journal and conference papers. He received paper awards from RISP, JSAI, ICIST, O-COCOSDA, and IEICE in 2005, 2007, 2011, 2014, and 2016, respectively, and an encouragement award from PSJ in 2014. He gave tutorial talks on CALL at APSIPA 2011, INTERSPEECH 2012, PAAL 2017, and CASTEL/J 2017, as well as many keynote and invited talks at other conferences. He was a distinguished lecturer of APSIPA from 2015 to 2016. He has made remarkable contributions to academic societies: he served as secretary of Speech Prosody 2004, secretary of INTERSPEECH 2010, co-organizer of SLaTE 2010 (L2 workshop 2010), and program chair of O-COCOSDA 2018. He is an ISCA board member and will serve as general chair of Speech Prosody 2020 in Tokyo. Domestically, he served as editorial chair of IEICE from 2014 to 2016 and as chair of SIG-SLP of IPSJ from 2016 to 2017, and has been a member of the PSJ council since 2016. He is a member of ISCA, SLaTE, IEEE, IPA, APSIPA, IEICE, IPSJ, ASJ, PSJ, JSAI, and LET.

Keynote speech title:

How can speech technologies support learners to improve their skills of speaking, listening, conversation and more?


Time: 9:40-10:40, October 3 (Thursday), 2019


Abstract:

In the era of globalization, not only students but also immigrant workers have to learn new languages for smooth oral communication. In this talk, the lecturer illustrates how speech technologies, e.g. speech synthesis, speech recognition, and voice conversion, can support learners in improving their skills of speaking, listening, conversation, and more. Text does not show its prosodic structure explicitly, and native speakers draw on implicit knowledge of prosodic control to read text aloud naturally. Such implicit knowledge is difficult for teachers to explain explicitly, and therefore prosody training is rare in classrooms. Text-to-speech systems often include a text-based prosody prediction module, and this module can be used effectively to teach learners, in an explicit way, the prosodic control required to read given texts aloud. In High Variability Phonetic Training (HVPT), teachers use speech stimuli that vary in age, gender, accent, background noise, etc. By being exposed to these variabilities, learners can acquire robust listening skills. However, teachers currently prepare those stimuli manually; by introducing speech analysis and voice conversion techniques, these variabilities can be enhanced easily. The talk explains an interesting example of adversarial training, originally used for machine learners and newly applied to human learners, and its effectiveness for acquiring robust listening skills. Further, the use of speech recognition technologies for shadowing assessment, aimed at improving the parallel processing skills needed for conversation, is described. In the lecturer's laboratory, a new project has started to realize a novel speech assessment framework in which not native-likeness but the comprehensibility of learners' speech is the main focus of assessment. The lecturer shows recently obtained results on the objective measurement of the comprehensibility of learners' speech.

Keynote Speaker 2

Yoshinobu KANO

Associate Professor, Faculty of Informatics, Shizuoka University

Biography

Dr. Yoshinobu Kano is an associate professor at the Faculty of Informatics, Shizuoka University, Japan. He received his BS in physics (2001), and his MSc (2003) and PhD (2011) in information science and technology, from the University of Tokyo. He served as a research associate at the University of Tokyo (2009), a JST PRESTO researcher (2011), and an associate professor (PI) at Shizuoka University (2014-). His major research areas include Natural Language Processing (NLP) and Artificial Intelligence. He is interested in human-like natural language processing and its applications to medical, legal, and conversational issues. He has received many awards, including the Paper Award (co-author) of the Association for Natural Language Processing for "Todai Robot Project: Error Analysis on Mock Exam Challenge" (2016), the Takayanagi Research Award (2015), the Best Paper Award at the International Workshop on Analytics Services on the Cloud (ASC workshop in ICSOC) (2012), and the IBM UIMA Innovation Award (2008). He is now working on the research topic "Beyond end-to-end learning: Dialog system, sentence generation, and conversation analysis for automatic mental disorder diagnosis".

Keynote speech title:

Beyond end-to-end learning: Dialog system, sentence generation, and conversation analysis for automatic mental disorder diagnosis


Time: 9:00-10:00, October 4 (Friday), 2019


Abstract:

End-to-end learning with deep learning techniques has been successfully applied to many research areas, including NLP. However, three critical aspects must be considered when designing approaches to specific problems: available data size, external knowledge requirements, and accountability. Medical NLP tends to suffer from all three. I first give an overview of a couple of related NLP projects that I organize, such as sentence generation, dialog systems, and legal NLP, to show what sort of issues arise regarding these three aspects. Then I introduce NLP applications in the medical/health fields, discussing research issues for current and future work: automatic diagnosis support for mental diseases and developmental disorders from conversation records, and a text mining system for EHRs (Electronic Health Records).