SLaTE 2019
8th Workshop on Speech and Language Technology in Education
A summary of the SLaTE 2019 schedule is presented immediately below; details of which presentations are included in each session follow further down the page. Note that oral presentation slots are 25 minutes, with a target of 20 minutes for the presentation and 5 minutes for discussion.
Day 1
8:00 - 8:30: Registration
8:30 - 8:45: Welcome remarks
8:45 - 9:45: Keynote Presentation (Martin Russell)
9:45 - 10:15: Coffee break
10:15 - 11:55: Paper Session I (Spoken CALL Shared Task)
11:55 - 13:00: Lunch (buffet lunch provided to SLaTE attendees)
13:00 - 14:15: Paper Session II
14:15 - 14:45: Coffee break
14:45 - 15:45: Demo Session
15:45 - 16:15: Sponsor presentations
16:15 - 17:00: SLaTE General Assembly and Panel Discussion
17:00 - 22:00: SLaTE banquet for all registered participants at the Buschenschank Dorner in the beautiful "Styrian Tuscany" wine region (bus transportation will be provided)
Day 2
8:30 - 9:45: Paper Session III
9:45 - 11:00: Coffee break and Poster Session
11:00 - 12:00: Keynote presentation (Dorothy Chun)
12:00 - 12:45: Lunch (buffet lunch provided to SLaTE attendees)
12:45 - 14:00: Paper Session IV
14:00 - 14:15: Coffee break
14:15 - 15:30: Paper Session V
15:30 - 15:45: Closing remarks
Martin Russell, Professor of Information Engineering, School of Computer Science, University of Birmingham
This talk is about understanding, from the perspective of speech science, how the DNNs used in ASR represent speech. I will begin with the results of experiments indicating that visualization of the patterns of activity in a low-dimensional “bottleneck” layer of a DNN can be interpreted in phonetic terms. I will then consider evidence that this interpretation is robust and useful, in that it is maintained in a topological sense across different versions of the same DNN trained on the same and different speech corpora. Next, I will consider how these results might be exploited to provide phonetically useful feedback in SLaTE applications, and I will propose a particular example from language acquisition in children.
In the final part of the talk I will briefly discuss how these results suggest that a particular type of mathematical object, called a topological manifold, might provide a new model of “acoustic speech space” and present the results from ASR experiments using a variant of the basic DNN-HMM structure that is inspired by these ideas.
Dorothy Chun, Professor of Education, University of California Santa Barbara
Claudia Baur, Andrew Caines, Cathy Chua, Johanna Gerlach, Mengjie Qian, Manny Rayner, Martin Russell, Helmer Strik and Xizi Wei. Overview of the 2019 Spoken CALL Shared Task.
Daniele Falavigna, Roberto Gretter and Marco Matassoni. The FBK system for the 2019 Spoken CALL Shared Task.
Mengjie Qian, Peter Jancovic and Martin Russell. The University of Birmingham 2019 Spoken CALL Shared Task Systems: Exploring the importance of word order in text processing.
Volodymyr Sokhatskyi, Olga Zvyeryeva, Ievgen Karaulov and Dmytro Tkanov. Embedding-based system for the Text part of CALL v3 shared task.
Adriana Guevara-Rukoz, Alexander Martin, Yutaka Yamauchi and Nobuaki Minematsu. Prototyping a web-based phonetic training game to improve /r/-/l/ identification by Japanese learners of English.
Lei Chen, Qianyong Gao, Qiubing Liang, Jiahong Yuan and Yang Liu. Automatic Scoring Minimal-Pair Pronunciation Drills by Using Recognition Likelihood Scores and Phonological Features.
Aparna Srinivasan, Chiranjeevi Yarra and Prasanta Kumar Ghosh. Automatic assessment of pronunciation and its dependent factors by exploring their interdependencies using DNN and LSTM.
Chiranjeevi Yarra and Prasanta Kumar Ghosh. voisTUTOR: Virtual Operator for Interactive Spoken English TUTORing.
Elham Akhlaghi Baghoojari, Branislav Bédi, Matthias Butterweck, Cathy Chua, Johanna Gerlach, Hanieh Habibi, Junta Ikeda, Manny Rayner, Sabina Sestigiani and Ghil'ad Zuckermann. Demonstration of LARA: A Learning and Reading Assistant.
Ralph Rose. Fluidity: Developing second language fluency with real-time feedback during speech practice.
Gary Yeung, Alison L. Bailey, Amber Afshan, Morgan Tinkler, Marlen Q. Pérez, Alejandra Martin, Anahit A. Pogossian, Samuel Spaulding, Hae Won Park, Manushaqe Muco, Abeer Alwan and Cynthia Breazeal. A robotic interface for the administration of language, literacy, and speech pathology assessments for children.
Zhenchao Lin, Yusuke Inoue, Tasavat Trisitichoke, Shintaro Ando, Daisuke Saito and Nobuaki Minematsu. Native Listeners' Shadowing of Non-native Utterances as Spoken Annotation Representing Comprehensibility of the Utterances.
Wei Xue, Catia Cucchiarini, Roeland van Hout and Helmer Strik. Acoustic correlates of speech intelligibility: the usability of the eGeMAPS feature set for atypical speech.
Johanna Dobbriner and Oliver Jokisch. Implementing and evaluating methods of dialect classification on read and spontaneous German speech.
Jorge Proença, Ganna Raboshchuk, Ângela Costa, Paula Lopez-Otero and Xavier Anguera. Teaching American English pronunciation using a TTS service.
Fred Richardson, John Steinberg, Gordon Vidaver, Steve Feinstein, Ray Budd, Jennifer Melot, Paul Gatewood and Douglas Jones. Corpora Design and Score Calibration for Text Dependent Pronunciation Proficiency Recognition.
Sweekar Sudhakara, Manoj Kumar Ramanathi, Chiranjeevi Yarra, Anurag Das and Prasanta Kumar Ghosh. Noise robust goodness of pronunciation (GoP) measures using teacher's utterance.
Yiting Lu, Katherine Knill, Mark Gales, Potsawee Manakul and Yu Wang. Disfluency Detection for Spoken Learner English.
Chiranjeevi Yarra, Manoj Kumar Ramanathi and Prasanta Kumar Ghosh. Comparison of automatic syllable stress detection quality with time-aligned boundaries and context dependencies.
Ray Budd, Tamas Marius, Doug Jones and Paul Gatewood. Using K-Means in SVR-Based Text Difficulty Estimation.
Prasanna Kothalkar, Dwight Irvin, Ying Luo, Joanne Rojas, John Nash, Beth Rous and John Hansen. Tagging child-adult interactions in naturalistic, noisy, daylong school environments using i-vector based diarization system.
Neasa Ní Chiaráin and Ailbhe Ní Chasaide. An Scéalaí: autonomous learners harnessing speech and language technologies.
Elham Akhlaghi Baghoojari, Branislav Bédi, Matt Butterweck, Cathy Chua, Johanna Gerlach, Hanieh Habibi, Junta Ikeda, Manny Rayner, Sabina Sestigiani and Ghil'ad Zuckermann. Overview of LARA: A Learning and Reading Assistant.
Erika Godde, Gerard Bailly and Marie-Line Bosse. Reading Prosody Development: Automatic Assessment for a Longitudinal Study.
Mohamed El Hajji, Morgane Daniel and Lucile Gelin. Transfer Learning based Audio Classification for a noisy and speechless recordings detection task, in a classroom context.
Helmer Strik, Anna Ovchinnikova, Camilla Giannini, Angela Pantazi and Catia Cucchiarini. Student’s acceptance of MySpeechTrainer to improve spoken academic English.
Satoshi Kobashikawa, Atsushi Odakura, Takao Nakamura, Takeshi Mori, Kimitaka Endo, Takafumi Moriya, Ryo Masumura, Yushi Aono and Nobuaki Minematsu. Does Speaking Training Application with Speech Recognition Motivate Junior High School Students in Actual Classroom? -- A Case Study.