Invited Speakers

Hynek Hermansky

Professor Emeritus,

Department of Electrical and Computer Engineering, Johns Hopkins University, USA.

Hynek Hermansky received an M.S. in Electrical Engineering (1972) from Technical University Brno, Czech Republic, and a Ph.D. in Electrical Engineering (1983) from the University of Tokyo, Japan. He has been at the forefront of groundbreaking research in human hearing and speech technology for more than three decades, both in industrial research labs and in academia. Hermansky's main focus is on using bio-inspired methods to recognize information in speech-related signals. He currently holds the position of Research Professor at the Brno University of Technology in the Czech Republic. He is Julian S. Smith Professor Emeritus at Johns Hopkins University, where for ten years he led an internationally acclaimed group of Johns Hopkins faculty, students, and visiting researchers at the Center for Language and Speech Processing (CLSP), one of the largest and most prestigious speech- and language-oriented academic groups in the world. His past positions include director of research at the IDIAP Research Institute, Martigny, Switzerland (2003-2008); titular professor at the Swiss Federal Institute of Technology in Lausanne, Switzerland (2005-2008); professor at the Oregon Health and Science University (previously the Oregon Graduate Institute); senior member of the research staff at U.S. WEST Advanced Technologies in Boulder, CO; and research engineer at Panasonic Technologies in Santa Barbara, California.

His achievements include more than 300 peer-reviewed papers with more than 20,000 citations, and 13 patents, with another eight applications pending on topics such as a method for identifying keywords in machine recognition of speech based on the detection and classification of sparse speech sound events; a system for speech recognition on cell phones; and an auditory model to detect speech corrupted by background noise. Hermansky's scientific contributions were recognized by the Institute of Electrical and Electronics Engineers (IEEE), which awarded him the 2021 James L. Flanagan Speech and Audio Processing Medal, and by the International Speech Communication Association (ISCA), which awarded him its highest honor, the Medal for Scientific Achievement, in 2013. Hermansky's service to the field is extensive and noteworthy. He is a Life Fellow of the IEEE, a Fellow of ISCA, and an External Fellow of the International Computer Science Institute. Highly sought after by industry for his expertise, he is a current member of the advisory board for Germany's Hearing4All Scientific Consortium Center of Excellence in Hearing Research, and he has served on advisory boards for Amazon, Audience, Inc., and VoiceBox Inc. His professional memberships include the IEEE and ISCA, where he was twice elected as a board member. He is a member of the editorial board of Speech Communication, was an associate editor for the IEEE Transactions on Speech and Audio Processing, and is a former member of the editorial board of Phonetica. Hermansky serves in leadership roles for the field's key workshops and conferences, presents invited lectures and keynote presentations around the globe, and has lectured worldwide as a Distinguished Lecturer for ISCA and for IEEE. He was the General Chair of INTERSPEECH 2021 in Brno, Czech Republic, a General Chair of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), and chair of the technical committee for ICASSP 2000. In addition to leading several of Hopkins' CLSP workshops, he served on the organizing committees for ASRU 2017, ASRU 2013, and ASRU 2005, for ten years was the executive chair of the annual ISCA-sponsored workshops on Text, Speech, and Dialogue in the Czech Republic, and was a tutorial speaker at INTERSPEECH 2015.

Bhuvana Ramabhadran

Speech Recognition Researcher,

Google Research, USA.

Bhuvana Ramabhadran received her Ph.D. degree in electrical engineering from the University of Houston. Currently, she leads a team of researchers at Google focusing on semi-supervised learning for speech recognition and multilingual speech recognition. Previously, she was a Distinguished Research Staff Member and Manager in IBM Research AI at the IBM T. J. Watson Research Center, Yorktown Heights, NY, USA, where she led a team of researchers in the Speech Technologies Group and coordinated activities across IBM's worldwide laboratories in the areas of speech recognition, synthesis, and spoken term detection. She served as the principal investigator on two major international projects, the National Science Foundation (NSF)-sponsored Multilingual Access to Large Spoken Archives (MALACH) project and the European Union (EU)-sponsored TC-STAR project, and was the lead at IBM for the Spoken Term Detection evaluation in 2006. She was responsible for acoustic and language modeling research for both commercial and government projects, ranging from voice search and transcription tasks to spoken term detection in multiple languages and expressive synthesis for IBM Watson. She has served for two terms since 2010 as an elected member of the IEEE Signal Processing Society (SPS) Speech and Language Technical Committee (SLTC) (2015-2017), as its elected Vice Chair and Chair (2014-2016), and currently serves as an Advisory Member. She has served as Area Chair for ICASSP (2011-2018) and INTERSPEECH (2012, 2014-2016), on the editorial board of the IEEE Transactions on Audio, Speech, and Language Processing (2011-2015), and on the IEEE SPS conference board (2017-2018), during which she also served as the conference board's liaison with the ICASSP organizing committees, and as Regional Director-at-Large (2018-2020), where she coordinated work across all US IEEE chapters. She also organized the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) in 2011.

She currently serves as Chair of the IEEE Flanagan Speech & Audio Award Committee and as a Member-at-Large of the IEEE SPS Board of Governors (BoG). She serves on the International Speech Communication Association (ISCA) board in her capacity as ISCA Vice President (2023-2025). In addition to organizing several workshops at the International Conference on Machine Learning (ICML), HLT-NAACL, and Neural Information Processing Systems (NIPS), she has served as an adjunct professor at Columbia University, where she co-taught a graduate course on speech recognition. She has served as the (co-)principal investigator on several projects funded by the NSF, the EU, and the Intelligence Advanced Research Projects Activity (IARPA), spanning speech recognition, information retrieval from spoken archives, and keyword spotting in many languages. She has published over 150 papers and been granted over 40 U.S. patents. Her research interests include speech recognition and synthesis algorithms, statistical modeling, signal processing, and machine learning. Some of her recent work has focused on the use of speech synthesis to improve core speech recognition performance and on self-supervised learning. She is a Fellow of the IEEE and a Fellow of ISCA.

Mathew Magimai Doss

Senior Researcher,

Idiap Research Institute, Martigny, Switzerland.

Dr. Mathew Magimai Doss received the Bachelor of Engineering (B.E.) in Instrumentation and Control Engineering from the University of Madras, India, in 1996; the Master of Science (M.S.) by Research in Computer Science and Engineering from the Indian Institute of Technology, Madras, India, in 1999; and the Pre-Doctoral diploma and the Docteur ès Sciences (Ph.D.) from the Ecole polytechnique fédérale de Lausanne (EPFL), Switzerland, in 2000 and 2005, respectively. He was a postdoctoral fellow at the International Computer Science Institute (ICSI), Berkeley, USA, from April 2006 to March 2007. He is now a Senior Researcher at the Idiap Research Institute, Martigny, Switzerland, and a lecturer at EPFL. His main research interests lie in signal processing, statistical pattern recognition, artificial neural networks, and computational linguistics, with applications to speech and audio processing, sign language processing, and multimodal signal processing. He is a member of IEEE, ISCA, and Sigma Xi. He is an Associate Editor of the IEEE/ACM Transactions on Audio, Speech, and Language Processing. He is the coordinator of the SNSF Sinergia project SMILE-II, which focuses on continuous sign language recognition, generation, and assessment, and was the coordinator of the recently completed H2020 Marie Sklodowska-Curie Actions ITN-ETN project TAPAS, which focused on pathological speech processing. He has published over 180 journal and conference papers. The Speech Communication paper "End-to-end acoustic modeling using Convolutional Neural Networks for HMM-based Automatic Speech Recognition," co-authored by him and published in 2019, received the 2023 EURASIP Best Paper Award for the Speech Communication journal and the ISCA Award for the Best Paper published in Speech Communication (2017-2021). The INTERSPEECH 2015 paper "Objective Intelligibility Assessment of Text-to-Speech Systems through Utterance Verification" received one of the three best student paper awards.

Chng Eng Siong

Associate Professor, 

Nanyang Technological University (NTU), Singapore.

Dr. Chng Eng Siong is currently an Associate Professor in the School of Computer Science and Engineering (SCSE) at Nanyang Technological University (NTU), Singapore. Prior to joining NTU in 2003, he worked at Knowles Electronics (USA), Lernout and Hauspie (Belgium), the Institute for Infocomm Research (I2R) in Singapore, and RIKEN in Japan. He received a Ph.D. and a B.Eng. (Hons) from the University of Edinburgh, U.K., in 1996 and 1991, respectively, specializing in digital signal processing. His areas of expertise include large language models (LLMs), speech research, machine learning, and speech enhancement.

He currently serves as the Principal Investigator (PI) of the AI Singapore Speech Lab (2023-2025). Throughout his career, he has secured research grants totaling over S$12 million from various sources, including NTU-Rolls-Royce, MINDEF, MOE, and A*STAR, awarded under the "Speech and Language Technology Program (SLTP)" in the School of Computer Science and Engineering (SCSE) at NTU. In recognition of his expertise, he was awarded the Tan Chin Tuan Fellowship in 2007 to conduct research at Tsinghua University in Prof. Fang Zheng's lab, and he received a JSPS travel grant award in 2008 to visit Prof. Furui's lab at the Tokyo Institute of Technology.

He has supervised over 15 Ph.D. students and 8 Master's students to graduation. His publication record includes 2 edited books and over 200 journal and conference papers. He has also contributed to the academic community by serving as publication chair for five international conferences: Human Agent Interaction 2016, INTERSPEECH 2014, APSIPA 2010, APSIPA 2011, and ISCSLP 2006. Furthermore, he has played an integral role in the local organizing committees for ASRU 2019 and SLT 2024.

Bayya Yegnanarayana

INSA Senior Scientist,

International Institute of Information Technology (IIIT), Hyderabad, India.

Dr. Bayya Yegnanarayana is currently an INSA Senior Scientist at IIIT Hyderabad. He was Professor Emeritus at the BITS-Pilani Hyderabad Campus during 2016. He was an Institute Professor from 2012 to 2016 and Professor & Microsoft Chair from 2006 to 2012 at the International Institute of Information Technology (IIIT) Hyderabad. He was a professor at IIT Madras (1980 to 2006), a visiting associate professor at Carnegie Mellon University, Pittsburgh, USA (1977 to 1980), and a member of the faculty at the Indian Institute of Science (IISc), Bangalore (1966 to 1978). He received his BSc from Andhra University in 1961, and his BE, ME, and PhD from IISc Bangalore in 1964, 1966, and 1974, respectively. His research interests are in signal processing, speech, image processing, and neural networks. He has published over 400 papers in these areas. He is the author of the book "Artificial Neural Networks," published by Prentice-Hall of India in 1999. He has supervised 34 PhD and 42 MS theses. He is a Fellow of the Indian National Academy of Engineering (INAE), a Fellow of the Indian National Science Academy (INSA), a Fellow of the Indian Academy of Sciences (IASc), a Fellow of the IEEE (USA), and a Fellow of the International Speech Communication Association (ISCA). He received the 3rd IETE Prof. S. V. C. Aiya Memorial Award in 1996, the Prof. S. N. Mitra Memorial Award for 2006 from INAE, the 2013 Distinguished Alumnus Award from IISc Bangalore, the Sayed Husain Zaheer Medal (2014) of INSA, and the Prof. Rais Ahmed Memorial Lecture Award from the Acoustical Society of India in 2016. He was an Associate Editor for the IEEE Transactions on Audio, Speech, and Language Processing during 2003-2006. He received a Doctor of Science (Honoris Causa) from Jawaharlal Nehru Technological University Anantapur in February 2019. He was the General Chair for INTERSPEECH 2018, held in Hyderabad, India, in September 2018.

C. V. Jawahar

Dean of Research and Development

International Institute of Information Technology, Hyderabad

Prof. C. V. Jawahar is a professor and Head of the Centre for Visual Information Technology (CVIT) at the International Institute of Information Technology, Hyderabad (IIITH), India, where he leads a research group focusing on computer vision, machine learning, and multimedia systems. In recent years, he has been actively involved in research questions in computer vision with an emphasis on mobility, healthcare, and Indian language computing. He is also interested in large-scale multimedia systems with a special focus on assistive technology solutions. Prof. Jawahar is an elected Fellow of the Indian National Academy of Engineering (INAE) and the International Association for Pattern Recognition (IAPR). His research is globally recognized in the artificial intelligence and computer vision community, with more than 200 publications in top-tier conferences and journals in computer vision, robotics, and document image processing, and over 18,000 citations. He received the ACM India Outstanding Contribution to Computing Education (OCCE) Award in 2021. He is actively engaged with several government agencies, ministries, and leading companies on innovating at scale through research.

Sriram Ganapathy

Associate Professor of Electrical Engineering, Indian Institute of Science, Bangalore

& Google Research, Bangalore, India

Sriram Ganapathy (Senior Member, IEEE) received the Bachelor of Technology degree from the College of Engineering, Trivandrum, India, the Master of Engineering degree from the Indian Institute of Science, Bangalore, India, and the Doctor of Philosophy degree from the Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, USA. He is currently an Associate Professor in the Department of Electrical Engineering, Indian Institute of Science, where he heads the Learning and Extraction of Acoustic Patterns (LEAP) Lab. He is also associated with Google Research India, Bangalore. Prior to joining the Indian Institute of Science, he was a Research Staff Member at the IBM T. J. Watson Research Center, Yorktown Heights, NY, USA. From 2006 to 2008, he was a Research Assistant at the Idiap Research Institute, Martigny, Switzerland. His research interests include signal processing, machine learning methodologies for speech and speaker recognition, and auditory neuroscience. He is a Subject Editor of the Speech Communication journal.

Preethi Jyothi

Associate Professor of Computer Science and Engineering, 

Indian Institute of Technology Bombay 

Preethi Jyothi is an Associate Professor in the CSE Department at IIT Bombay. She was a Beckman Postdoctoral Fellow at the University of Illinois at Urbana-Champaign from 2013 to 2016. She received her Ph.D. in computer science from The Ohio State University and a B.Tech. from the National Institute of Technology, Calicut, in 2006, where she was awarded the gold medal for being the top graduating student in computer science. Her research interests are broadly in the areas of machine learning as applied to speech and the interaction between speech and text. Her Ph.D. thesis dealt with statistical learning methods for pronunciation models; her work on this topic received a Best Student Paper Award at INTERSPEECH 2012. She co-organized a research project on probabilistic transcriptions at the 2015 Jelinek Summer Workshop on Speech and Language Technology, for which her team received a Speech and Language Processing Student Paper Award at ICASSP 2016. Since joining IIT Bombay, she has been awarded a Google Faculty Research Award (2017) for her proposal on accented speech recognition, and she led a team that received the "Best Project" award at Microsoft Research India's Summer Workshop on Artificial Social Intelligence in 2017. She currently serves on the ISCA SIGML board and is a member of the editorial board of Computer Speech and Language, Elsevier.

Aparna Walanj

Senior Manager, Medical Research Department,

Kokilaben Dhirubhai Ambani Hospitals and Research Center, Mumbai. 

Aparna Walanj holds MBBS, DCH, PGDCR, and PGDBA degrees. Presently, she is a Senior Manager in the Medical Research Department at Kokilaben Dhirubhai Ambani Hospitals and Research Center, Mumbai, and a visiting faculty member at several clinical research institutes in Mumbai. She has 15 years of clinical research experience, supervising and assisting in the conduct of clinical research studies in various organizations, such as HCAH, Sarathi, Unichem Labs, Ethika Clinical Research, and Sapphire Hospitals. She supports the training and guidance of research coordinators on research processes, guidelines, and regulations; reviews patient documents and approves patients per the set eligibility criteria of research programs; and coordinates with consultants and research experts from CROs and sponsors conducting domestic and global clinical studies. She has expertise in developing Investigator Site and Ethics Committee SOPs, and has conducted trainings and workshops on ICH GCP for investigators and research site staff. She supervises the research team for all activities from study site feasibility to study close-out, and oversees quality audits, sponsor visits, and various accreditation audits related to clinical research. She is a member of the Indian Society for Clinical Research and of the Rotary Club of Thane Green City (RCTGC), where she has served in various capacities, such as President, Secretary, E-Administrator, and Environment Director.

Hemant A. Patil

Professor,

DA-IICT Gandhinagar, India.

Hemant A. Patil received a Ph.D. degree from the Indian Institute of Technology (IIT), Kharagpur, India, in July 2006. Since 14 February 2007, he has been a faculty member at DA-IICT Gandhinagar, India, where he developed the Speech Research Lab, recognized as one of the ISCA Speech Labs. He has published or submitted more than 320 research papers in international conferences, journals, and book chapters. He visited the Department of ECE, University of Minnesota, Minneapolis, USA, as a short-term scholar (May-July 2009). He has been associated (as PI) with three MeitY-sponsored projects in ASR, TTS, and QbE-STD, and was co-PI for the DST-sponsored project on India Digital Heritage (IDH)-Hampi. His research interests include speech and speaker recognition, analysis of spoofing attacks, audio deepfake detection, TTS, and assistive speech technologies, such as infant cry and dysarthric speech classification and recognition. He received the DST Fast Track Award for Young Scientists for infant cry analysis. He has co-edited four books with Dr. Amy Neustein (EIC, IJST, Springer): Forensic Speaker Recognition (Springer, 2011), Signal and Acoustic Modeling for Speech and Communication Disorders (De Gruyter, 2018), Voice Technologies for Speech Reconstruction and Enhancement (De Gruyter, 2020), and Acoustic Analysis of Pathologies from Infant to Young Adulthood (De Gruyter, 2020). Recently, he was selected as an Associate Editor for the IEEE Signal Processing Magazine (2021-2023). Prof. Patil has also served as a PRSG member for three MeitY-sponsored projects, namely, "Speech-to-Speech Translation & Performance Measurement Platform for Broadcast Speeches and Talks (e.g., Mann Ki Baat)", "Indian Languages Speech Resources Development for Speech Applications", and "Integration of 13 Indian Languages TTS Systems with Screen Readers for Windows, Linux, and Android Platforms".

Dr. Patil has taken a lead role in organizing several ISCA-supported events at DA-IICT, such as summer/winter schools and CEP workshops. He has supervised 8 doctoral and 56 M.Tech. theses (all in the speech processing area) and is presently mentoring one doctoral scholar and one M.Tech. student. He is also co-supervising undergraduate and master's students as part of the Samsung PRISM program at DA-IICT. He offered joint tutorials with Prof. Haizhou Li (IEEE Fellow and ISCA Fellow) during APSIPA ASC 2017 and INTERSPEECH 2018, and a joint tutorial with Prof. Hideki Kawahara (IEEE Fellow and ISCA Fellow) on the topic "Voice Conversion: Challenges and Opportunities" during APSIPA ASC 2018, Honolulu, USA. He spent his sabbatical leave at the Samsung R&D Institute, Bengaluru, from May 2019 to August 2019. He was selected as an APSIPA Distinguished Lecturer (DL) for 2018-2019 and has delivered 25+ APSIPA DLs in four countries, namely, India, Singapore, China, and Canada. He was also selected as an ISCA Distinguished Lecturer for 2020-2022 and has delivered 28+ ISCA DLs in India, the USA, and Malaysia.