PUBLICATIONS

Reports in proceedings of international conferences:

Esaú Villatoro-Tello, Srikanth Madikeri, Juan Zuluaga-Gomez, Bidisha Sharma, Seyyed Saeed Sarfjoo, Iuliia Nigmatulina, Petr Motlicek, Alexei V. Ivanov, Aravind Ganapathiraju, "Effectiveness of Text, Acoustic, and Lattice-based representations in Spoken Language Understanding tasks", Proc. of ICASSP'2023, The Int. Conf. on Acoustics, Speech and Signal Processing, June 4-10, 2023, Rhodes, Greece.

Esaú Villatoro-Tello, Srikanth Madikeri, Petr Motlicek, Aravind Ganapathiraju and Alexei V. Ivanov, “Expanded Lattice Embeddings for Spoken Document Retrieval on Informal Meetings”, Proc. of SIGIR’2022, The 45th Int. ACM Conf. on Research and Development in Information Retrieval, July 11-15, 2022, Madrid, Spain.

Alexei V. Ivanov, Kerrick Lindsey, Leo Rub, Marc Ferguson, Scott Plude, Sharath Y. Puttaswamy, Jim Chen, Eswar R. Kadireddy, Shanmugam Vasudevan, Nandan H. Shankaramurth, Christopher K. Wolf, Raaj Prasad, Jim Steele, "Tensorflow-Based Ultra Low-Power Real-Time Unlimited Vocabulary Transcription System", TinyML Summit, Advances in Ultra-Low Power Machine Learning Technologies and Applications, March 20-21, 2019, Sunnyvale, CA, USA.

Y. Qian, J. Tao, D. Suendermann-Oeft, K. Evanini, A. V. Ivanov, V. Ramanarayanan, “Noise and Metadata Sensitive Bottleneck Features for Improving Speaker Recognition with Non-native Speech Input”, Proc. of Interspeech'2016, International Conference, September 8-12, 2016.

V. Ramanarayanan, D. Suendermann-Oeft, P. Lange, R. Mundkowsky, A. V. Ivanov, Zhou Yu, Yao Qian and K. Evanini, "Development of an Audiovisual Database of Human-Machine Conversations for Educational Learning and Assessment Applications", in Proc. of Workshop on Computational Models for Learning Systems and Educational Assessment (CMLA-2016), Las Vegas, June 26, 2016.

A.V. Ivanov, P.L. Lange, D. Suendermann-Oeft, "LVCSR System on a Hybrid GPU-CPU Embedded Platform for Real-Time Dialog Applications", Proc. of 17th Annual SIGdial Meeting on Discourse and Dialogue (SIGDial'2016), September 13-15, 2016, Los Angeles, CA, USA.

A.V. Ivanov, P.L. Lange, D. Suendermann-Oeft, "Serving Multiple Concurrent Interactive Sessions with a GPU-based Speech Recognition System", Proc. of GTC'2016, GPU Technology Conference, April 4-7, 2016, San Jose, California, USA.

A.V. Ivanov, P.L. Lange, D. Suendermann-Oeft, "A GPU-Based Cloud Speech Recognition Server For Dialog Applications", Proc. of GTC'2016, GPU Technology Conference, April 4-7, 2016, San Jose, California, USA.

Z. Yu, V. Ramanarayanan, R. Mundkowsky, P.L. Lange, A. W. Black, D. Suendermann-Oeft and A.V. Ivanov, "Multimodal HALEF: An Open-Source Modular Web-Based Multimodal Dialog Framework", Proc. of the 7th International Workshop on Spoken Dialogue Systems, IWSDS 2016, January 12-16, 2016, Saariselkä, Finland.

A.V. Ivanov, P.L. Lange, D. Suendermann-Oeft, V. Ramanarayanan, Y. Qian, Z. Yu, J. Tao, "Speed vs. Accuracy: Designing an Optimal ASR System for Spontaneous Non-Native Speech in a Real-Time Application", Proc. of the 7th International Workshop on Spoken Dialogue Systems, IWSDS 2016, January, 12-16, 2016, Saariselkä, Finland

V. Ramanarayanan, Z. Yu, R. Mundkowsky, P. Lange, A.V. Ivanov, A. W. Black and D. Suendermann-Oeft, "A Modular Open-Source Standard-Compliant Dialog System Framework With Video Support", Proc. of ASRU'2015, December, 13-17, 2015, Scottsdale, AZ

A. V. Ivanov, P. L. Lange and D. Suendermann-Oeft, "Fast and Power Efficient Hardware-Accelerated Cloud-Based ASR for Remote Dialog Applications", Proc. of ASRU'2015, December, 13-17, 2015, Scottsdale, AZ

Z. Yu, V. Ramanarayanan, D. Suendermann-Oeft, X. Wang, K. Zechner, Lei Chen, J. Tao, Y. Qian, A. V. Ivanov, "Using Bidirectional LSTM Recurrent Neural Networks to Learn High-Level Abstractions of Sequential Features for Automated Scoring of Non-Native Spontaneous Speech", Proc. of ASRU'2015, December, 13-17, 2015, Scottsdale, AZ

A. Loukina, M. Lopez, K. Evanini, D. Suendermann-Oeft, A.V. Ivanov and K. Zechner, "Pronunciation Accuracy and Intelligibility of Non-Native Speech", Proc. of Interspeech'2015, International Conference, September 6-10, 2015.

V. Ramanarayanan, D. Suendermann-Oeft, A. V. Ivanov and K. Evanini, "A Distributed Cloud-Based Dialog System for Conversational Application Development", Proc. of 16th Annual SIGdial Meeting on Discourse and Dialogue (SIGDial'2015), September 2-4, 2015, Prague, Czech Republic.

A.V. Ivanov, V. Ramanarayanan, D. Suendermann-Oeft, M. Lopez, K. Evanini and J. Tao, "Automated Speech Recognition Technology for Dialogue Interaction with Non-Native Interlocutors", Proc. of 16th Annual SIGdial Meeting on Discourse and Dialogue (SIGDial'2015), September 2-4, 2015, Prague, Czech Republic.

A. Ivanov, "Speech Recognition on GPUs with Open-Source Models: Faster, Better, Cheaper", Proc. of GTC'2015, GPU Technology Conference, March 17-20, 2015, San Jose, California, USA. (poster)

A. Ivanov, "Memory-Efficient Heterogeneous Speech Recognition Hybrid in the GPU-Equipped Mobile Devices", Proc. of GTC'2015, GPU Technology Conference, March 17-20, 2015, San Jose, California, USA. (slides)

Bernstein, J., Ivanov, A.V., Rosenfeld, E., "Benchmarking Automated Text Correction Services", Proc. NLPCS 2014 : 11th International Workshop on Natural Language Processing and Cognitive Science, 27-29th October 2014, Venice, Italy.

A. Ivanov, F. Brugnara, “Making It Fast and Reliable: Speech Recognition with GPUs by Sequential Utilization of Available Knowledge Sources”, Proc. of GTC'2014, GPU Technology Conference, March 24-27, 2014, San Jose, California, USA. (slides)

A. Ivanov, S. Jalalvand, R. Gretter, D. Falavigna, "Phonetic and Anthropometric Conditioning of MSA-KST Cognitive Impairment Characterization System", Proc. of ASRU'2013, Automatic Speech Recognition and Understanding Workshop, Dec. 8-12, 2013, Olomouc, Czech Republic.

A. Ivanov, X. Chen, "Modulation Spectrum Analysis for Speaker Personality Trait Recognition", Proc. of Interspeech'2012, Int. Conf., Sept. 9-13, 2012, Portland, OR, USA.

A. V. Ivanov, G. Riccardi, "Kolmogorov-Smirnov Test for Feature Selection in Emotion Recognition from Speech", Proc. of ICASSP'2012, Kyoto, Japan, March 2012.

A. V. Ivanov, G. Riccardi, A. J. Sporka, J. Franc, "Recognition of Personality Traits from Human Spoken Conversations", Proc. Interspeech'2011, International Conference, 25-31, August, 2011, Florence, Italy.

B. Ludwig, M. Hacker, R. Schaller, B. Zenker, A. V. Ivanov and G. Riccardi, "Tell Me Your Needs: Assistance for Public Transport Users". Proc. of EICS'2011, Pisa, Italy, June 2011.

S. Quarteroni, A. V. Ivanov and G. Riccardi, "Simultaneous Dialog Act Segmentation and Classification from Human-Human Spoken Conversations". Proc. of ICASSP'2011, Prague, Czech Rep., May 2011.

S. Varges, S. Quarteroni, G. Riccardi and A. V. Ivanov, "POMDP Concept Policies and Task Structures for Hybrid Dialog Management". Proc. of ICASSP'2011, Prague, Czech Rep., May 2011.

S. Varges, S. Quarteroni, G. Riccardi and A. V. Ivanov, "Investigating Clarification Strategies in a Hybrid POMDP Dialog Manager". Proc. of SIGDial 2010, Tokyo, Japan, September 2010.

A. V. Ivanov, G. Riccardi, "Automatic Turn Segmentation in Spoken Conversations", Proc. Interspeech'2010, International Conference, 26-30, September, 2010, Makuhari, Japan.

A. V. Ivanov, G. Riccardi, S. Ghosh, S. Tonelli, E. Stepanov, "Acoustic Correlates of Meaning Structure in Conversational Speech", Proc. Interspeech'2010, International Conference, 26-30, September, 2010, Makuhari, Japan.

Ivanov A., Petrovsky A. Attentive Signal Processing For Detection Of Sinusoids In Noise // Signal Processing: Algorithms, Architectures, Arrangements, and Applications: Proc. of IEEE Conference, Poznan, Poland, 25 September 2009.

S. Varges, G. Riccardi, A. Ivanov, S. Quarteroni, P. Roberti, On-line Strategy Computation in Spoken Dialog Systems. ICASSP Show'n'Tell, Taipei, Taiwan, April 2009

S. Varges, S. Quarteroni, G. Riccardi, A. V. Ivanov, P. Roberti, Combining POMDPs trained with User Simulations and Rule-based Dialogue Management in a Spoken Dialogue System. Demo paper at: ACL-IJCNLP'09, Singapore, 2009.

S. Varges, G. Riccardi, S. Quarteroni and A. Ivanov, Leveraging POMDPs trained with User Simulations and Rule-based Dialogue Management in a Spoken Dialogue System. Demo paper at: SIGDIAL'09, London, UK.

S. Varges, G. Riccardi, S. Quarteroni and A. V. Ivanov, The Exploration/Exploitation Trade-off in POMDPs for Dialogue Management. ASRU'09, Merano, Italy, 2009.

А.В. Иванов, А.А. Петровский, “Оценка вероятности отдельных символов наблюдения в импульсном отклике модели аудиторного нейрона”, Труды научной конференции МФТИ “Современные проблемы фундаментальных и прикладных наук”, стр. 16-19, 24-25 ноября 2006 г., Москва-Долгопрудный, Россия

Ivanov A.V., Petrovsky A.A. Probability Estimation of Observation Symbols in Impulse Response of Auditory Neuron // Collection of Articles from the annual MIPT scientific conference "Present Fundamental and Applied Problems", pp.16-19, November 24-25, 2006, Moscow (Dolgoprudny) Russia

Ivanov, A.V., Petrovsky, A. A., “First-Order Markov Property of the Auditory Spiking Neuron Model Response”, Proc. of 14th European Signal Processing Conference, EUSIPCO'2006, Florence, Italy, September, 6-8, 2006, 5 p.

Ivanov, A.V., Petrovsky, A. A., “Markov Coding Strategy of the Simple Spiking Model of Auditory Neuron”, Proc. of World Congress on Computational Intelligence, WCCI'2006, pp. 8351-8358, July, 17-21, 2006, Vancouver, Canada

Ivanov, A.V., Petrovsky, A. A., Neuromorphic audio processing: A model simulation of the way auditory neurons encode signals // Speech and Computer: Proc. 10th International Conference on / SPECOM'2005, pp. 645-648, October, 17-19, 2005, Patras, Greece

Ivanov, A.V., Parfieniuk, M., Petrovsky, A.A. Frequency-Domain Auditory Suppression Modelling (FASM): A WDFT-Based Anthropomorphic Noise-Robust Feature Extraction Algorithm for Speech Recognition // 9th European Conference on Speech Communication and Technology / Interspeech'2005 (Eurospeech), pp. 713-716, September, 4-8, 2005, Lisbon, Portugal

Иванов А.В., Петровский А.А. Моделирование аудиторной суппрессии в частотной области на основе СДПФ для выделения признаков распознавателей речи повышенной эффективности в условиях шумов // Digital Signal Processing & Applications, Int. Conf. on / DSPA'2005, Москва (ИПУ РАН), Россия, 16-18 марта 2005. - C.475-496.

Ivanov A.V., Petrovsky A.A. Auditory Suppression Modeling with WDFT for High Performance Speech Recognition Feature Extraction in Noisy Conditions // Digital Signal Processing & Applications, Int. Conf. on / DSPA'2005, Moscow (IPU RAN), Russia, 16-18 March 2005. - P.475-496.

Ivanov A.V., Petrovsky A.A. Anthropomorphic feature extraction algorithm for speech recognition in adverse environments // Speech and Computer: Proc. 9th International Conference on / SPECOM'2004, St. Petersburg, Russia, 20-22 September 2004. - P.17-21.

Ivanov A., Petrovsky A. Auditory Models for Robust Feature Extraction: Suppression // Signal Processing: Proc. of IEEE Workshop, Poznan, Poland, 10 October 2003. - P.23-28.

Ivanov A.V., Likhachev D.S., Petrovsky A.A. Spiking Neuron Auditory Model for Speech Processing Systems // Systems, Signals and Image Processing: Proc. of 9th International Workshop on, Manchester, UK, 7-8 November 2002.

Ivanov, A., Petrovsky A. Temporal Processing Neural Networks for Speech Recognition // Neural Networks: Proc. International Conference on / ICNN'99, Brest, Belarus, 12-15 October 1999. - P.117-125.

Ivanov A., Petrovsky A. Experiments with Neural Networks for Sequence Recognition in Application to Automatic Speech Recognition // Pattern Recognition and Information Processing: Proc. 5th International Conference on / PRIP'99, Minsk, Belarus, 18-20 May 1999. - P.149-154.

Ivanov A., Petrovsky A. Training Multi-Layer Perceptrons in the problem of Static Phoneme Identification with the use of TIMIT Speech Corpus // Systems, Signals and Image Processing: Proc. 6th International Workshop on, Bratislava, Slovakia, 2-4 June 1999. - P.118-121.

Ivanov A., Petrovsky A. Software Kit for Simulation and Evaluation of the ASR Front-End algorithms based on MatLab Integrated Shell // Komputerowe Wspomaganie Badan Naukowych: Proc. 5th Conf. / V KK KOWBAN'98, Polanica Zdroj, Poland, 15-17 October 1998.

Articles in scientific journals and article collections:

Z. Yu, V. Ramanarayanan, R. Mundkowsky, P. Lange, A. V. Ivanov, A. W. Black, D. Suendermann-Oeft, Multimodal HALEF: An Open-Source Modular Web-Based Multimodal Dialog Framework // In book: Dialogues with Social Robots, 2017, pp.233-244.

V. Ramanarayanan, D. Suendermann-Oeft, P. Lange, R. Mundkowsky, A. V. Ivanov, Z. Yu, Y. Qian, K. Evanini, Assembling the jigsaw: How multiple open standards are synergistically combined in the HALEF multimodal dialog system // D.Dahl, ed., "Multimodal Interaction with W3C Standards", Springer, 2017. - P. 295-310.

Ivanov, A.V., Petrovsky, A. A., Anthropo- And Neuromorphic Algorithms in Speech Processing // A. Dobrucki, A. Petrovsky, W.Skarbek eds., “New Trends in Audio and Video”, Politechnika Bialostocka, 2006. - P.201-216.

Иванов А.В., Петровский А.А. Марковские Свойства Импульсного Отклика Модели Аудиторного Нейрона // Процессы и методы обработки информации. - Москва: 2006. - С.113-125.

Ivanov A.V., Petrovsky A.A. Markovian Properties of Spiking Neuron Impulse Response // Methods of Information Processing. - Moscow: 2006. - pp.113-125.

Ivanov, A.V., Petrovsky, A. A. Analysis of the IHC Adaptation for the Anthropomorphic Speech Processing Systems // Applied Signal Processing: EURASIP Journal on, vol. 2005, no. 9, pp. 1323 - 1333, June 2005.

Иванов А.В., Петровский А.А. Методы Построения Устройств Распознавания Речи на Базе Гибрида Нейронная Сеть/Скрытая Марковская Модель // Нейрокомпьютеры: разработка и применение. - Москва: 2002. - N 12. - С.26-36.

Ivanov A.V., Petrovsky A.A. Speech Recognition Methods with the Neural Network/Hidden Markov Model Hybrid // Neurocomputers: Development and Application. - Moscow: 2002. - N 12. - pp.26-36.

Петровский А.А., Серков В.В., Иванов А.В., Башун Я.Н. Психоакустика и обработка речевых сигналов // Радиотехника и Электроника: Сб. ст. - 1999. - Вып.23. - С.110-119.

Petrovsky A.A., Serkov V.V., Ivanov A.V., Baszun J. N. Psychoacoustics and Processing of Speech Signals // Radiotechnica and Electronica: Article Collection - 1999. - Vol. 23. - pp. 110-119

Ivanov A., Petrovsky A. MLPs and Mixture Models for the Estimation of the Posterior Probabilities of Class Membership // Lecture notes in Artificial Intelligence. - Berlin: Springer-Verlag. - 1999.- P.215-218.

Petrovsky A., Ivanov A., Baszun J. An Attempt To Adequately Estimate Intelligibility Of The Speech Perceived Through Cochlear Implant In Noisy Environment Based On The Neural Network Approach // Journal of the University of Applied Sciences Mittweida. - 1999. - N 3. - P.321-330.

Иванов А.В., Петровский А.А. Тренировка многоуровневых перцептронов в задаче статистической идентификации фонем с использованием базы звуковых данных TIMIT // Известия Белорусской Инженерной Академии. - 1998. - N 2(6)/1. - С.46-52.

Ivanov A.V., Petrovsky A.A. Statistical Training of multi-layer perceptrons for Phoneme Identification with TIMIT Speech Corpus // Izvestya Bielorusskoi Engineernoy Academyi. - 1998. - N 2(6)/1. - pp.46-52.

Preprints:

Ivanov A.V., Petrovsky A.A. A composite physiological model of the inner ear for audio coding. - Berlin: 116th AES Convention, 2004. - 20 p. - (Preprint / 6082).

PhD Thesis:

Иванов А.В. Формирование пространства признаков на основе антропоморфической обработки информации в распознавателях речи в условиях противодействия / диссертация на соискание степени кандидата технических наук, Белорусcкий Государственный Университет Информатики и Радиоэлектроники, 2004

Ivanov A.V. Feature Space Building Based on the Anthropomorphic Information Processing for Speech Recognizers in Adverse Environments / PhD Thesis, Belarusian State University of Informatics & Radioelectronics, 2004

Patents:

F. Weng, A. Ivanov, S. Cradock, “METHODS AND SYSTEMS FOR CONFUSION REDUCTION FOR COMPRESSED ACOUSTIC MODELS”, US Patent US2021/0375270A1, Pub. Date December, 2, 2021.

A. Unruh, W. Yang, B. Jang, S. Cradock, A. Ivanov, F. Weng, S. Choi “VOICE RECOGNITION FOR IMPOSTER REJECTION IN WEARABLE DEVICES”, US Patent US2021/0287674A1, Pub. Date September, 16, 2021.

F. Weng, A. Ivanov, “ADAPTIVE DECODER FOR HIGHLY COMPRESSED GRAPHEME MODEL”, US Patent US2021/0210109A1, Pub. Date July 8, 2021.

V. Ramanarayanan, D. Suendermann-Oeft, P. Lange, A. V. Ivanov, K. Evanini, Y. Qian, Z. Yu, “COMPUTER-IMPLEMENTED SYSTEMS AND METHODS FOR EVALUATING SPEECH DIALOG SYSTEM ENGAGEMENT VIA VIDEO”, US Patent US10607504B1, active since March, 17, 2020, priority date May, 20, 2016.

V. Ramanarayanan, D. Suendermann-Oeft, P. Lange, A. V. Ivanov, K. Evanini, Y. Qian, Z. Yu, “COMPUTER-IMPLEMENTED SYSTEMS AND METHODS FOR A CROWD SOURCE-BOOTSTRAPPED SPOKEN DIALOG SYSTEM”, US Patent US10607504B1, active since March, 31, 2020, priority date September, 25, 2015.

Y. Qian, J. Tao, D. Suendermann-Oeft, K. Evanini, A. V. Ivanou, V. Ramanarayanan, “COMPUTER-IMPLEMENTED SYSTEMS AND METHODS FOR SPEAKER RECOGNITION USING A NEURAL NETWORK”, US Patent US10008209B1, active since June, 26, 2018, priority date September, 25, 2016.

PCT Patent Application PCT/US13/34726 “SYSTEMS AND METHODS FOR AUTOMATED SPEECH AND SPEAKER CHARACTERIZATION”, priority date March, 30, 2012

U.S. Patent Application 13/854,048 “SYSTEMS AND METHODS FOR AUTOMATED SPEECH AND SPEAKER CHARACTERIZATION”, priority date March, 30, 2012

U.S. Provisional Patent Application 61/618,657 “UNIVERSAL METHOD OF AUTOMATED SPEECH AND SPEAKER CHARACTERIZATION”, priority date March, 30, 2012
