Publications
Cong-Thanh Do, Shuhei Imai, Rama Doddipatla, Thomas Hain, "Improving accented speech recognition using data augmentation based on unsupervised text-to-speech synthesis", to appear in Proc. 32nd European Signal Processing Conference (EUSIPCO), Lyon, France, August 26-30, 2024.
Mohan Li, Catalin Zorila, Cong-Thanh Do, Rama Doddipatla, "Towards a unified end-to-end language understanding system for speech and text inputs", in Automatic Speech Recognition and Understanding Workshop (ASRU), Taipei, Taiwan, 16-20 December, 2023. [Paper]
Cong-Thanh Do, Rama Doddipatla, Mohan Li, Thomas Hain, "Domain adaptive self-supervised training of automatic speech recognition", in Proc. INTERSPEECH, Dublin, Ireland, 20-24 August, 2023, pp. 4389-4393. [Paper]
Mohan Li, Cong-Thanh Do, Rama Doddipatla, "Cumulative attention based streaming transformer ASR with internal language model joint training and rescoring", in Proc. IEEE ICASSP, 4-10 June, 2023, Rhodes, Greece. [DOI]
Cong-Thanh Do, Mohan Li, Rama Doddipatla, "Multiple-hypothesis RNN-T loss for unsupervised fine-tuning and self-training of neural transducer", in Proc. INTERSPEECH, Incheon, Korea, September 18-22, 2022, pp. 4446-4450. [DOI] [arXiv]
Cong-Thanh Do, Rama Doddipatla, Thomas Hain, "Multiple-hypothesis CTC-based semi-supervised adaptation of end-to-end speech recognition", in Proc. IEEE ICASSP, Toronto, Canada, June 6-11, 2021, pp. 6978-6982. [DOI] [arXiv]
Shucong Zhang, Cong-Thanh Do, Rama Doddipatla, Erfan Loweimi, Peter Bell, Steve Renals, "Train your classifier first: cascade neural networks training from upper layers to lower layers", in Proc. IEEE ICASSP, Toronto, Canada, June 6-11, 2021, pp. 2750-2754. [DOI] [arXiv]
Cong-Thanh Do, Shucong Zhang, Thomas Hain, "Selective adaptation of end-to-end speech recognition using hybrid CTC/attention architecture for noise robustness", in Proc. 28th European Signal Processing Conference (EUSIPCO), pp. 321-325, Amsterdam, The Netherlands, August 20-24, 2020. [DOI]
Shucong Zhang, Cong-Thanh Do, Rama Doddipatla, Steve Renals, "Learning noise invariant features through transfer learning for robust end-to-end speech recognition", in Proc. IEEE ICASSP, pp. 7024-7028, Barcelona, Spain, May 4-8, 2020. [DOI]
Cong-Thanh Do, "Subband temporal envelope features and data augmentation for end-to-end recognition of distant conversational speech", in Proc. IEEE ICASSP, pp. 6251-6255, Brighton, UK, May 12-17, 2019. [DOI]
Cong-Thanh Do, Yannis Stylianou, "Weighting time-frequency representation of speech using auditory saliency for automatic speech recognition", in Proc. INTERSPEECH, pp. 1591-1595, Hyderabad, India, September 2-6, 2018. [DOI]
Rama Doddipatla, Takehiko Kagoshima, Cong-Thanh Do, Petko Petkov, Catalin Zorila, E. Kim, Daichi Hayakawa, Hiroshi Fujimura, Yannis Stylianou, "The Toshiba entry to the CHiME 2018 challenge", The 5th International Workshop on Speech Processing in Everyday Environments (CHiME 2018), pp. 41-45, Hyderabad, India, September 7, 2018. [DOI]
Cong-Thanh Do, Yannis Stylianou, "Improved automatic speech recognition using subband temporal envelope features and time-delay neural network denoising autoencoder", in Proc. INTERSPEECH, pp. 3831-3836, Stockholm, Sweden, August 20-24, 2017. [DOI]
Cong-Thanh Do, Marc Evrard, Adrien Leman, Christophe d'Alessandro, Albert Rilliard, Jean-Luc Crebouw, "Objective evaluation of HMM-based speech synthesis system using Kullback-Leibler divergence", in Proc. INTERSPEECH, pp. 2952-2956, Singapore, September 14-18, 2014. [DOI]
Achintya K. Sarkar, Cong-Thanh Do, Viet-Bac Le, Claude Barras, "Combination of cepstral and phonetically discriminative features for speaker verification", IEEE Signal Processing Letters, vol. 21, no. 9, pp. 1040-1044, September 2014. [DOI]
Cong-Thanh Do, Lori Lamel, Jean-Luc Gauvain, "Speech-to-text development for Slovak, a low-resourced language", in Proc. Fourth International Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU-2014), pp. 176-182, St. Petersburg, Russia, May 14-16, 2014. [DOI]
Cong-Thanh Do, Claude Barras, Viet-Bac Le, Achintya K. Sarkar, "Augmenting short-term cepstral features with long-term discriminative features for speaker verification of telephone data", in Proc. INTERSPEECH, pp. 2484-2488, Lyon, France, August 25-29, 2013. [DOI]
Cong-Thanh Do, Mohammad J. Taghizadeh, Phil N. Garner, "Combining cepstral normalization and cochlear implant-like speech processing for microphone array-based speech recognition", in Proc. IEEE Spoken Language Technology (SLT) Workshop, pp. 137-142, Miami, FL, USA, December 2-5, 2012. [DOI]
Achintya K. Sarkar, Viet-Bac Le, Cong-Thanh Do, Anidya Roy, Claude Barras, Lori Lamel, Jean-Luc Gauvain, "LIMSI/VOCAPIA system for NIST SRE 2012", in Proc. 2012 NIST Speaker Recognition Evaluation Workshop, pp. 11-12, Orlando, Florida, USA, December 2012. [DOI]
Cong-Thanh Do, Claude Barras, "Cochlear implant-like processing of speech signal for speaker verification", in Proc. SAPA-SCALE Conference, pp. 17-21, Portland, OR, USA, September 7-8, 2012. [DOI]
Cong-Thanh Do, "Acoustic simulations of cochlear implants in human and machine hearing research", in book Cochlear Implant Research Updates, pp. 117-136, InTech Publisher, April 2012. [Book chapter]
Cong-Thanh Do, Dominique Pastor, Andre Goalic, "A novel framework for noise robust ASR using cochlear implant-like spectrally reduced speech", Speech Communication, vol. 54, no. 1, pp. 119-133, January 2012. [DOI]
Cong-Thanh Do, Dominique Pastor, Andre Goalic, "Corrélation entre les différences entre les taux de reconnaissance de la parole sur deux ensembles de test et celles des distributions de probabilité des vecteurs acoustiques de ces mêmes ensembles", in Proc. XXVIIIèmes Journées d'Etude sur la Parole (JEP'2010), pp. 49-52, Mons, Belgium, May 25-28, 2010. [Paper]
Cong-Thanh Do, Dominique Pastor, Andre Goalic, "On the recognition of cochlear implant-like spectrally reduced speech with MFCC and HMM-based ASR", IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 5, pp. 1065-1068, September 2009. [DOI]
Cong-Thanh Do, Dominique Pastor, Andre Goalic, "On normalized MSE analysis of speech fundamental frequency in the cochlear implant-like spectrally reduced speech", IEEE Transactions on Biomedical Engineering, vol. 57, no. 3, pp. 572-577, September 2009. [DOI]
Cong-Thanh Do, Dominique Pastor, Gael Le Lan, Andre Goalic, "Recognizing cochlear implant-like spectrally reduced speech with HMM-based ASR: experiments with MFCCs and PLP coefficients", in Proc. INTERSPEECH, pp. 2634-2637, Makuhari, Chiba, Japan, September 26-30, 2010. [DOI]
Cong-Thanh Do, Abdeldjalil Aissa-El-Bey, Dominique Pastor, Andre Goalic, "Area of mouth opening estimation from speech acoustics using blind deconvolution technique", in Proc. Auditory-Visual Speech Processing (AVSP), pp. 80-85, Norwich, UK, September 10-13, 2009. [DOI]
Cong-Thanh Do, Abdeldjalil Aissa-El-Bey, Dominique Pastor, Andre Goalic, "Estimation de l'aire d'ouverture de la bouche à partir d'information acoustique de signal de parole en utilisant des techniques de déconvolution aveugle", in Proc. XXIIe colloque GRETSI (traitement du signal et des images), pp. 8-11, Dijon, France, September 2009. [Paper]
Cong-Thanh Do, Mantha Vijay Kumar, Dominique Pastor, Andre Goalic, "Automatic speech recognition of cochlear implant-like spectrally reduced speech", in Proc. National Conference on Communications (NCC), pp. 303-306, Guwahati, India, January 16-18, 2009. [Paper]