Publications

A complete list of publications is available from Google Scholar.

2022

  • J. Y. Lee, K. A. Lee, and W. S. Gan, “DLVGen: A dual latent variable approach to personalized dialogue generation,” in Proc. International Conference on Agents and Artificial Intelligence (ICAART), 2022, vol. 2, pp. 193-202.

  • T. Liu, R. K. Das, K. A. Lee, and H. Li, “Neural acoustic-phonetic approach for speaker verification with phonetic attention mask,” IEEE Signal Processing Letters, vol. 29, pp. 782-786, 2022.

  • J. Y. Lee, K. A. Lee, and W. S. Gan, “Improving contextual coherence in variational personalized and empathetic dialogue agents,” in Proc. IEEE ICASSP, 2022, pp. 7052-7056.

  • R. Tao, K. A. Lee, R. K. Das, V. Hautamäki, and H. Li, “Self-supervised speaker recognition with loss-gated learning,” in Proc. IEEE ICASSP, 2022, pp. 6142-6146.

  • H. Zhang, L. Wang, K. A. Lee, M. Liu, J. Dang, and H. Chen, “Learning domain-invariant transformation for speaker verification,” in Proc. IEEE ICASSP, 2022, pp. 7177-7181.

  • T. Liu, R. K. Das, K. A. Lee, and H. Li, “MFA: TDNN with multi-scale frequency-channel attention for text-independent speaker verification with short utterances,” in Proc. IEEE ICASSP, 2022, pp. 7517-7521.

  • F. Yu, S. Zhang, P. Guo, Y. Fu, Z. Du, S. Zheng, W. Huang, L. Xie, Z.-H. Tan, D. Wang, Y. Qian, K. A. Lee, Z. Yan, B. Ma, X. Xu, and H. Bu, “Summary on the ICASSP 2022 multi-channel multi-party meeting transcription grand challenge,” in Proc. IEEE ICASSP, 2022, pp. 9156-9160.

  • H. Shim, H. Tak, X. Liu, H. Heo, J. Jung, J. Chung, S. Chung, H. Yu, B. Lee, M. Todisco, H. Delgado, K. A. Lee, M. Sahidullah, T. Kinnunen, and N. Evans, “Baseline systems for the first spoofing-aware speaker verification challenge: score and embedding fusion,” in Proc. Odyssey Workshop, 2022, pp. 330-337.

2021

  • M. Liu, L. Wang, J. Dang, K. A. Lee, and S. Nakagawa, “Replay attack detection using variable-frequency resolution phase and magnitude features,” Computer Speech & Language, vol. 66, 101161, Mar. 2021.

  • A. Nautsch, X. Wang, N. Evans, T. H. Kinnunen, V. Vestman, M. Todisco, H. Delgado, M. Sahidullah, J. Yamagishi, and K. A. Lee, “ASVspoof 2019: Spoofing Countermeasures for the Detection of Synthesized, Converted and Replayed Speech,” IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 3, no. 2, pp. 252-265, Apr. 2021.

  • K. A. Lee, V. Vestman, and T. Kinnunen, “ASVtorch toolkit: Speaker verification with deep neural networks,” SoftwareX, vol. 14, 2021, 100697, ISSN 2352-7110.

  • K. A. Lee, Q. Wang, and T. Koshinaka, “Xi-vector embedding for speaker recognition,” IEEE Signal Processing Letters, vol. 28, pp. 1385-1389, June 2021.

  • M. Liu, L. Wang, K. A. Lee, X. Chen, and J. Dang, “Replay-attack detection using features with adaptive spectro-temporal resolution,” in Proc. ICASSP, 2021, pp. 6374-6378.

  • H. Zhang, L. Wang, K. A. Lee, M. Liu, J. Dang, and H. Chen, “Meta-learning for cross-channel speaker verification,” in Proc. ICASSP, 2021, pp. 5839-5843.

  • L. Li, K. Hu, Y. Zheng, J. Liu, and K. A. Lee, “COOPNet: Multi-Modal Cooperative Gender Prediction in Social Media User Profiling,” in Proc. ICASSP, 2021, pp. 4310-4314.

  • H. Zhu, K. A. Lee, and H. Li, “Serialized multi-layer multi-head attention for neural speaker embedding,” in Proc. INTERSPEECH, 2021, pp. 106-110.

  • Y. Wu, L. Wang, K. A. Lee, M. Liu, and J. Dang, “Joint feature enhancement and speaker recognition with multi-objective task-oriented network,” in Proc. INTERSPEECH, 2021, pp. 1089-1093.

  • L. Zhang, Q. Wang, K. A. Lee, L. Xie, and H. Li, “Multi-level transfer learning from near-field to far-field speaker verification,” in Proc. INTERSPEECH, 2021, pp. 1094-1098.

  • J. Y. Lee, K. A. Lee, and W. S. Gan, “Generating personalized dialogue via multi-task meta-learning,” SemDial, 2021.

  • Q. Wang, K. A. Lee, T. Koshinaka, K. Okabe, and H. Yamamoto, “Task-aware Warping Factors in Mask-based Speech Enhancement,” in Proc. European Signal Processing Conference (EUSIPCO), 2021, pp. 476-480.

  • M. Liu, L. Wang, K. A. Lee, H. Zhang, C. Zeng, and J. Dang, “DeepLip: A benchmark for deep learning-based audio-visual lip biometrics,” in Proc. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2021, pp. 122-129.

  • Y. Ma, K. A. Lee, V. Hautamäki, and H. Li, “PL-EESR: Perceptual loss based end-to-end robust speaker representation extraction,” in Proc. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2021, pp. 106-113.

2020

  • K. A. Lee, O. Sadjadi, H. Li, and D. Reynolds, “Two decades into Speaker Recognition Evaluation - are we there yet?” Computer Speech & Language, vol. 61, 101058, 2020.

  • A. Sholokhov, T. Kinnunen, V. Vestman, and K. A. Lee, “Voice biometrics security: Extrapolating false alarm rate via hierarchical Bayesian modeling of speaker verification scores,” Computer Speech & Language, vol. 60, 101024, 2020.

  • K. A. Lee, H. Yamamoto, K. Okabe, Q. Wang, L. Guo, T. Koshinaka, J. Zhang, and K. Shinoda, “NEC-TT System for Mixed-Bandwidth and Multi-Domain Speaker Recognition,” Computer Speech & Language, vol. 61, 101033, May 2020.

  • X. Wang, J. Yamagishi, M. Todisco, H. Delgado, A. Nautsch, N. Evans, M. Sahidullah, V. Vestman, T. Kinnunen, K. A. Lee, L. Juvela, et al., “ASVspoof 2019: a large-scale public database of synthesized, converted and replayed speech,” Computer Speech & Language, vol. 64, 101114, 2020.

  • I. Kukanov, T. N. Trong, V. Hautamäki, S. M. Siniscalchi, V. M. Salerno, and K. A. Lee, “Maximal Figure-of-Merit Framework to Detect Multi-Label Phonetic Features for Spoken Language Recognition,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 682-695, 2020.

  • T. Kinnunen, H. Delgado, N. Evans, K. A. Lee, V. Vestman, A. Nautsch, M. Todisco, X. Wang, M. Sahidullah, J. Yamagishi, and D. A. Reynolds, “Tandem assessment of spoofing countermeasures and automatic speaker verification: Fundamentals,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 2195-2210, 2020.

  • Q. Wang, K. Okabe, K. A. Lee, and T. Koshinaka, “A generalized framework for domain adaptation of PLDA in speaker recognition,” in Proc. IEEE ICASSP, 2020.

  • H. Zeinali, K. A. Lee, J. Alam, and L. Burget, “SdSV Challenge 2020: large-scale evaluation of short-duration speaker verification,” in Proc. INTERSPEECH, 2020, pp. 731-735.

  • H. Zhang, L. Wang, Y. Zhang, M. Liu, K. A. Lee, and J. Wei, “Adversarial separation network for speaker recognition,” in Proc. INTERSPEECH, 2020, pp. 951-955.

  • D. Zhou, L. Wang, K. A. Lee, Y. Wu, M. Liu, J. Dang, and J. Wei, “Dynamic margin softmax loss for speaker verification,” in Proc. INTERSPEECH, 2020, pp. 3800-3804.

  • K. A. Lee, H. Yamamoto, K. Okabe, Q. Wang, L. Guo, T. Koshinaka, J. Zhang, and K. Shinoda, “NEC-TT speaker verification system for SRE’19 CTS Challenge,” in Proc. INTERSPEECH, 2020, pp. 2227-2231.

  • K. Akimoto, S. P. Liew, S. Mishima, R. Mizushima, and K. A. Lee, “POCO: A voice spoofing and liveness detection corpus based on pop noise,” in Proc. INTERSPEECH, 2020, pp. 1081-1085.

  • L. Chen, K. A. Lee, L. He, and F. Soong, “On early-stop clustering for speaker diarization,” in Proc. Odyssey 2020: The Speaker and Language Recognition Workshop, 2020, pp. 110-116.

  • Q. Wang, K. A. Lee, and T. Koshinaka, “Using multi-resolution feature maps with convolutional neural networks for anti-spoofing in ASV,” in Proc. Odyssey 2020: The Speaker and Language Recognition Workshop, 2020, pp. 138-142.

  • P. Garcia Perera, J. Villalba, H. Bredin, J. Du, D. Castan, A. Cristia, L. Bullock, L. Guo, K. Okabe, P. S. Nidadavolu, S. Kataria, S. Chen, L. Galmant, M. Lavechin, L. Sun, M. Gill, B. Ben-Yair, S. Abdoli, X. Wang, W. Bouaziz, H. Titeux, E. Dupoux, K. A. Lee, and N. Dehak, “Speaker detection in the wild: Lessons learned from JSALT 2019,” in Proc. Odyssey 2020: The Speaker and Language Recognition Workshop, 2020, pp. 415-422.