Tianchi Liu, Duc-Tuan Truong, Rohan Kumar Das, Kong Aik Lee, Haizhou Li, “Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing”, in IEEE Transactions on Information Forensics and Security, 2025. [pre-print]
Jichen Yang, Chang Huai You and Rohan Kumar Das, “β-Order Energy-Weighting Modulation on Spectral Bins for Replay Speech Detection”, in IEEE Transactions on Audio, Speech and Language Processing, vol. 33, pp. 2999-3013, 2025. [post-print]
Yang Xiao and Rohan Kumar Das, “XLSR-Mamba: A Dual-Column Bidirectional State Space Model for Spoofing Attack Detection”, in IEEE Signal Processing Letters, vol. 32, pp. 1276-1280, 2025. [pre-print] [post-print] [codes]
Ruijie Tao, Xinyuan Qian, Rohan Kumar Das, Xiaoxue Gao, Jiadong Wang and Haizhou Li, “Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training”, in IEEE Transactions on Multimedia, vol. 27, pp. 2362-2373, 2025. [pre-print] [post-print]
Tanmay Khandelwal, Rohan Kumar Das, and Eng Siong Chng, “Sound Event Detection: A Journey Through DCASE Challenge Series”, in APSIPA Transactions on Signal and Information Processing, vol. 13, no. 1, 2024. [post-print]
Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki and Haizhou Li, “Self-Supervised Training of Speaker Encoder with Multi-Modal Diverse Positive Pairs”, in IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 31, pp. 1706-1719, 2023. [pre-print] [post-print]
Tianchi Liu, Rohan Kumar Das, Kong Aik Lee and Haizhou Li, “Neural Acoustic-Phonetic Approach for Speaker Verification with Phonetic Attention Mask”, in IEEE Signal Processing Letters, vol. 29, pp. 782-786, 2022. [post-print]
Longting Xu, Daiyu Huang, Syed Faham Ali Zaidi, Abdul Rauf and Rohan Kumar Das, “Graph Fourier Transform based Audio Zero-watermarking”, in IEEE Signal Processing Letters, vol. 28, pp. 1943-1947, 2021. [pre-print] [post-print]
Jichen Yang, Hongji Wang, Rohan Kumar Das and Yanmin Qian, “Modified Magnitude-phase Spectrum Information for Spoofing Detection”, in IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 29, pp. 1065-1078, 2021. [post-print]
Jichen Yang, Rohan Kumar Das and Haizhou Li, “Significance of Subband Features for Synthetic Speech Detection”, in IEEE Transactions on Information Forensics and Security, vol. 15, pp. 2160-2170, 2020. [pre-print] [post-print]
Jichen Yang, Rohan Kumar Das and Nina Zhou, “Extraction of Octave Spectra Information for Spoofing Attack Detection”, in IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 27, Issue 12, pp. 2373-2384, December 2019. [post-print] [codes]
Jichen Yang and Rohan Kumar Das, “Long-term High Frequency Features for Synthetic Speech Detection”, in Digital Signal Processing, Elsevier, vol. 97, February 2020. [post-print]
Hrishikesh Dutta, Rohan Kumar Das, Sukumar Nandi and S. R. M. Prasanna, “An Overview of Digital Audio Steganography”, in IETE Technical Review, vol. 37, Issue 6, pp. 632-650, December 2020. [pre-print] [post-print]
Jichen Yang and Rohan Kumar Das, “Improving Anti-spoofing with Octave Spectrum and Short-term Spectral Statistics Information”, in Applied Acoustics, Elsevier, vol. 157, January 2020. [post-print]
Rohan Kumar Das and S. R. M. Prasanna, “Investigating Text-independent Speaker Verification Systems Under Varied Data Conditions”, in Circuits, Systems and Signal Processing, Springer, vol. 38, Issue 8, pp. 3778-3801, August 2019. [post-print]
Jichen Yang and Rohan Kumar Das, “Low Frequency Frame-wise Normalization over Constant-Q Transform for Playback Speech Detection”, in Digital Signal Processing, Elsevier, vol. 89, pp. 30-39, June 2019. [post-print]
Rohan Kumar Das, Sarfaraz Jelil and S. R. M. Prasanna, “Exploring Text-constraint Models and Source Information for Long-enrollment with Short-test Speaker Verification”, in Circuits, Systems and Signal Processing, Springer, vol. 38, Issue 4, pp. 1775-1792, April 2019. [post-print]
Rohan Kumar Das and S. R. M. Prasanna, “Speaker Verification from Short Utterance Perspective: A Review”, in IETE Technical Review, vol. 35, Issue 6, pp. 599-617, December 2018. [pre-print] [post-print]
Rohan Kumar Das, Sarfaraz Jelil and S. R. M. Prasanna, “Multi-style Speaker Recognition Database in Practical Conditions”, International Journal of Speech Technology, Springer, vol. 21, Issue 3, pp. 409-419, September 2018. [post-print] [database]
Rohan Kumar Das, Bidisha Sharma and S. R. M. Prasanna, “Significance of Duration Modification for Speaker Verification under Mismatch Speech Tempo Condition”, International Journal of Speech Technology, Springer, vol. 21, Issue 3, pp. 401-408, September 2018. [post-print]
Rohan Kumar Das, Akhil Babu Manam and S. R. M. Prasanna, “Exploring Kernel Discriminant Analysis for Speaker Verification with Limited Test Data”, in Pattern Recognition Letters (PRL), Elsevier, vol. 98, pp. 26-31, October 2017. [pre-print] [post-print]
Rohan Kumar Das, Sarfaraz Jelil and S. R. M. Prasanna, “Development of Multi-Level Speech based Person Authentication System”, Journal of Signal Processing Systems, Springer, vol. 88, Issue 3, pp. 259-271, September 2017. [post-print]
R. Sharma, S. R. M. Prasanna, Ramesh K. Bhukya and Rohan Kumar Das, “Analysis of the Intrinsic Mode Functions for Speaker Information”, Speech Communication, Elsevier, vol. 91, pp. 1-16, July 2017. [post-print]
Rohan Kumar Das and S. R. M. Prasanna, “Exploring Different Attributes of Source Information for Speaker Verification with Limited Test Data”, in Journal of the Acoustical Society of America (JASA), vol. 140, no. 1, pp. 184-190, July 2016. [post-print]
Debmalya Chakrabarty, S. R. M. Prasanna and Rohan Kumar Das, “Development and Evaluation of Online Text-independent Speaker Verification System for Remote Person Authentication”, International Journal of Speech Technology, Springer, vol. 16, Issue 1, pp. 75-88, March 2013. [post-print]
Haris B. C., Gayadhar Pradhan, Abhinav Misra, S. R. M. Prasanna, Rohan Kumar Das, and Rohit Sinha, “Multivariability Speaker Recognition Database in Indian Scenario”, International Journal of Speech Technology, Springer, vol. 15, pp. 441-453, December 2012. [pre-print] [post-print] [database]
Yang Xiao, Ting Dang and Rohan Kumar Das, “RawTFNet: A Lightweight CNN Architecture for Speech Anti-spoofing”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2025, Singapore, October 2025. [pre-print]
Yang Xiao, Han Yin, Jisheng Bai and Rohan Kumar Das, “DG-SED: Domain Generalization for Sound Event Detection with Heterogeneous Training Data”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2025, Singapore, October 2025. [pre-print]
Liping Chen, Kong Aik Lee, Zhen-Hua Ling, Xin Wang, Rohan Kumar Das, Tomoki Toda and Haizhou Li, “Speaker Privacy and Security in the Big Data Era: Protection and Defense against Deepfake”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2025, Singapore, October 2025. [pre-print]
Xi Xuan, Yang Xiao, Rohan Kumar Das and Tomi Kinnunen, “Multilingual Source Tracing of Speech Deepfakes: A First Benchmark”, in Proc. 5th Symposium on Security and Privacy in Speech Communication, Netherlands, August 2025, pp. 27-34. [pre-print] [post-print]
Yang Xiao and Rohan Kumar Das, “Listen, Analyze, and Adapt to Learn New Attacks: An Exemplar-Free Class Incremental Learning Method for Audio Deepfake Source Tracing”, in Proc. Interspeech 2025, Rotterdam, Netherlands, August 2025, pp. 1563-1567. [pre-print] [post-print]
Han Yin, Yang Xiao, Rohan Kumar Das, Haohe Liu, Jisheng Bai, Wenwu Wang, Mark D Plumbley, “EnvSDD: Benchmarking Environmental Sound Deepfake Detection”, in Proc. Interspeech 2025, Rotterdam, Netherlands, August 2025, pp. 201-205. [pre-print] [post-print] [project page] [codes]
Yang Xiao and Rohan Kumar Das, “TF-Mamba: A Time-Frequency Network for Sound Source Localization”, in Proc. Interspeech 2025, Rotterdam, Netherlands, August 2025, pp. 948-952. [pre-print] [post-print]
Yang Xiao, Tianyi Peng, Yanghao Zhou, and Rohan Kumar Das, “AdaKWS: Towards Robust Keyword Spotting with Test-Time Adaptation”, in Proc. Interspeech 2025, Rotterdam, Netherlands, August 2025, pp. 5408-5412. [pre-print] [post-print]
Chin-Jou Li, Eunjung Yeo, Kwanghee Choi, Paula Andrea Pérez-Toro, Masao Someki, Rohan Kumar Das, Zhengjun Yue, Juan Rafael Orozco-Arroyave, Elmar Nöth, David R. Mortensen, “Towards Inclusive ASR: Investigating Voice Conversion for Dysarthric Speech Recognition in Low-Resource Languages”, in Proc. Interspeech 2025, Rotterdam, Netherlands, August 2025, pp. 2128-2132. [pre-print] [post-print]
Yang Xiao, Tianyi Peng, Rohan Kumar Das, Yuchen Hu and Huiping Zhuang, “AnalyticKWS: Towards Exemplar-Free Analytic Class Incremental Learning for Small-footprint Keyword Spotting”, in Proc. Findings of the Association for Computational Linguistics: ACL 2025, Vienna, Austria, July 2025, pp. 14147-14158. [pre-print] [post-print]
Yang Xiao and Rohan Kumar Das, “Where’s That Voice Coming? Continual Learning for Sound Source Localization”, in Proc. IEEE International Conference on Multimedia & Expo (ICME) 2025, Nantes, France, June 2025. [pre-print]
Yang Xiao and Rohan Kumar Das, “UCIL: An Unsupervised Class Incremental Learning Approach for Sound Event Detection”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2025, Hyderabad, India, April 2025. [pre-print] [post-print]
Han Yin, Jisheng Bai, Yang Xiao, Hui Wang, Siqi Zheng, Yafeng Chen, Rohan Kumar Das, Chong Deng, Jianfeng Chen, “Exploring Text-Queried Sound Event Detection with Audio Source Separation”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2025, Hyderabad, India, April 2025. [pre-print] [post-print] [codes]
Fuyuan Feng, Longting Xu and Rohan Kumar Das, “Multi-modal Speech Enhancement with Limited Electromyography Channels”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2025, Hyderabad, India, April 2025. [pre-print] [post-print]
Han Yin, Yang Xiao, Jisheng Bai, and Rohan Kumar Das, “Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2025 Satellite Workshop on SALMA: Speech and Audio Language Models - Architectures, Data Sources, and Training Paradigms, Hyderabad, India, April 2025. [pre-print] [codes]
Yang Xiao and Rohan Kumar Das, “WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection System”, in Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge Workshop, October 2024, pp. 196-200. [pre-print] [post-print] [database] [project]
Muhammad Saad Saeed, Shah Nawaz, Marta Moscati, Rohan Kumar Das, Muhammad Salman Tahir, Muhammad Zaigham Zaheer, Muhammad Irzam Liaqat, Muhammad Haris Khan, Karthik Nandakumar, Muhammad Haroon Yousaf, Markus Schedl, “A Synopsis of FAME 2024 Challenge: Associating Faces with Voices in Multilingual Environments”, in Proc. ACM International Conference on Multimedia 2024, Melbourne, Australia, October 2024, pp. 11333-11334. [post-print]
Tianchi Liu, Lin Zhang, Rohan Kumar Das, Yi Ma, Ruijie Tao, Haizhou Li, “How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?”, in Proc. Interspeech 2024, Kos Island, Greece, September 2024, pp. 1105-1109. [pre-print] [post-print]
Yang Xiao, Han Yin, Jisheng Bai and Rohan Kumar Das, “FMSG-JLESS Submission for DCASE 2024 Task4 on Sound Event Detection with Heterogeneous Training Dataset and Potentially Missing Labels”, in DCASE 2024 Challenge, Tech. Rep., July 2024. [post-print]
Mingrui He, Longting Xu, Han Wang, Mingjun Zhang and Rohan Kumar Das, “Device Feature based on Graph Fourier Transformation with Logarithmic Processing For Detection of Replay Speech Attacks”, in Proc. The Speaker and Language Recognition Workshop (Odyssey 2024), Quebec, Canada, June 2024, pp. 137-144. [pre-print] [post-print]
Yang Xiao and Rohan Kumar Das, “Dual Knowledge Distillation for Efficient Sound Event Detection”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Seoul, South Korea, April 2024, pp. 690-694. [pre-print] [post-print]
Jichen Yang, Fangfan Chen, Rohan Kumar Das, Zhengyu Zhu and Shunsi Zhang, “Adaptive-avg-pooling based Attention Vision Transformer for Face Anti-spoofing”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2024, Seoul, South Korea, April 2024, pp. 3875-3879. [pre-print] [post-print]
Tanmay Khandelwal and Rohan Kumar Das, “Dynamic Thresholding on FixMatch with Weak and Strong Data Augmentations for Sound Event Detection”, in Proc. International Symposium on Chinese Spoken Language Processing (ISCSLP) 2022, Singapore, December 2022, pp. 428-432. [post-print] [presentation video]
Rohith Mars and Rohan Kumar Das, “On the Use of Absolute Threshold of Hearing-based Loss for Full-band Speech Enhancement”, in Proc. International Symposium on Chinese Spoken Language Processing (ISCSLP) 2022, Singapore, December 2022, pp. 458-462. [post-print]
Tanmay Khandelwal, Rohan Kumar Das, and Eng Siong Chng, “Is Your Baby Fine at Home? Baby Cry Sound Detection in Domestic Environments”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Chiang Mai, Thailand, November 2022, pp. 275-280. [post-print] [database]
Rohith Mars and Rohan Kumar Das, “A Device Classification-aided Multi-task Framework for Low-complexity Acoustic Scene Classification”, in Proc. Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge Workshop, November 2022. [post-print]
Tanmay Khandelwal, Rohan Kumar Das, Andrew Koh and Eng Siong Chng, “FMSG-NTU Submission for DCASE 2022 Task 4 on Sound Event Detection in Domestic Environments”, in DCASE 2022 Challenge, Tech. Rep., June 2022. [post-print]
Longting Xu, Mianxin Tian, Xing Guo, Zhiyong Shan, Jie Jia, Yiyuan Peng, Jichen Yang and Rohan Kumar Das, “A Novel Feature Based on Graph Signal Processing for Detection of Physical Access Attacks”, in Proc. The Speaker and Language Recognition Workshop (Odyssey 2022), Beijing, China, June 2022, pp. 107-111. [post-print] [codes]
Teck Kai Chan and Rohan Kumar Das, “Cross-stitch Network with Adaptive Loss Weightage for Sound Event Localization and Detection”, in Proc. L3DAS22: Machine Learning for 3D Audio Signal Processing, May 2022, pp. 11-15. [post-print]
Tianchi Liu, Rohan Kumar Das, Kong Aik Lee and Haizhou Li, “MFA: TDNN with Multi-scale Frequency-channel Attention for Text-independent Speaker Verification with Short Utterances”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2022, Singapore, May 2022, pp. 7517-7521. [pre-print] [post-print]
Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki and Haizhou Li, “Self-supervised Speaker Recognition with Loss-gated Learning”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2022, Singapore, May 2022, pp. 6142-6146. [pre-print] [post-print]
Rohan Kumar Das, Ruijie Tao and Haizhou Li, “HLT-NUS Submission for 2020 NIST Conversational Telephone Speech SRE”, in NIST SRE Workshop 2021, December 2021. [pre-print] [recipe]
Kong Aik Lee, Tomi Kinnunen, Daniele Colibro, Claudio Vair, Andreas Nautsch, Hanwu Sun, Liang He, Tianyu Liang, Qiongqiong Wang, Mickael Rouvier, Pierre-Michel Bousquet, Rohan Kumar Das, Ignacio Vinals Bailo, Meng Liu, Héctor Delgado, Xuechen Liu, Md Sahidullah, Sandro Cumani, Boning Zhang, Koji Okabe, Hitoshi Yamamoto, Ruijie Tao, Haizhou Li, Alfonso Ortega Giménez, Longbiao Wang, Luis Buera, “I4U System Description for NIST SRE’20 CTS Challenge”, in NIST SRE Workshop 2021, December 2021. [post-print]
Protima Nomo Sudro, Rohan Kumar Das, Rohit Sinha and S. R. M. Prasanna, “Significance of Data Augmentation for Improving Cleft Lip and Palate Speech Recognition”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2021, Tokyo, Japan, December 2021, pp. 484-490. [pre-print] [post-print]
Rohan Kumar Das, “Known-unknown Data Augmentation Strategies for Detection of Logical Access, Physical Access and Speech Deepfake Attacks: ASVspoof 2021”, in Proc. 2021 Edition of the Automatic Speaker Verification and Spoofing Countermeasures Challenge Workshop, September 2021, pp. 29-36. [post-print]
Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou and Haizhou Li, “Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection”, in Proc. ACM Multimedia 2021, Chengdu, China, October 2021, pp. 3927-3935. [pre-print] [post-print] [recipe]
Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou and Haizhou Li, “NUS-HLT Report for ActivityNet Challenge 2021 AVA (Speaker)”, in International Challenge on Activity Recognition (ActivityNet) Workshop, CVPR 2021, June 2021. [post-print]
Rohan Kumar Das, Maulik Madhavi and Haizhou Li, “Diagnosis of COVID-19 using Auditory Acoustic Cues”, in Proc. Interspeech 2021, Brno, Czech Republic, August 2021, pp. 921-925. [post-print]
Rohan Kumar Das, Jichen Yang and Haizhou Li, “Data Augmentation with Signal Companding for Detection of Logical Access Attacks” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2021, Toronto, Ontario, Canada, June 2021, pp. 6349-6353. [pre-print] [post-print]
Protima Nomo Sudro, Rohan Kumar Das, Rohit Sinha and S. R. M. Prasanna, “Enhancing the Intelligibility of Cleft Lip and Palate Speech using Cycle-consistent Adversarial Networks” in Proc. IEEE Spoken Language Technology (SLT) 2021, Shenzhen, China, January 2021, pp. 720-727. [pre-print] [post-print]
Meidan Ouyang, Rohan Kumar Das, Jichen Yang and Haizhou Li, “Capsule Network based End-to-end System for Detection of Replay Attacks”, in Proc. International Symposium on Chinese Spoken Language Processing (ISCSLP) 2021, Hong Kong, January 2021, pp. 1-5. [post-print]
Rohan Kumar Das, Ruijie Tao, Jichen Yang, Wei Rao, Cheng Yu and Haizhou Li, “HLT-NUS Submission for NIST 2019 Multimedia Speaker Recognition Evaluation”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2020, Auckland, New Zealand, December 2020, pp. 605-609. [pre-print] [post-print]
Rohan Kumar Das and Haizhou Li, “Classification of Speech with and without Face Mask using Acoustic Features” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2020, Auckland, New Zealand, December 2020, pp. 747-752. [pre-print] [post-print]
Biswajit Dev Sarma and Rohan Kumar Das, “Emotion Invariant Speaker Embeddings for Speaker Identification with Emotional Speech” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2020, Auckland, New Zealand, December 2020, pp. 610-615. [pre-print] [post-print]
Wanqiu Lin, Maulik Madhavi, Rohan Kumar Das and Haizhou Li, “Transformer-based Arabic Dialect Identification”, in Proc. International Conference on Asian Language Processing (IALP) 2020, Kuala Lumpur, Malaysia, December 2020, pp. 192-196. [pre-print] [recipe] [post-print]
Rohan Kumar Das, Xiaohai Tian, Tomi Kinnunen and Haizhou Li, “The Attacker's Perspective on Automatic Speaker Verification: An Overview” in Proc. Interspeech 2020, Shanghai, China, October 2020, pp. 4213-4217. [pre-print] [post-print]
Zhenzong Wu, Rohan Kumar Das, Jichen Yang and Haizhou Li, “Light Convolutional Neural Network with Feature Genuinization for Detection of Synthetic Speech Attacks” in Proc. Interspeech 2020, Shanghai, China, October 2020, pp. 1101-1105. [pre-print] [post-print]
Ruijie Tao, Rohan Kumar Das and Haizhou Li, “Audio-visual Speaker Recognition with a Cross-modal Discriminative Network” in Proc. Interspeech 2020, Shanghai, China, October 2020, pp. 2242-2246. [pre-print] [post-print]
Tianchi Liu, Rohan Kumar Das, Maulik Madhavi, Shengmei Shen and Haizhou Li, “Speaker-Utterance Dual Attention for Speaker and Utterance Verification” in Proc. Interspeech 2020, Shanghai, China, October 2020, pp. 4293-4297. [pre-print] [post-print]
Xiaoyi Qin, Ming Li, Hui Bu, Wei Rao, Rohan Kumar Das, Shrikanth Narayanan, Haizhou Li, “The INTERSPEECH 2020 Far-Field Speaker Verification Challenge” in Proc. Interspeech 2020, Shanghai, China, October 2020, pp. 3456-3460. [pre-print] [post-print] [database]
Zhao Yi, Wen-Chin Huang, Xiaohai Tian, Junichi Yamagishi, Rohan Kumar Das, Tomi Kinnunen, Zhenhua Ling and Tomoki Toda, “Voice Conversion Challenge 2020 – Intra-lingual Semiparallel and Cross-lingual Voice Conversion –” in Proc. ISCA Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, October 2020, pp. 80-98. [pre-print] [post-print] [database]
Rohan Kumar Das, Tomi Kinnunen, Wen-Chin Huang, Zhenhua Ling, Junichi Yamagishi, Zhao Yi, Xiaohai Tian and Tomoki Toda, “Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions” in Proc. ISCA Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, October 2020, pp. 99-120. [pre-print] [post-print]
Xiaohai Tian, Rohan Kumar Das and Haizhou Li, “Black-box Attacks on Automatic Speaker Verification using Feedback-controlled Voice Conversion” in Proc. The Speaker and Language Recognition Workshop (Odyssey 2020), Tokyo, Japan, November 2020, pp. 159-164. [pre-print] [post-print]
Xiaoxue Gao, Xiaohai Tian, Yi Zhou, Rohan Kumar Das and Haizhou Li, “Personalized Singing Voice Generation Using WaveRNN” in Proc. The Speaker and Language Recognition Workshop (Odyssey 2020), Tokyo, Japan, November 2020, pp. 252-258. [samples] [post-print]
Rohan Kumar Das and Haizhou Li, “On the Importance of Vocal Tract Constriction for Speaker Characterization: The Whispered Speech Study” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2020, Barcelona, Spain, May 2020, pp. 7119-7123. [pre-print] [post-print]
Rohan Kumar Das, Jichen Yang and Haizhou Li, “Assessing the Scope of Generalized Countermeasures for Anti-spoofing” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2020, Barcelona, Spain, May 2020, pp. 6589-6593. [pre-print] [post-print]
Xuehao Zhou, Xiaohai Tian, Grandee Lee, Rohan Kumar Das and Haizhou Li, “End-to-end Code-switching TTS with Cross-lingual Language Model” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2020, Barcelona, Spain, May 2020, pp. 7614-7618. [post-print] [samples]
Rohan Kumar Das, Jichen Yang and Haizhou Li, “Long Range Acoustic and Deep Features Perspective on ASVspoof 2019” in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop 2019, Sentosa Island, Singapore, December 2019, pp. 1018-1025. [pre-print] [post-print]
Yi Zhou, Xiaohai Tian, Emre Yılmaz, Rohan Kumar Das and Haizhou Li, “A Modularized Neural Network with Language-specific Output Layers for Cross-lingual Voice Conversion” in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop 2019, Sentosa Island, Singapore, December 2019, pp. 160-167. [pre-print] [post-print] [samples]
Rohan Kumar Das, Jichen Yang and Haizhou Li, “Speaker Clustering with Penalty Distance for Speaker Verification with Multi-Speaker Speech” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2019, Lanzhou, China, November 2019, pp. 1630-1635. [pre-print] [post-print]
Yitong Liu, Rohan Kumar Das and Haizhou Li, “Multi-band Spectral Entropy Information for Detection of Replay Attacks” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2019, Lanzhou, China, November 2019, pp. 838-843. [pre-print] [post-print]
Yi Zhou, Xiaohai Tian, Rohan Kumar Das and Haizhou Li, “Many-to-many Cross-lingual Voice Conversion with a Jointly Trained Speaker Embedding Network” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2019, Lanzhou, China, November 2019, pp. 1282-1287. [pre-print] [post-print] [samples]
Xiaoxue Gao, Xiaohai Tian, Rohan Kumar Das, Yi Zhou and Haizhou Li, “Speaker-independent Spectral Mapping for Speech-to-Singing Conversion” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2019, Lanzhou, China, November 2019, pp. 159-164. [pre-print] [post-print] [samples]
Rohan Sheelvant, Bidisha Sharma, Maulik Madhavi, Rohan Kumar Das, S. R. M. Prasanna and Haizhou Li, “RSL2019: A Realistic Speech Localization Corpus” in Proc. Oriental COCOSDA 2019, Cebu City, Philippines, October 2019, pp. 1-6. [pre-print] [post-print] [database]
Rohan Kumar Das and Haizhou Li, “Instantaneous Phase and Long-term Acoustic Cues for Orca Activity Detection” in Proc. Interspeech 2019, Graz, Austria, September 2019, pp. 2418-2422. [post-print]
Rohan Kumar Das, Jichen Yang and Haizhou Li, “Long Range Acoustic Features for Spoofed Speech Detection” in Proc. Interspeech 2019, Graz, Austria, September 2019, pp. 1058-1062. [post-print]
Bidisha Sharma, Rohan Kumar Das and Haizhou Li, “On the Importance of Audio-source Separation for Singer Identification in Polyphonic Music” in Proc. Interspeech 2019, Graz, Austria, September 2019, pp. 2020-2024. [post-print]
Bidisha Sharma, Rohan Kumar Das and Haizhou Li, “Multi-level Adaptive Speech Activity Detector for Speech in Naturalistic Environments” in Proc. Interspeech 2019, Graz, Austria, September 2019, pp. 2015-2019. [post-print] [codes]
Tianchi Liu, Maulik Madhavi, Rohan Kumar Das and Haizhou Li, “A Unified Framework for Speaker and Utterance Verification” in Proc. Interspeech 2019, Graz, Austria, September 2019, pp. 4320-4324. [post-print] [recipe]
Kong Aik Lee, Ville Hautamäki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Héctor Delgado and Massimiliano Todisco, “I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences” in Proc. Interspeech 2019, Graz, Austria, September 2019, pp. 1497-1501. [post-print]
Jibin Wu, Zihan Pan, Malu Zhang, Rohan Kumar Das, Yansong Chua and Haizhou Li, “Robust Sound Recognition: A Neuromorphic Approach” in Proc. Interspeech 2019, Graz, Austria, September 2019, pp. 3667-3668. [post-print] [demo video]
Sarfaraz Jelil, Abhishek Shrivastava, Rohan Kumar Das, S. R. M. Prasanna and Rohit Sinha, “SpeechMarker: A Voice based Multi-level Attendance Application” in Proc. Interspeech 2019, Graz, Austria, September 2019, pp. 3665-3666. [post-print] [demo video]
Yi Zhou, Xiaohai Tian, Haihua Xu, Rohan Kumar Das and Haizhou Li, “Cross-Lingual Voice Conversion with Bilingual Phonetic Posteriorgram and Average Modeling” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2019, Brighton, United Kingdom, May 2019, pp. 6790-6794. [pre-print] [post-print] [samples]
Longting Xu, Rohan Kumar Das, Emre Yılmaz, Jichen Yang and Haizhou Li, “Generative x-vectors for Text-independent Speaker Verification” in Proc. IEEE Spoken Language Technology (SLT) 2018, Athens, Greece, December 2018, pp. 1014-1020. [pre-print] [post-print]
Rohan Kumar Das and Haizhou Li, “Instantaneous Phase and Excitation Source Features for Detection of Replay Attacks” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2018, Honolulu, Hawaii, USA, November 2018, pp. 1030-1037. [pre-print] [post-print]
Rohan Kumar Das, Maulik Madhavi and Haizhou Li, “Compensating Utterance Information in Fixed Phrase Speaker Verification” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2018, Honolulu, Hawaii, USA, November 2018, pp. 1708-1712. [pre-print] [post-print]
Jichen Yang, Rohan Kumar Das and Haizhou Li, “Extended Constant-Q Cepstral Coefficients for Detection of Spoofing Attacks” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2018, Honolulu, Hawaii, USA, November 2018, pp. 1024-1029. [pre-print] [post-print]
Rohan Kumar Das and S. R. M. Prasanna, “Investigating Text-independent Speaker Verification from Practically Realizable System Perspective” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2018, Honolulu, Hawaii, USA, November 2018, pp. 1483-1487. [pre-print] [post-print]
Kantheti Srinivas, Rohan Kumar Das and Hemant A. Patil, “Combining Phase-based Features for Replay Spoof Detection” in Proc. International Symposium on Chinese Spoken Language Processing (ISCSLP) 2018, Taipei, Taiwan, November 2018, pp. 151-155. [post-print]
Xiaoxue Gao, Berrak Sisman, Rohan Kumar Das and Karthika Vijayan, “NUS-HLT Spoken Lyrics and Singing (SLS) Corpus” in Proc. International Conference on Orange Technologies (ICOT) 2018, Bali, Indonesia, October 2018, pp. 1-6. [pre-print] [post-print]
Biswajit Dev Sarma, Rohan Kumar Das, Abhishek Dey and Risto Haukioja, “Analysis of Speech Emotions in Realistic Environments” in Proc. Speech, Music and Mind (SMM) 2018, a satellite event of Interspeech 2018, Hyderabad, India, September 2018, pp. 11-15. [post-print]
Kuruvachan K. George, Rohan Kumar Das, Sarfaraz Jelil, K. Arun Das, C. Santhosh Kumar, S. R. M. Prasanna and Ashish Panda, “AMRITATCS-IITGUWAHATI Combined System for the Speakers in the Wild (SITW) Speaker Recognition Challenge” in Proc. IEEE TENCON 2016, Singapore, November 2016, pp. 2842-2846. [post-print] [slides]
Akhil Babu Manam, Tummala Sai Revanth, Rohan Kumar Das and S. R. M. Prasanna, “Speaker Verification using Acoustic Factor Analysis with Phonetic Content Compensation in Limited and Degraded Test Conditions” in Proc. IEEE TENCON 2016, Singapore, November 2016, pp. 1402-1406. [post-print]
Salil Mamodiya, Lav Kumar, Rohan Kumar Das and S. R. M. Prasanna, “Exploring Acoustic Factor Analysis for Limited Test Data Speaker Verification” in Proc. IEEE TENCON 2016, Singapore, November 2016, pp. 1397-1401. [post-print]
Rohan Kumar Das and S. R. M. Prasanna, “Text-independent Speaker Verification with Limited Test Data from the Perspective of Practical Systems” in Proc. 2nd Doctoral Consortium, Interspeech 2016, ICSI, Berkeley, California, September 2016. [post-print]
Rohan Kumar Das, Sarfaraz Jelil and S. R. M. Prasanna, “Exploring Session Variability and Template Aging in Speaker Verification for Fixed Phrase Short Utterances” in Proc. Interspeech 2016, San Francisco, September 2016, pp. 445-449. [post-print]
Rohan Kumar Das, Sarfaraz Jelil and S. R. M. Prasanna, “Significance of Constraining Text in Limited Data Text-independent Speaker Verification” in Proc. International Conference on Signal Processing and Communications (SPCOM) 2016, IISc Bangalore, June 2016, pp. 1-5. [post-print]
Anupama Paul, Rohan Kumar Das, Rohit Sinha and S. R. M. Prasanna, “Countermeasure to Handle Record and Replay Attacks in Practical Speaker Verification Systems” in Proc. International Conference on Signal Processing and Communications (SPCOM) 2016, Bangalore, June 2016, pp. 1-5. [post-print]
Deepshikha Mahanta, Anupama Paul, Ramesh K. Bhukya, Rohan Kumar Das, Rohit Sinha and S. R. M. Prasanna, “Warping Path and Gross Spectrum Information for Speaker Verification under Degraded Condition” in Proc. 22nd National Conference on Communications (NCC) 2016, IIT Guwahati, March 2016, pp. 1-6. [post-print]