LiLAB - Research & Publications

Research & Publications at LiLAB

2026

Journal Articles

Shi X, He J, Li X, Toda T. A Comprehensive Study on the Effectiveness of ASR Representations for Noise-Robust Speech Emotion Recognition. IEEE Transactions on Audio, Speech and Language Processing. vol. 34, pp. 707-722, 2026, doi: 10.1109/TASLPRO.2026.3654273.
Shenzhi Li, Xingfeng Li, Hao Zhu, Chao Li, Peng Wang, PFDBooster: A Unified Post-Image Fusion Dual-Domain Boosting Paradigm, Knowledge-Based Systems, 2026, 115577, ISSN 0950-7051, https://doi.org/10.1016/j.knosys.2026.115577.
X. Li, N. Luo, F. Yu, X. Shi, J. Li and Y. Liu, "Multi-Task Deep Learning with Over-Sampling and Style Randomization for Improved Cross-Regional Bird Vocalization Recognition," in IEEE Transactions on Audio, Speech and Language Processing, doi: 10.1109/TASLPRO.2026.3675794.
X. Shi, X. Li and T. Toda, "Emotion Similarity and Shift: Modeling Temporal Dynamic Interactions for Emotion Prediction in Conversation," in IEEE Transactions on Audio, Speech and Language Processing, doi: 10.1109/TASLPRO.2026.3684422.

2025

Journal Articles

Xingfeng Li, Ningfeng Luo, Feifei Yu, Junjie Li, Kai Li, Yongwei Li, Zhen Zhao, Yang Liu, Xiaohan Shi, "Human Auditory Representation Learning for cross-dialect bird species recognition" in Ecological Informatics, Volume 93, February 2026, 103554. https://doi.org/10.1016/j.ecoinf.2025.103554
J. He, X. Shi, C. H. Hu, J. Mi, X. Li and T. Toda, "M4SER: Multimodal, Multirepresentation, Multitask, and Multistrategy Learning for Speech Emotion Recognition," in IEEE Transactions on Audio, Speech and Language Processing, vol. 33, pp. 4055-4070, 2025, doi: 10.1109/TASLPRO.2025.3614428.
Yang Liu, Xin Chen, Yarong Li, Jie Ma, Xiaoqi Yang, Yuan Song, Xiaolei Meng, Yongwei Li, Xingfeng Li, Zhen Zhao, "Enhanced Speech Emotion Recognition in Noisy Environments: Adaptive Emotion Denoising Diffusion Approach With Iterative Confidence Learning Strategy," in IEEE Internet of Things Journal, vol. 12, no. 20, pp. 43241-43254, 15 Oct. 2025, doi: 10.1109/JIOT.2025.3595096.
K. Li, K. Zaman, X. Li, M. Akagi, J. Dang, and M. Unoki, "Machine Anomalous Sound Detection Using Spectral-Temporal Modulation Representations Derived From Machine-Specific Filterbanks," in IEEE Transactions on Audio, Speech and Language Processing, vol. 33, pp. 2059-2073, 2025, doi: 10.1109/TASLPRO.2025.3570956.
Y. Liu, X. Chen, Z. Peng, Y. Li, X. Li, P. Song, M. Unoki, and Z. Zhao, "Enhancing Speech Emotion Recognition With Conditional Emotion Feature Diffusion and Progressive Interleaved Learning Strategy," in IEEE Transactions on Audio, Speech and Language Processing, vol. 33, pp. 1787-1800, 2025, doi: 10.1109/TASLPRO.2025.3561606.
Gao, Shun and Xia, Yan and Li, Xingfeng and Cui, Feifei and Zhang, Qingchen and Zou, Quan and Zhang, Zilong, "ACP-ESM2: Enhancing Anticancer Peptide Prediction With Pre-Trained Protein Language Models," in IEEE Transactions on Computational Biology and Bioinformatics, vol. 22, no. 3, pp. 1041-1051, May-June 2025, doi: 10.1109/TCBBIO.2025.3547952.

Conference Proceedings

Liu, Xiaokang, Xingfeng Li, Yudong Yang, Lan Wang, and Nan Yan. "Addressing Task Conflicts in Stuttering Detection via MMoE-Based Multi-Task Learning." In Proc. Interspeech 2025, pp. 798-802. 2025.
Shi, Xiaohan, Xingfeng Li, and Tomoki Toda. "Who, When, and What: Leveraging the “Three Ws” Concept for Emotion Recognition in Conversation." In Proc. Interspeech 2025, pp. 1763-1767. 2025.
Shi, Xiaohan, Xingfeng Li, and Tomoki Toda. "Speaker-Aware Multi-Task Learning for Speech Emotion Recognition." In Proc. Interspeech 2025, pp. 4333-4337. 2025.
Shi, Xiaohan, Jinyi Mi, Xingfeng Li, and Tomoki Toda. "Advancing emotion recognition via ensemble learning: Integrating speech, context, and text representations." In Proc. Interspeech 2025, pp. 4693-4697. 2025.
X. Li and J. Li, "Valence-Arousal Emotion Recognition Using a Deep Three-Layer Model with Aural Perceptual Representations," 2025 IEEE International Conference on Big Data (BigData), Macau, China, 2025, pp. 1964-1973, doi: 10.1109/BigData66926.2025.11401009.
X. Li and F. Yu, "Phase-Aware Spectrogram Fusion with Dual-Stream Residual Networks for Underwater Acoustic Recognition," 2025 IEEE International Conference on Big Data (BigData), Macau, China, 2025, pp. 1954-1963, doi: 10.1109/BigData66926.2025.11402377.
X. Li and J. Kang, "Musically-Inspired Colored Pitch Features for Emotion and Speaker Recognition in Speech," 2025 IEEE International Conference on Big Data (BigData), Macau, China, 2025, pp. 7493-7502, doi: 10.1109/BigData66926.2025.11401704.

2024

Journal Articles

Yidi Sun, Lingling Kong, Jiayi Huang, Hongyan Deng, Xinling Bian, Xingfeng Li, Feifei Cui, Lijun Dou, Chen Cao, Quan Zou, Zilong Zhang, A comprehensive survey of dimensionality reduction and clustering methods for single-cell and spatial transcriptomics data, Briefings in Functional Genomics, Volume 23, Issue 6, November 2024, Pages 733–744, https://doi.org/10.1093/bfgp/elae023
Xiangrun LI, Qiyu SHENG, Guangda ZHOU, Jialong WEI, Yanmin SHI, Zhen ZHAO, Yongwei LI, Xingfeng LI, Yang LIU, "Pool-Unet: A Novel Tongue Image Segmentation Method Based on Pool-Former and Multi-Task Mask Learning" in IEICE TRANSACTIONS on Fundamentals, vol. E107-A, no. 10, pp. 1609-1620, October 2024, doi: 10.1587/transfun.2024EAP1015.
Fu, Xiuhao and Duan, Hao and Zang, Xiaofeng and Liu, Chunling and Li, Xingfeng and Zhang, Qingchen and Zhang, Zilong and Zou, Quan and Cui, Feifei, "Hyb_SEnc: An Antituberculosis Peptide Predictor Based on a Hybrid Feature Vector and Stacked Ensemble Learning," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 21, no. 6, pp. 1897-1910, Nov.-Dec. 2024, doi: 10.1109/TCBB.2024.3425644.
Yidi Sun, Lingling Kong, Jiayi Huang, Hongyan Deng, Xinling Bian, Xingfeng Li, Feifei Cui, Lijun Dou, Chen Cao, Quan Zou, Zilong Zhang, msBERT-Promoter: a multi-scale ensemble predictor based on BERT pre-trained model for the two-stage prediction of DNA promoters and their strengths. BMC Biol 22, 126 (2024). https://doi.org/10.1186/s12915-024-01923-z
Duan H, Zhang Y, Qiu H, Fu X, Liu C, Zang X, Xu A, Wu Z, Li X, Zhang Q, Zhang Z. Machine learning-based prediction model for distant metastasis of breast cancer. Computers in Biology and Medicine. 2024 Feb 1;169:107943.

Conference Proceedings

J. Yuan, X. Li, Z. Zhang, Q. Zhang, Q. Zou and F. Cui, "RNASite: A one-stop tool website that integrates multiple RNA modification site databases and servers," 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Lisbon, Portugal, 2024, pp. 2816-2820, doi: 10.1109/BIBM62325.2024.10822116.
X. Shi, Y. Gao, J. He, J. Mi, X. Li and T. Toda, "A Study on Multimodal Fusion and Layer Adapter in Emotion Recognition," 2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Macau, Macao, 2024, pp. 1-6, doi: 10.1109/APSIPAASC63619.2025.10848773.
Li, Xingfeng and Shi, Xiaohan and Si, Yuke and Zhang, Zilong and Cui, Feifei and Li, Yongwei and Liu, Yang and Unoki, Masashi and Akagi, Masato, "BEES: A New Acoustic Task for Blended Emotion Estimation in Speech," 2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Macau, Macao, 2024, pp. 1-6, doi: 10.1109/APSIPAASC63619.2025.10848842.
Shi X, Li X, Toda T. Multimodal Fusion of Music Theory-Inspired and Self-Supervised Representations for Improved Emotion Recognition. In Annual Conference of the International Speech Communication Association 2024 (pp. 2024-2350). ISCA.
He J, Shi X, Li X, Toda T. Mf-aed-aec: Speech emotion recognition by leveraging multimodal fusion, asr error detection, and asr error correction. In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024 Apr 14 (pp. 11066-11070). IEEE.

2023

Journal Articles

Li X, Shi X, Hu D, Li Y, Zhang Q, Wang Z, Unoki M, Akagi M. Music theory-inspired acoustic representation for speech emotion recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2023 Jun 26;31:2534-47.

Conference Proceedings

Jingguang Tian, Desheng Hu, Xiaohan Shi, Jiajun He, Xingfeng Li, Yuan Gao, Tomoki Toda, Xinkang Xu, and Xinhui Hu. 2023. Semi-supervised Multimodal Emotion Recognition with Consensus Decision-making and Label Correction. In Proceedings of the 1st International Workshop on Multimodal and Responsible Affective Computing (MRAC '23). Association for Computing Machinery, New York, NY, USA, 67–73. https://doi.org/10.1145/3607865.3613182
Shi X, Li X, Toda T. Emotion awareness in multi-utterance turn for improving emotion prediction in multi-speaker conversation. In Proc. Interspeech 2023 (Vol. 2023, pp. 765-769).

2022 and Earlier

Journal Articles

Peng Z, Li X, Zhu Z, Unoki M, Dang J, Akagi M. Speech emotion recognition using 3d convolutions and attention-based sliding recurrent networks with auditory front-ends. IEEE Access. 2020 Jan 20;8:16560-72.
Li X, Akagi M. Improving multilingual speech emotion recognition by combining acoustic features in a three-layer model. Speech Communication. 2019 Jul 1;110:1-2.

Conference Proceedings

Li X, Guo T, Hu X, Xu X, Dang J, Akagi M. Hierarchical Prosody Analysis Improves Categorical and Dimensional Emotion Recognition. In 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2021 Dec 14 (pp. 700-704). IEEE.
Li X, Akagi M. The Contribution of Acoustic Features Analysis to Model Emotion Perceptual Process for Language Diversity. Proc. Interspeech 2019. 2019:3262-6.
Li X, Akagi M. Maximal Information Coefficient and Predominant Correlation-Based Feature Selection Toward A Three-Layer Model for Speech Emotion Recognition. In 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2018 Nov 12 (pp. 1428-1434). IEEE.
Li X, Akagi M. A Three-Layer Emotion Perception Model for Valence and Arousal-Based Detection from Multilingual Speech. Proc. Interspeech 2018. 2018:3643-7.
Li X, Akagi M. Multilingual Speech Emotion Recognition System Based on a Three-Layer Model. In Interspeech 2016 Sep 8 (pp. 3608-3612).
Li X, Akagi M. Automatic Speech Emotion Recognition in Chinese Using a Three-layered Model in Dimensional Approach. In 2016 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP'16) 2016 (pp. 17-20). 信号処理学会.
Li X, Akagi M. Toward improving estimation accuracy of emotion dimensions in bilingual scenario based on three-layered model. In 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE) 2015 Oct 28 (pp. 21-26). IEEE.
