I serve as a reviewer for ICASSP, Interspeech, IJCNN, EUSIPCO, DSP, the Audio Engineering Society Convention, IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE Signal Processing Letters, Neurocomputing, IEEE Journal of Selected Topics in Signal Processing, Speech Communication, IEEE Access, IEEE Transactions on Emerging Topics in Computational Intelligence, IEEE Transactions on Signal Processing, IEEE Transactions on Multimedia, IEEE Open Journal of Signal Processing, etc.
I have two ESI Highly Cited Papers (IEEE journals). I received the 2018 IEEE SPS Best Paper Award for my work on deep-learning-based speech enhancement. I am a member of the IEEE Signal Processing Society Speech and Language Processing Technical Committee (SLTC) (2023–2025). I was ranked among the World's Top 2% Scientists in 2022 by Stanford University. I am an IEEE Senior Member.
Meta, USA AI Research Scientist 2024.12 – present
Working on smart glasses and large language models (LLMs).
Tencent America (AI Lab), Bellevue, WA, USA Principal Research Scientist and Tech Lead 2023.12 – 2024.12
I led a multilingual ASR team supporting the development of a GPT-4o-like product built on a large language model (LLM).
Tencent America (AI Lab), Bellevue, WA, USA Principal Research Scientist 2021 – 2024
Multi-channel speech enhancement/separation/dereverberation/speech recognition; I proposed the ADL-MVDR and RNN beamformers.
Tencent America (AI Lab), Bellevue, WA, USA Senior Research Scientist 2018 – 2021
Multimodal speech enhancement/separation/dereverberation/speech recognition.
University of Surrey, Guildford, UK Full-time Research Fellow 2016 – 2018
Environmental sound classification and analysis based on deep learning (DNNs/CNNs/LSTMs, attention, reinforcement learning, generative adversarial networks, etc.).
iFLYTEK, China Research Scientist 2015 – 2016
Worked on far-field speech recognition for smart speakers.
Georgia Institute of Technology, USA Visiting Student 2014 – 2015
Deep-neural-network-based speech enhancement and its application to automatic speech recognition (ASR); my advisor was Prof. Chin-Hui Lee.
Bosch Research Center, CA, USA Intern Oct. 2014 – Nov. 2014
Deep-neural-network-based speech enhancement and its application to automatic speech recognition (ASR). Supervised by Dr. Pongtep Angkititrakul and Dr. Fuliang Weng.
National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China (USTC), China Jul. 2012 – Jun. 2015
DNN-based speech enhancement, co-supervised by Prof. Chin-Hui Lee (Georgia Tech)
I developed a large-vocabulary continuous speech recognition (LVCSR) system trained on a 2,300-hour English speech database and built a baseline for OOV term detection. I also built MLE, DT, and Tandem systems.
National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China (USTC), China Graduate student Sept. 2010 – Jul. 2012
Working on spoken term detection (STD) for out-of-vocabulary (OOV) words, I used a tri-phone confusion matrix and a hybrid fragment/syllable system to improve OOV term detection performance.
National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China (USTC), China Undergraduate student Mar. 2010 – Jul. 2010
For my undergraduate thesis, I worked on a project on room acoustic impulse responses.