Research
I serve as a reviewer for ICASSP, Interspeech, IJCNN, EUSIPCO, DSP, the Audio Engineering Society conference, IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE Signal Processing Letters, Neurocomputing, IEEE Journal of Selected Topics in Signal Processing, Speech Communication, IEEE Access, IEEE Transactions on Emerging Topics in Computational Intelligence, IEEE Transactions on Signal Processing, IEEE Transactions on Multimedia, IEEE Open Journal of Signal Processing, etc.
I have two ESI highly cited papers (IEEE journals). I received the 2018 IEEE SPS Best Paper Award for my work on deep learning based speech enhancement. I am a member of the IEEE Signal Processing Society Speech and Language Technical Committee (SLTC) (2023 – 2025). I was ranked among the World's Top 2% Scientists in 2022 by Stanford University. I am an IEEE Senior Member.
Tencent America LLC, Bellevue, WA, USA Principal Research Scientist 2021 – present Multi-channel speech enhancement/separation/de-reverberation/speech recognition; I proposed the ADL-MVDR and RNN beamformers.
Tencent America LLC, Bellevue, WA, USA Senior Research Scientist 2018 – 2021 Multi-modal speech enhancement/separation/de-reverberation/speech recognition
University of Surrey, Guildford, UK Full-time Research Fellow 2016 – 2018 Deep learning (DNN/CNN/LSTM, attention, reinforcement learning, generative adversarial networks, etc.) based environmental sound classification and analysis.
Georgia Institute of Technology, USA Visiting Student 2014 – 2015
Deep neural network based speech enhancement applied to automatic speech recognition (ASR), advised by Prof. Chin-Hui Lee.
Bosch Research Center, CA, USA Short Internship Sept. 2014 – Oct. 2014
Deep neural network based speech enhancement applied to automatic speech recognition (ASR)
Speech Lab, USTC, China Jul. 2012 – Jun. 2015
--- DNN based speech enhancement, in collaboration with Prof. Chin-Hui Lee (Georgia Tech)
--- I developed a Large Vocabulary Continuous Speech Recognition (LVCSR) system trained on a 2300-hour English speech database and built a baseline for OOV term detection; MLE, discriminatively trained (DT), and Tandem systems were built.
Speech Lab, USTC, Hefei, China Graduate student Sept. 2010 – Jul. 2012
Working on Spoken Term Detection (STD) for Out-Of-Vocabulary (OOV) words, I used a tri-phone confusion matrix and a hybrid fragment/syllable system to improve the performance of OOV term detection.
Speech Lab, USTC, Hefei, China Undergraduate student Mar. 2010 – Jul. 2010
I completed my undergraduate thesis project on room acoustic impulse responses.