Research

I serve as a reviewer for ICASSP, Interspeech, IJCNN, EUSIPCO, DSP, the Audio Engineering Society conventions, IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE Signal Processing Letters, Neurocomputing, IEEE Journal of Selected Topics in Signal Processing, Speech Communication, IEEE Access, IEEE Transactions on Emerging Topics in Computational Intelligence, IEEE Transactions on Signal Processing, IEEE Transactions on Multimedia, IEEE Open Journal of Signal Processing, among others.

I have two ESI Highly Cited Papers (IEEE journals). I received the 2018 IEEE SPS Best Paper Award for my work on deep learning-based speech enhancement. I am a member of the IEEE Signal Processing Society Speech and Language Technical Committee (SLTC) (2023–2025). I was ranked among the World's Top 2% Scientists in 2022 by Stanford University. I am an IEEE Senior Member.

Tencent America LLC, Bellevue, WA, USA    Principal Research Scientist   2021 – present  Multi-channel speech enhancement, separation, dereverberation, and speech recognition; proposed the ADL-MVDR/RNN beamformer

Tencent America LLC, Bellevue, WA, USA    Senior Research Scientist   2018 – 2021  Multi-modality speech enhancement, separation, dereverberation, and speech recognition

University of Surrey, Guildford, UK    Full-time Research Fellow    2016 – 2018  Deep learning-based (DNN, CNN, LSTM, attention, reinforcement learning, generative adversarial networks, etc.) environmental sound classification and analysis

Georgia Institute of Technology, USA    Visiting Student    2014 – 2015

Deep neural network-based speech enhancement applied to automatic speech recognition (ASR); advisor: Prof. Chin-Hui Lee.

Bosch Research Center, CA, USA    Short-term Intern    Sept. 2014 – Oct. 2014

Deep neural network-based speech enhancement applied to automatic speech recognition (ASR)

Speech Lab, USTC, Hefei, China    Jul. 2012 – Jun. 2015

--- DNN-based speech enhancement, in collaboration with Prof. Chin-Hui Lee (Georgia Tech)

--- Developed a Large Vocabulary Continuous Speech Recognition (LVCSR) system trained on a 2300-hour English speech database and built a baseline for OOV term detection; MLE, DT, and Tandem systems were built

Speech Lab, USTC, Hefei, China     Graduate student Sept. 2010 – Jul. 2012

Worked on Spoken Term Detection (STD) for Out-Of-Vocabulary (OOV) words, using a tri-phone confusion matrix and a hybrid fragment/syllable system to improve the performance of OOV term detection.

Speech Lab, USTC, Hefei, China    Undergraduate Student    Mar. 2010 – Jul. 2010

My undergraduate thesis project was on room acoustic impulse responses.