I am currently a Technical Staff Member at DeepSeek AI, where I lead the LLM Alignment Team. My team focuses on advancing model capabilities in core areas including Writing, QA, AI Search, General Reasoning, and Safety. My primary research goal is to build artificial general intelligence (AGI) through scaling laws and reinforcement learning.
At DeepSeek, I am deeply involved in the development of the DeepSeek model family, including DeepSeek-V3, DeepSeek-R1, and DeepSeekMath. Most notably, my team pioneered the Group Relative Policy Optimization (GRPO) algorithm and DeepSeek-R1-Zero. These technical breakthroughs laid the foundation for DeepSeek-R1, our work published in Nature, which demonstrates the feasibility of incentivizing complex reasoning capabilities in LLMs via pure reinforcement learning without supervised fine-tuning.
Before joining DeepSeek, I was a Senior Researcher at the Natural Language Computing Group, Microsoft Research Asia (MSRA). During my tenure there, I developed foundational models including WavLM, which was recognized with the IEEE Signal Processing Society (SPS) Best Paper Award. I received my Ph.D. and B.S. degrees from Beihang University, advised by Prof. Ming Zhou and Prof. Zhoujun Li.
To date, I have published over 100 papers in top-tier conferences and journals such as Nature, NeurIPS, ICML, ICLR, ACL, and EMNLP, with more than 30,000 citations on Google Scholar. I was recognized as one of the Top 50 Chinese Young Scholars by Baidu (2022) and received the UNESCO Netexplo Innovation Award (2023).
Email: wumark at 126 dot com
Sept. 2023 - Present, Head of LLM Alignment Team, DeepSeek AI.
March 2021 - Sept. 2023, Senior Researcher, Natural Language Computing Group, Microsoft Research Asia.
June 2019 - March 2021, Researcher, Natural Language Computing Group, Microsoft Research Asia.
July 2013 - Dec. 2017, Natural Language Computing Group, Microsoft Research Asia. Mentor: Wei Wu
Jan. 2018 - Aug. 2018, Natural Language Computing Group, Microsoft Research Asia. Mentor: Furu Wei
Sept. 2018 - May 2019, Natural Language Computing Group, Microsoft Research Asia. Mentor: Shujie Liu
IEEE Signal Processing Society (SPS) Best Paper Award
Top 10 Most Significant Innovations, Netexplo, 2023
Global Top 50 Chinese Young Scholars in NLP, Baidu, 2022
Interspeech Best Student Paper Nomination, 2021
AdeptMind Scholarship, 2018
Microsoft Research Asia Ph.D. Fellowship, 2017
National Scholarship, Beihang University, 2015, 2018