My name is Hao Cheng.
I'm a researcher at Microsoft Research and Affiliate Faculty at the University of Washington.
Prior to this, I completed my PhD at the University of Washington working with Mari Ostendorf, and got my MSc under the supervision of Dale Schuurmans and Csaba Szepesvári at the University of Alberta. I am a proud member of Sounding Board, the 2017 Alexa Prize winner!
Email (for company related): {my_last_name}.Hao@microsoft.com
Email (others): {my_first_name}cheng@outlook.com
[Recruiting] Are you passionate about the world of test-time scaling and self-improving models? We're on the lookout for talented interns and students to join our dynamic team at Microsoft Research! Feel free to email me if you're interested.
Updates:
Test-time scaling could be the next step toward unlocking self-improvement in LLMs. Check out our recent efforts in search-augmented learning and on-the-fly parameter updates.
Struggling to prompt LLMs to reliably use context for generation? Our paper proposes an effective solution: combining text prompting with attention ops.
Is inference with MoE models too costly? Keep the diversity when pruning. Check out our MoE pruning method with improved efficiency and minimal performance drop.
Curious about how to properly scale MoE models with scalable training? Check out GRIN for techniques and insights into building Phi-MoE.
Is long-context handling too costly when many queries share a common prefix (e.g., a long document or conversation)? Try parallel context encoding! Check out PiD, a new method for efficient long-context handling.
How can different tools and LMs be orchestrated to solve various tasks? Check out LLaVA-Plus and OrchestraLLM.
Professional Service
Organizing Committee
Volunteer Chairs for NAACL 2021
Program Committee & Editorial Team
Area Chair/Meta-Reviewer: ACL Rolling Review (2024), ACL (2023), EMNLP (2023, 2022), AAAI (2023), COLING (2022)
Reviewer:
--[Journal] Transactions of the Association for Computational Linguistics (TACL)
--[Conference] ICLR (2025), NeurIPS (2023, 2024), ACL Rolling Review (2021), ACL (2017-2022), EMNLP (2019-2021), NAACL (2019, 2021), AACL (2020), COLING (2018), IJCAI (2015).
[Preprint]
Generative Adapter: Contextualizing Language Models in Parameters with A Single Forward Pass
Tong Chen, Hao Fang, Patrick Xia, Xiaodong Liu, Benjamin Van Durme, Luke Zettlemoyer, Jianfeng Gao, Hao Cheng.
ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning
Xiao Yu, Baolin Peng, Vineeth Vajipey, Hao Cheng, Michel Galley, Jianfeng Gao, Zhou Yu.
ReasonAgain: Using Extractable Symbolic Programs to Evaluate Mathematical Reasoning
Xiaodong Yu, Ben Zhou, Hao Cheng, Dan Roth.
CorrectionLM: Self-Corrections with SLM for Dialogue State Tracking
Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf.
LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy
Rongzhi Zhang, Kuang Wang, Liyuan Liu, Shuohang Wang, Hao Cheng, Chao Zhang, Yelong Shen.
Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities
Chung-En Sun, Xiaodong Liu, Weiwei Yang, Tsui-Wei Weng, Hao Cheng, Aidan San, Michel Galley, Jianfeng Gao.
Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts
Zeliang Zhang, Xiaodong Liu, Hao Cheng, Chenliang Xu, Jianfeng Gao.
Model Tells Itself Where to Attend: Faithfulness Meets Automatic Attention Steering
Qingru Zhang, Xiaodong Yu, Chandan Singh, Xiaodong Liu, Liyuan Liu, Jianfeng Gao, Tuo Zhao, Dan Roth, Hao Cheng.
Liyuan Liu, Young Jin Kim, Shuohang Wang, Chen Liang, Yelong Shen, Hao Cheng, Xiaodong Liu, Masahiro Tanaka, Xiaoxia Wu, Wenxiang Hu, Vishrav Chaudhary, Zeqi Lin, Chenruidong Zhang, Jilong Xue, Hany Awadalla, Jianfeng Gao, Weizhu Chen.
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Microsoft.
Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks
Bo-Ru Lu, Nikita Haduong, Chien-Yu Lin, Hao Cheng, Noah A. Smith, Mari Ostendorf.
[2024]
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Shilong Liu, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang, Jianfeng Gao, Chunyuan Li.
In Proc. European Conference on Computer Vision (ECCV), 2024.
Does Collaborative Human-LM Dialogue Generation Help Information Extraction from Human Dialogues?
Bo-Ru Lu, Nikita Haduong, Chia-Hsuan Lee, Zeqiu Wu, Hao Cheng, Paul Koester, Jean Utke, Tao Yu, Noah A. Smith, Mari Ostendorf.
In Proc. Conference on Language Modeling (COLM), 2024.
DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation
Yiqing Xie, Sheng Zhang, Hao Cheng, Pengfei Liu, Zelalem Gero, Cliff Wong, Tristan Naumann, Hoifung Poon, Carolyn Rose.
In Proc. Assoc. for Computational Linguistics (ACL), 2024.
OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking
Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf.
In Proc. Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), 2024.
Xiaodong Yu, Hao Cheng, Xiaodong Liu, Dan Roth, Jianfeng Gao.
In Findings of Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT-Findings), 2024.
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, Jianfeng Gao.
In Proc. International Conference on Learning Representations (ICLR), 2024.
Fast-ELECTRA for Efficient Pre-training
Chengyu Dong, Liyuan Liu, Hao Cheng, Jingbo Shang, Jianfeng Gao, Xiaodong Liu.
In Proc. International Conference on Learning Representations (ICLR), 2024.
Language Models as Inductive Reasoners
Zonglin Yang, Li Dong, Xinya Du, Hao Cheng, Erik Cambria, Xiaodong Liu, Jianfeng Gao, Furu Wei.
In Proc. Conf. of the European Chapter of Assoc. for Computational Linguistics (EACL), 2024.
[2023]
Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
Pan Lu, Baolin Peng, Hao Cheng, Michel Galley, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Jianfeng Gao.
In Proc. of the Neural Information Processing Systems (NeurIPS), 2023.
Augmenting Language Models with Long-Term Memory
Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei.
In Proc. of the Neural Information Processing Systems (NeurIPS), 2023.
Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding
Yu Zhang*, Hao Cheng*, Zhihong Shen, Xiaodong Liu, Ye-Yi Wang, Jianfeng Gao.
In Findings of Conf. Empirical Methods in Natural Language Processing (EMNLP-Findings), 2023.
Understand and Modularize Generator Optimization in ELECTRA-style Pretraining
Chengyu Dong, Liyuan Liu, Hao Cheng, Jingbo Shang, Jianfeng Gao, Xiaodong Liu.
In Proc. International Conference on Machine Learning (ICML), 2023.
Chain-of-Skills: A Configurable Model for Open-domain Question Answering
Kaixin Ma*, Hao Cheng*, Yu Zhang, Xiaodong Liu, Eric Nyberg, Jianfeng Gao. [*Equal contribution]
In Proc. Assoc. for Computational Linguistics (ACL), 2023.
Hao Cheng, Hao Fang, Xiaodong Liu, Jianfeng Gao.
In Proc. Assoc. for Computational Linguistics (ACL), 2023.
Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing
Robert Tinn*, Hao Cheng*, Yu Gu, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon. [*Equal contribution]
Patterns, 2023
Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning [Code]
Sheng Zhang, Hao Cheng, Jianfeng Gao, Hoifung Poon.
In Proc. International Conference on Learning Representations (ICLR), 2023.
Visually-Augmented Language Modeling
Weizhi Wang, Li Dong, Hao Cheng, Haoyu Song, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei.
In Proc. International Conference on Learning Representations (ICLR), 2023.
INSCIT: Information-Seeking Conversations with Mixed-Initiative Interactions [Data]
Zeqiu Wu, Ryu Parish, Hao Cheng, Sewon Min, Prithviraj Ammanabrolu, Mari Ostendorf, Hannaneh Hajishirzi.
Transactions of the Association for Computational Linguistics (TACL), 2023.
Self-Verification Improves Few-Shot Clinical Information Extraction
Zelalem Gero, Chandan Singh, Hao Cheng, Tristan Naumann, Michel Galley, Jianfeng Gao, Hoifung Poon.
ICML 3rd Workshop on Interpretable Machine Learning in Healthcare (IMLH), 2023.
[2022]
Open-domain Question Answering via Chain of Reasoning over Heterogeneous Knowledge [Code]
Kaixin Ma*, Hao Cheng*, Xiaodong Liu, Eric Nyberg, Jianfeng Gao. [*Equal contribution]
In Findings of Conf. Empirical Methods in Natural Language Processing (EMNLP-Findings), 2022.
Knowledge-Rich Self-Supervision for Biomedical Entity Linking [Model]
Sheng Zhang*, Hao Cheng*, Shikhar Vashishth*, Cliff Wong, Jinfeng Xiao, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon. [*Equal contribution]
In Findings of Conf. Empirical Methods in Natural Language Processing (EMNLP-Findings), 2022.
Unsupervised Learning of Hierarchical Conversation Structure [Code]
Bo-Ru Lu, Yushi Hu, Hao Cheng, Noah A. Smith, Mari Ostendorf.
In Findings of Conf. Empirical Methods in Natural Language Processing (EMNLP-Findings), 2022.
Open Domain Question Answering with A Unified Knowledge Interface [Code]
Kaixin Ma*, Hao Cheng*, Xiaodong Liu, Eric Nyberg, Jianfeng Gao. [*Equal contribution]
In Proc. Assoc. for Computational Linguistics (ACL), 2022.
Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, Xuedong Huang.
In Proc. International Joint Conference on Artificial Intelligence (IJCAI), 2022.
[2021]
CLUES: Few-Shot Learning Evaluation in Natural Language Understanding
Subhabrata Mukherjee, Xiaodong Liu, Guoqing Zheng, Saghar Hosseini, Hao Cheng, Ge Yang, Christopher Meek, Ahmed Awadallah, Jianfeng Gao.
In Proc. of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS Datasets and Benchmarks), 2021.
Dialogue State Tracking with a Language Model using Schema-Driven Prompting [Code]
Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf.
In Proc. Conf. Empirical Methods in Natural Language Processing (EMNLP), 2021.
Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature
Yu Wang*, Jinchao Li*, Tristan Naumann*, Chenyan Xiong, Hao Cheng, Robert Tinn, Cliff Wong, Naoto Usuyama, Richard Rogahn, Zhihong Shen, Yang Qin, Eric Horvitz, Paul N. Bennett, Jianfeng Gao, and Hoifung Poon. [*Equal contribution]
In Proc. of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD '21)
UnitedQA: A Hybrid Approach for Open Domain Question Answering [Code]
Hao Cheng*, Yelong Shen*, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao. [*Equal contribution]
In Proc. Assoc. for Computational Linguistics (ACL), 2021.
Posterior Differential Regularization with f-divergence for Improving Model Robustness [Code]
Hao Cheng, Xiaodong Liu, Lis Pereira, Yaoliang Yu, Jianfeng Gao.
In Proc. Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), 2021.
Targeted Adversarial Training for Natural Language Understanding
Lis Pereira*, Xiaodong Liu*, Hao Cheng, Hoifung Poon, Jianfeng Gao, Ichiro Kobayashi.
In Proc. Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), 2021. [*Equal contribution]
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
Yu Gu*, Robert Tinn*, Hao Cheng*, Michael Lucas, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon. 2021 [*Equal contribution]
ACM Transactions on Computing for Healthcare
[2020]
Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering [Code]
Hao Cheng, Ming-Wei Chang, Kenton Lee, Kristina Toutanova.
In Proc. Assoc. for Computational Linguistics (ACL), 2020.
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao.
In Proc. Assoc. for Computational Linguistics (ACL), demo, 2020.
Adversarial Training for Large Neural Language Models
Xiaodong Liu, Hao Cheng, Pengcheng He, Weizhu Chen, Yu Wang, Hoifung Poon, Jianfeng Gao. 2020
[Selected Before 2020]:
A Dynamic Speaker Model for Conversational Interactions [Code]
Hao Cheng, Hao Fang, Mari Ostendorf.
In Proc. Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), 2019.
Sounding Board: A User-Centric and Content-Driven Social Chatbot
Hao Fang, Hao Cheng, Maarten Sap, Elizabeth Clark, Ari Holtzman, Yejin Choi, Noah A. Smith, Mari Ostendorf.
In Proc. Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), demo, 2018.
Bi-directional Attention with Agreement for Dependency Parsing [Code]
Hao Cheng, Hao Fang, Xiaodong He, Jianfeng Gao, Li Deng.
In Proc. Conf. Empirical Methods in Natural Language Processing (EMNLP), 2016.
Scalable and Sound Low-Rank Tensor Learning [Code]
Hao Cheng, Yaoliang Yu, Xinhua Zhang, Eric Xing, Dale Schuurmans.
In Proc. Conf. Artificial Intelligence and Statistics (AISTATS), 2016.
Open-Domain Name Error Detection using a Multi-Task RNN
Hao Cheng, Hao Fang, Mari Ostendorf.
In Proc. Conf. Empirical Methods in Natural Language Processing (EMNLP), 2015.
Code
Teaching @ UW
[Instructor] [Grad] E596/LING 580: Conversational AI (course webpage) [Spring 2019]
[TA] [Grad] E596/LING 580: Conversational AI (course webpage) [Spring 2018]
[TA] [Grad] EE 511: Introduction to Statistical Learning (course webpage) [Winter 2018]
[TA] [Undergrad] EE 235: Continuous-time Linear Systems [Autumn 2017]
[TA] [Undergrad] EE 341: Discrete-Time Linear Systems [Spring 2016]