Xiaodong Liu

Hello! I am currently working at Microsoft Research.

I received my Ph.D. from Nara Institute of Science and Technology (NAIST) in 2015.

Email: xx@microsoft.com, where xx = xiaodl.

I'm looking for interns to work on LM pre-training and model compression in Redmond or Beijing. Please feel free to drop me an email.

Research

I am interested in Statistical Natural Language Processing, Machine Learning and Deep Learning.


Education

2011.10 ~ 2015.3

Ph.D. in Computational Linguistics, Nara Institute of Science and Technology.

Supervisor: Prof. Yuji Matsumoto and Prof. Kevin Duh.

2008.9 ~ 2011.3

M.E. in Signal and Information Processing, Beijing University of Posts and Telecommunications (BUPT).

Supervisor: Prof. Fuji Ren and Prof. Xiaojie Wang.

2009.4 ~ 2011.4

M.E. in Systems Innovation Engineering, University of Tokushima.

Supervisor: Prof. Fuji Ren.

2004.9 ~ 2008.7

B.S. in Information Engineering, Beijing University of Posts and Telecommunications (BUPT).


Experience

Microsoft AI & Research, Redmond, USA, 2017.5 ~ present.

Microsoft Research Asia, Beijing, China, 2015.4 ~ 2017.5.

Microsoft Research, Redmond, USA, 2014.9 ~ 2014.11

Microsoft Research, Redmond, USA, 2014.3 ~ 2014.6

Fujitsu Laboratories, Kawasaki, Japan, 2012.7 ~ 2012.10

Toshiba R&D Center, Beijing, China, 2011.3 ~ 2011.7

IBM, Beijing, China, 2010.7 ~ 2010.9


Professional Activities

Conference Reviewer/Program Committee: TACL, ACL, SIGIR, EMNLP, NAACL, NIPS, COLING, AAAI, ICLR, IJCNLP, NLPKE


Publications

Adversarial Training for Large Neural Language Models

Xiaodong Liu, Hao Cheng, Pengcheng He, Weizhu Chen, Yu Wang, Hoifung Poon and Jianfeng Gao

arXiv: https://arxiv.org/abs/2004.08994

Code: https://github.com/namisan/mt-dnn/tree/master/alum


Understanding the Difficulty of Training Transformers

Liyuan Liu, Xiaodong Liu, Jianfeng Gao, Weizhu Chen and Jiawei Han

arXiv: https://arxiv.org/abs/2004.08249

Code: https://github.com/LiyuanLucasLiu/Transformer-Clinic


UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-training

Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou and Hsiao-Wuen Hon

arXiv: https://arxiv.org/abs/2002.12804

ICML 2020.


RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers

Bailin Wang, Richard Shin, Xiaodong Liu, Oleksandr Polozov, Matthew Richardson

arXiv: https://arxiv.org/abs/1911.04942

ACL 2020, Seattle, USA

Rank #1 on Spider Dataset: https://yale-lily.github.io/spider


SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization

Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Tuo Zhao

arXiv: https://arxiv.org/abs/1911.03437

ACL 2020, Seattle, USA

Rank #1 on the GLUE leaderboard as of Dec 2019.


The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding

Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao and Jianfeng Gao

arXiv: https://arxiv.org/abs/2002.07972

ACL 2020 Demo, Seattle, USA


On the Variance of Adaptive Learning Rate and Beyond

Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao and Jiawei Han

arXiv: https://arxiv.org/abs/1908.03265

ICLR 2020, Addis Ababa, Ethiopia

Code: https://github.com/LiyuanLucasLiu/RAdam


Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding

Xiaodong Liu, Pengcheng He, Weizhu Chen and Jianfeng Gao

arXiv: https://arxiv.org/abs/1904.09482

Code: https://github.com/namisan/mt-dnn

Rank #1 on the GLUE leaderboard on April 1, 2019.

Achieved human parity on GLUE on June 6, 2019.


Multi-Task Deep Neural Networks for Natural Language Understanding

Xiaodong Liu*, Pengcheng He*, Weizhu Chen and Jianfeng Gao

arXiv: https://arxiv.org/abs/1901.11504

Code: https://github.com/namisan/mt-dnn

Rank #1 on the GLUE leaderboard as of Dec 18, 2018 (a 1.8-point absolute improvement over BERT).

Rank #1 on the SNLI leaderboard as of Feb 7, 2019.

Rank #1 on the SciTail leaderboard as of Jan 19, 2019.

Blog post: English, Chinese

ACL 2019, Florence, Italy.


Unified Language Model Pre-training for Natural Language Understanding and Generation

Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou and Hsiao-Wuen Hon

https://arxiv.org/abs/1905.03197

NeurIPS 2019, Vancouver, Canada.


Adversarial Domain Adaptation for Machine Reading Comprehension

Huazheng Wang, Zhe Gan, Xiaodong Liu, Jingjing Liu, Jianfeng Gao, Hongning Wang

EMNLP 2019, Hong Kong, China.


A Hybrid Retrieval-Generation Neural Conversation Model

Liu Yang, Junjie Hu, Minghui Qiu, Chen Qu, Jianfeng Gao, W Bruce Croft, Xiaodong Liu, Yelong Shen and Jingjing Liu

CIKM 2019, Beijing, China.


Stochastic Answer Networks for Natural Language Inference

Xiaodong Liu, Kevin Duh and Jianfeng Gao

arXiv: https://arxiv.org/abs/1804.07888

Code: https://github.com/namisan/mt-dnn

Rank #1 on the MS MARCO ranking leaderboard as of Jan 22, 2019.


Conversing by Reading: Contentful Neural Conversation with On-demand Machine Reading

Lianhui Qin, Michel Galley, Chris Brockett, Xiaodong Liu, Xiang Gao, Bill Dolan, Yejin Choi and Jianfeng Gao

ACL 2019, Florence, Italy.


Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension

Yichong Xu, Xiaodong Liu, Yelong Shen, Jingjing Liu and Jianfeng Gao

NAACL-HLT 2019, Minneapolis, USA.


Cyclical Annealing Schedule: A Simple Approach to Mitigate KL Vanishing

Hao Fu, Chunyuan Li, Xiaodong Liu, Jianfeng Gao, Asli Celikyilmaz and Lawrence Carin

NAACL-HLT 2019, Minneapolis, USA.


Weakly-Supervised Deep Structured Semantic Models for Commonsense Reasoning

Shuohang Wang, Sheng Zhang, Yelong Shen, Xiaodong Liu, Jingjing Liu and Jianfeng Gao

NAACL-HLT 2019, Minneapolis, USA.


ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension

Sheng Zhang, Xiaodong Liu, Jingjing Liu and Jianfeng Gao.

arXiv: https://arxiv.org/abs/1810.12885

Leaderboard: https://sheng-z.github.io/ReCoRD-explorer/


Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models

Minjia Zhang, Xiaodong Liu, Wenhan Wang, Jianfeng Gao and Yuxiong He

NIPS 2018.

http://papers.nips.cc/paper/7868-navigating-with-graph-representations-for-fast-and-scalable-decoding-of-neural-language-models


Stochastic Answer Networks for Machine Reading Comprehension

Xiaodong Liu, Yelong Shen, Kevin Duh and Jianfeng Gao

ACL 2018, Code

arXiv: https://arxiv.org/abs/1712.03556


Language-Based Image Editing with Recurrent Attentive Models

Jianbo Chen, Yelong Shen, Jianfeng Gao, Jingjing Liu and Xiaodong Liu

CVPR 2018.

arXiv: https://arxiv.org/abs/1711.06288


Towards Human-level Machine Reading Comprehension: Reasoning and Inference with Multiple Strategies

Yichong Xu, Jingjing Liu, Jianfeng Gao, Yelong Shen and Xiaodong Liu

arXiv preprint, 2017: https://arxiv.org/abs/1711.04964


An Empirical Analysis of Multiple-Turn Reasoning Strategies in Reading Comprehension Tasks

Yelong Shen, Xiaodong Liu, Kevin Duh and Jianfeng Gao

IJCNLP 2017, Taipei, Taiwan.


Lexical Simplification with the Deep Structured Similarity Model

Lis Pereira, Xiaodong Liu and John Lee

IJCNLP 2017, Taipei, Taiwan.


Representation Learning Using Multi-Task Deep Neural Networks

Xiaodong Liu, Jianfeng Gao, Xiaodong He, Li Deng, Kevin Duh and Ye-Yi Wang

US Patent App. 14/811,808


Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval

Xiaodong Liu, Jianfeng Gao, Xiaodong He, Li Deng, Kevin Duh and Ye-Yi Wang

NAACL-HLT 2015, Denver, USA.


A Hybrid Ranking Approach to Chinese Spelling Check

Xiaodong Liu*, Fei Cheng*, Kevin Duh and Yuji Matsumoto (*equal contribution)

ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), to appear.


Multilingual Topic Models for Bilingual Dictionary Extraction

Xiaodong Liu, Kevin Duh and Yuji Matsumoto

ACM Transactions on Asian Language Information Processing (TALIP), Vol.14, No.3, May 2015.


Learning Character Representations for Chinese Word Segmentation

Xiaodong Liu, Kevin Duh, Tomoya Iwakura and Yuji Matsumoto.

NIPS 2014 Workshop on Modern Machine Learning and Natural Language Processing.


A Hybrid Chinese Spelling Correction System Using Language Model and Statistical Machine Translation with Reranking

Xiaodong Liu, Fei Cheng, Yanyan Luo, Kevin Duh and Yuji Matsumoto

7th SIGHAN Workshop on Chinese Language Processing, Nagoya, Japan, 2013.


Topic Models + Word Alignment = A Flexible Framework for Extracting Bilingual Dictionary from Comparable Corpus

Xiaodong Liu, Kevin Duh and Yuji Matsumoto

Seventeenth Conference on Computational Natural Language Learning (CoNLL 2013), Sofia, Bulgaria, 2013.


A Novel Joint Model of Word Alignment and Hierarchical Dirichlet Process for Statistical Machine Translation

Xiaodong Liu

9th Conference on Bayesian Nonparametrics, poster session, Amsterdam, Netherlands, 2013.


Use Relative Weight to Improve the kNN for Unbalanced Text Category.

Xiaodong Liu, Caixia Yuan and Fuji Ren

6th IEEE International Conference on Natural Language Processing and Knowledge Engineering, Beijing, China, 2010.


Languages

Chinese: native

Japanese: JLPT N1

English