Xiaodong Liu
Hello! I am currently working at Microsoft Research.
I received my Ph.D. from Nara Institute of Science and Technology (NAIST).
Email: xx@microsoft.com; xx=xiaodl.
Research
I am interested in Statistical Natural Language Processing, Machine Learning and Deep Learning.
Education
2011.10 ~ 2015.3
Ph.D. in Computational Linguistics, Nara Institute of Science and Technology.
Supervisors: Prof. Yuji Matsumoto and Prof. Kevin Duh.
2008.9 ~ 2011.3
M.E. in Signal and Information Processing, Beijing University of Posts and Telecommunications (BUPT).
Supervisors: Prof. Fuji Ren and Prof. Xiaojie Wang.
2009.4 ~ 2011.4
M.E. in Systems Innovation Engineering, University of Tokushima.
Supervisor: Prof. Fuji Ren.
2004.9 ~ 2008.7
B.S. in Information Engineering, Beijing University of Posts and Telecommunications (BUPT).
Experience
Microsoft AI & Research, Redmond, USA, 2017.5 ~ present.
Microsoft Research Asia, Beijing, China, 2015.4 ~ 2017.5.
Microsoft Research, Redmond, USA, 2014.9 ~ 2014.11
Microsoft Research, Redmond, USA, 2014.3 ~ 2014.6
Fujitsu Laboratories, Kawasaki, Japan, 2012.7 ~ 2012.10
Toshiba R&D Center, Beijing, China, 2011.3 ~ 2011.7
IBM, Beijing, China, 2010.7 ~ 2010.9
Professional Activities
Conference Reviewer/Program Committee: TACL, ACL, SIGIR, EMNLP, NAACL, NIPS, COLING, AAAI, ICLR, IJCNLP, NLPKE
Publications
Xiaodong Liu, Hao Cheng, Pengcheng He, Weizhu Chen, Yu Wang, Hoifung Poon and Jianfeng Gao
Adversarial Training for Large Neural Language Models
arXiv: https://arxiv.org/abs/2004.08994
Code: https://github.com/namisan/mt-dnn/tree/master/alum
Liyuan Liu, Xiaodong Liu, Jianfeng Gao, Weizhu Chen and Jiawei Han
Understanding the Difficulty of Training Transformers
arXiv: https://arxiv.org/abs/2004.08249
Code: https://github.com/LiyuanLucasLiu/Transformer-Clinic
Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou and Hsiao-Wuen Hon
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-training
arXiv: https://arxiv.org/abs/2002.12804
ICML 2020.
RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers
Bailin Wang, Richard Shin, Xiaodong Liu, Oleksandr Polozov, Matthew Richardson
arXiv: https://arxiv.org/abs/1911.04942
ACL 2020, Seattle, USA
Rank #1 on Spider Dataset: https://yale-lily.github.io/spider
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Tuo Zhao
arXiv: https://arxiv.org/abs/1911.03437
ACL 2020, Seattle, USA
Rank #1 on the GLUE benchmark as of Dec 2019.
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao and Jianfeng Gao
arXiv: https://arxiv.org/abs/2002.07972
ACL 2020 Demo, Seattle, USA
On the Variance of Adaptive Learning Rate and Beyond
Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao and Jiawei Han
arXiv: https://arxiv.org/abs/1908.03265
ICLR 2020, Addis Ababa, Ethiopia
Code: https://github.com/LiyuanLucasLiu/RAdam
Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding
Xiaodong Liu, Pengcheng He, Weizhu Chen and Jianfeng Gao
arXiv: https://arxiv.org/abs/1904.09482
Code: https://github.com/namisan/mt-dnn
Rank #1 on the GLUE benchmark on April 1, 2019 (Leaderboard).
Achieved human parity on June 6, 2019 (Leaderboard).
Multi-Task Deep Neural Networks for Natural Language Understanding
Xiaodong Liu*, Pengcheng He*, Weizhu Chen and Jianfeng Gao
arXiv: https://arxiv.org/abs/1901.11504
Code: https://github.com/namisan/mt-dnn
Rank #1 on the GLUE benchmark as of Dec 18, 2018, with a 1.8-point absolute improvement over BERT (Leaderboard).
Rank #1 on SNLI as of Feb 7, 2019 (Leaderboard).
Rank #1 on SciTail as of Jan 19, 2019 (Leaderboard).
ACL 2019, Florence, Italy.
Unified Language Model Pre-training for Natural Language Understanding and Generation
Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou and Hsiao-Wuen Hon
arXiv: https://arxiv.org/abs/1905.03197
NeurIPS 2019, Vancouver, Canada.
Adversarial Domain Adaptation for Machine Reading Comprehension
Huazheng Wang, Zhe Gan, Xiaodong Liu, Jingjing Liu, Jianfeng Gao, Hongning Wang
EMNLP 2019, Hong Kong, China.
A Hybrid Retrieval-Generation Neural Conversation Model
Liu Yang, Junjie Hu, Minghui Qiu, Chen Qu, Jianfeng Gao, W Bruce Croft, Xiaodong Liu, Yelong Shen and Jingjing Liu
CIKM 2019, Beijing, China.
Stochastic Answer Networks for Natural Language Inference
Xiaodong Liu, Kevin Duh and Jianfeng Gao
arXiv: http://arxiv.org/abs/1804.07888
Code: https://github.com/namisan/mt-dnn
Rank #1 on MS-MARCO ranking as of Jan 22, 2019 (Leaderboard).
Conversing by Reading: Contentful Neural Conversation with On-demand Machine Reading
Lianhui Qin, Michel Galley, Chris Brockett, Xiaodong Liu, Xiang Gao, Bill Dolan, Yejin Choi and Jianfeng Gao
ACL 2019, Florence, Italy.
Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
Yichong Xu, Xiaodong Liu, Yelong Shen, Jingjing Liu and Jianfeng Gao
NAACL-HLT 2019, Minneapolis, USA.
Cyclical Annealing Schedule: A Simple Approach to Mitigate KL Vanishing
Hao Fu, Chunyuan Li, Xiaodong Liu, Jianfeng Gao, Asli Celikyilmaz and Lawrence Carin
NAACL-HLT 2019, Minneapolis, USA.
Weakly-Supervised Deep Structured Semantic Models for Commonsense Reasoning
Shuohang Wang, Sheng Zhang, Yelong Shen, Xiaodong Liu, Jingjing Liu and Jianfeng Gao
NAACL-HLT 2019, Minneapolis, USA.
ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension
Sheng Zhang, Xiaodong Liu, Jingjing Liu and Jianfeng Gao.
arXiv: https://arxiv.org/abs/1810.12885
Leaderboard: https://sheng-z.github.io/ReCoRD-explorer/
Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models
Minjia Zhang, Xiaodong Liu, Wenhan Wang, Jianfeng Gao and Yuxiong He
NIPS 2018, Montreal, Canada.
Stochastic Answer Networks for Machine Reading Comprehension
Xiaodong Liu, Yelong Shen, Kevin Duh and Jianfeng Gao
arXiv: https://arxiv.org/abs/1712.03556
ACL 2018, Melbourne, Australia.
Language-Based Image Editing with Recurrent Attentive Models
Jianbo Chen, Yelong Shen, Jianfeng Gao, Jingjing Liu and Xiaodong Liu
arXiv: https://arxiv.org/abs/1711.06288
CVPR 2018.
Towards Human-level Machine Reading Comprehension: Reasoning and Inference with Multiple Strategies
Yichong Xu, Jingjing Liu, Jianfeng Gao, Yelong Shen and Xiaodong Liu
arXiv preprint, 2017: https://arxiv.org/abs/1711.04964
An Empirical Analysis of Multiple-Turn Reasoning Strategies in Reading Comprehension Tasks
Yelong Shen, Xiaodong Liu, Kevin Duh and Jianfeng Gao
The Eighth International Joint Conference on Natural Language Processing (IJCNLP 2017), Taipei, Taiwan.
Lexical Simplification with the Deep Structured Similarity Model
Lis Pereira, Xiaodong Liu and John Lee
The Eighth International Joint Conference on Natural Language Processing (IJCNLP 2017), Taipei, Taiwan.
Representation Learning Using Multi-Task Deep Neural Networks
Xiaodong Liu, Jianfeng Gao, Xiaodong He, Li Deng, Kevin Duh and Ye-Yi Wang
US Patent App. 14/811,808
Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval
Xiaodong Liu, Jianfeng Gao, Xiaodong He, Li Deng, Kevin Duh and Ye-Yi Wang
NAACL-HLT 2015, Denver, USA.
A Hybrid Ranking Approach to Chinese Spelling Check
Xiaodong Liu*, Fei Cheng*, Kevin Duh and Yuji Matsumoto (*equal contribution)
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), to appear.
Multilingual Topic Models for Bilingual Dictionary Extraction
Xiaodong Liu, Kevin Duh and Yuji Matsumoto
ACM Transactions on Asian Language Information Processing (TALIP), Vol.14, No.3, May 2015.
Learning Character Representations for Chinese Word Segmentation
Xiaodong Liu, Kevin Duh, Tomoya Iwakura and Yuji Matsumoto.
NIPS 2014 Workshop on Modern Machine Learning and Natural Language Processing.
A Hybrid Chinese Spelling Correction System Using Language Model and Statistical Machine Translation with Reranking
Xiaodong Liu, Fei Cheng, Yanyan Luo, Kevin Duh and Yuji Matsumoto
7th SIGHAN Workshop on Chinese Language Processing, Nagoya, Japan, 2013.
Topic Models + Word Alignment = A Flexible Framework for Extracting Bilingual Dictionary from Comparable Corpus
Xiaodong Liu, Kevin Duh and Yuji Matsumoto
Seventeenth Conference on Computational Natural Language Learning (CoNLL 2013), Sofia, Bulgaria, 2013.
A Novel Joint Model of Word Alignment and Hierarchical Dirichlet Process for Statistical Machine Translation
Xiaodong Liu
9th Conference on Bayesian Nonparametrics, poster session, Amsterdam, Netherlands, 2013.
Use Relative Weight to Improve the kNN for Unbalanced Text Category.
Xiaodong Liu, Caixia Yuan and Fuji Ren
6th IEEE International Conference on Natural Language Processing and Knowledge Engineering, Beijing, China, 2010.
Languages
Chinese: native
Japanese: JLPT N1
English