Minghui's Home Page
Minghui Qiu (邱明辉)
PhD@SMU, Senior Algorithm Expert@Alibaba Group
Currently, I'm a senior algorithm expert at Alibaba Cloud, working on deep learning and large language models (LLMs) for a range of NLP tasks. I'm responsible for building the NLP and transfer learning toolkits EasyNLP and EasyTransfer for Alibaba Cloud, which support 10+ business units and 20+ applications in Alibaba Group.
I hold a Ph.D. degree from the School of Information Systems, Singapore Management University, where I was supervised by Associate Prof. Jing Jiang and Prof. Lim Ee-Peng. My research interests mainly include Text Mining and Analytics, Deep Learning, Reinforcement Learning, and Transfer Learning. From 2013 to 2014, I visited the Language Technologies Institute, Carnegie Mellon University, working with Noah Smith and Alex Smola. In the summer of 2014, I worked as an intern at Google Inc., Mountain View, CA, with Amr Ahmed and Yuan Wang.
We have released the EasyNLP Toolkit to make it easy to develop NLP applications. You are welcome to follow, star, or contribute.
Experience
Senior Algorithm Expert, Alibaba Group, China, 2015.8 – Now.
AI platforms
Building NLP toolkits PAI-EasyTransfer (TensorFlow) and EasyNLP Toolkit (PyTorch)
PAI-ModelZoo (FashionBERT for cross-modality search, ASR-Robust Pre-training, Knowledge-enhanced BERT etc.)
Large scale model pre-training (2B+ model parameters)
Few-shot Learning for Pre-trained Models
Knowledge distillation toolkit for landing large pre-trained models
Cross-modality Pre-training: FashionBERT for cross-modality search, ARTIST for generation
Building Ali AI Agent for RL (A3gent)
Developing neural network models for Platform of AI (PAI)
Applications
Pretrained models for cross-modality generations
Cross-modality pretrained models for search and recommendations
Text retrieval applications
Build core question answering modules in AliMe Assistant Bot
End-to-end Trainable TaskBot based on Deep Reinforcement Learning
Deep Reinforcement Learning for E-commerce
Intern, Google, Inc., Mountain View, CA, USA, 2014.5 - 2014.8
Summer Intern on Random Features for Large-Scale Kernel Machines.
Visiting Scholar, Carnegie Mellon University, Pittsburgh, PA, USA, 2013.5 - 2014.5
PhD Training Residency in Natural Language Processing and Machine Learning.
Education
Postdoc, Zhejiang University & Alibaba Group, 2018 to 2020.
Advisor: Cai Deng (Zhejiang University), Jingren Zhou (Alibaba Group)
Topic: Towards Building Scalable Transfer Learning for E-commerce
PhD, Singapore Management University, Aug 2010 to July 2015.
Advisor: Jing Jiang
Topic: Mining User Viewpoints in Online Discussions
PhD Overseas Training Residency, Carnegie Mellon University, Aug 2013 to June 2014.
Advisor: Noah Smith and Alex Smola
Topic: User Stance Prediction in Online Debates; Modeling Aspects, Ratings, and Sentiments for Movie Rec.
Selected Publications
(*=corresponding author)
arXiv preprint. VALLEY: Video Assistant with Large Language Model Enhanced abilitY. 2024
arXiv preprint. FashionLOGO: Prompting Multimodal Large Language Models for Fashion Logo Embeddings. 2024
arXiv preprint. A Comprehensive Analysis of Information Leakage in Deep Transfer Learning. 2023.
WSDM 2023. Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning. 2023.
ICASSP 2023. Boosting Prompt-based Few-shot Learners through OOD Knowledge Distillation. 2023.
ICASSP 2023. Few-shot Knowledge Distillation with Dual Contrastive Learning. 2023.
EMNLP 2022. EasyNLP: A Comprehensive and Easy-to-use Toolkit for Natural Language Processing. To appear in EMNLP 2022.
EMNLP 2022. ARTIST: A Transformer-based Chinese Text-to-Image Synthesizer Digesting Linguistic and World Knowledge. To appear in EMNLP 2022.
EMNLP 2022. Towards Unified Prompt Tuning for Few-shot Text Classification. To appear in EMNLP 2022.
EMNLP 2022. KECP: Knowledge Enhanced Contrastive Prompting for Few-shot Extractive Question Answering. To appear in EMNLP 2022.
EMNLP 2022. Knowledge Prompting in Pre-trained Language Model for Natural Language Understanding. To appear in EMNLP 2022.
AAAI 2022. DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for Natural Language Understanding. To Appear.
EMNLP 2021. Meta Distant Transfer Learning for Pre-trained Language Models. To Appear as Full paper.
EMNLP 2021. TransPrompt: Towards an Automatic Transferable Prompting Framework for Few-shot Text Classification. To Appear as Full paper.
CIKM 2021. EasyTransfer -- A Simple and Scalable Deep Transfer Learning Platform for NLP Applications. To Appear.
CIKM 2021. HORNET: Enriching Pre-trained Language Representations with Heterogeneous Knowledge Sources. To Appear.
CIKM 2021. Learning to Expand: Reinforced Response Expansion for Information-seeking Conversations. To Appear.
KDD 2021. MeLL: Large-scale Extensible User Intent Classification for Dialogue Systems with Meta Lifelong Learning. Chengyu Wang, Haojie Pan, et al., KDD 2021. (Corresponding author)
ACL 2021. Meta-KD: A Meta Knowledge Distillation Framework for Language Model Compression across Domains, Haojie Pan, Chengyu Wang, Minghui Qiu*, et al., to appear, ACL 2021.
WWW 2021. Cross-domain Knowledge Distillation for Retrieval-based Question Answering Systems. To appear, WWW.
AAAI 2021. Reinforced History Backtracking for Conversational Question Answering. Minghui Qiu, Xinjing Huang, et al., To appear, AAAI.
AAAI 2021. Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation. Lingyun Feng, Minghui Qiu, Yaliang Li, Hai-Tao Zheng, Ying Shen, To appear, AAAI.
AAAI 2021. KEML: A Knowledge-Enriched Meta-Learning Framework for Lexical Relation Classification, Chengyu Wang, Minghui Qiu*, Jun Huang, Xiaofeng He. AAAI.
EMNLP 2020. Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining. Chengyu Wang, Minghui Qiu*, Jun Huang, Xiaofeng He, EMNLP 2020. Full Paper.
IJCAI 2020. AdaBERT: Task-adaptive BERT Compression with Neural Architecture Search. Daoyuan Chen, Yaliang Li, Minghui Qiu, Zhen Wang, Bofang Li, Bolin Ding, Hongbo Deng, Jun Huang, Wei Lin, Jingren Zhou. IJCAI 2020.
SIGIR 2020. FashionBERT: Text and Image Matching for Fashion Domain with Adaptive Loss, Dehong Gao, Linbo Jin, Ben Chen, Minghui Qiu, Peng Li, Yi Wei, Yi Hu, Hao Wang, Industry Track, Full paper.
SIGIR 2020. Open-Retrieval Conversational Question Answering. Chen Qu, Liu Yang, Cen Chen, Minghui Qiu, W. Bruce Croft, Mohit Iyyer, To appear in SIGIR. Full paper.
SIGIR 2020. Global Context Enhanced Graph Neural Networks for Session-based Recommendation. Ziyang Wang, Wei Wei, Gao Cong, Xiaoli Li, Xianling Mao, and Minghui Qiu, To appear in SIGIR. Full paper.
ACM MM 2020. One-shot Text Field Labeling using Attention and Belief Propagation for Structure Information Extraction. Mengli Cheng, Minghui Qiu*, Xing Shi, Jun Huang, Wei Lin, To appear, Full Oral paper.
CIKM 2019. A Hybrid Retrieval-Generation Neural Conversation Model. Liu Yang, Junjie Hu, Minghui Qiu, Chen Qu, Jianfeng Gao, W. Bruce Croft, Xiaodong Liu, Yelong Shen, and Jingjing Liu. To appear in CIKM. Full paper.
CIKM 2019. Attentive History Selection for Conversational Question Answering. Chen Qu, Liu Yang, Minghui Qiu, et al., To appear in CIKM. Full paper.
CIKM 2019. Cross-domain Attention Network with Wasserstein Regularizers for E-commerce Search. Minghui Qiu, Bo Wang, et al., To appear in CIKM. Full paper.
KDD 2019. A Minimax Game for Instance based Selective Transfer Learning. Bo Wang, Minghui Qiu*, Xisen Wang, Yaliang Li, Yu Gong, Xiaoyi Zeng, Jun Huang, Bo Zheng, Deng Cai, and Jingren Zhou, To appear in KDD.
SIGIR 2019. BERT with History Modeling for Conversational Question Answering. Chen Qu, Liu Yang, Minghui Qiu, W. Bruce Croft, Yongfeng Zhang, Mohit Iyyer, In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 2019, to appear. [Code & Data & Bibtex]
WWW 2019. Multi-Domain Gated CNN for Review Helpfulness Prediction. To appear in The World Wide Web Conference (WWW-19), San Francisco, USA, May, 2019. (Short)
WSDM 2019. Learning to Selectively Transfer: Reinforced Transfer Learning for Deep Text Matching. Chen Qu, Feng Ji, Minghui Qiu*, Liu Yang, Zhiyu Min, Haiqing Chen, Jun Huang and W. Bruce Croft. To appear in Proceedings of the 12th ACM International Conference on Web Search and Data Mining (WSDM 2019), Melbourne, Australia, February 11-15, 2019. Full Oral Paper. Acceptance rate=16% (84 out of 511).
ACL 2018. Transfer Learning for Context-Aware Question Matching in Information-seeking Conversations in E-commerce. Minghui Qiu, Liu Yang, Feng Ji, Wei Zhou, Weipeng Zhao, Jun Huang, Haiqing Chen, W. Bruce Croft, Wei Lin. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), Melbourne, Australia, July 15-20, 2018.
SIGIR 2018. Response Ranking with Deep Matching Networks and External Knowledge in Information-seeking Conversation Systems. Liu Yang, Minghui Qiu, Chen Qu, Jiafeng Guo, Yongfeng Zhang, W. Bruce Croft, Jun Huang, Haiqing Chen. In Proceedings of the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018), Ann Arbor, Michigan, U.S.A. July 8-12, 2018. [Code][Data][Slides][PPT][Bibtex]
SIGIR 2018. Analyzing and Characterizing User Intent in Information-seeking Conversations. Short Paper. To appear. [MSDialog dataset]
WSDM 2018. Modelling Domain Relationships for Transfer Learning on Retrieval-based Question Answering Systems in E-commerce. Jianfei Yu, Minghui Qiu*, Jing Jiang, Shuangyong Song, Jun Huang, Wei Chu and Haiqing Chen. WSDM 2018. [Code][Bibtex]
IJCAI 2018. IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection. To appear.
Book Chapter. Reinforcement Learning Beyond Games: To Make a Difference in Alibaba. Book Chapter, 2018.
ICDM 2017. A Short-Term Rainfall Prediction Model using Multi-Task Convolutional Neural Networks. Minghui Qiu, Peilin Zhao, Ke Zhang, Xing Shi, Xiaoguang Wang, Jun Huang, and Wei Chu, ICDM, regular paper.
CIKM 2017. A Communication Efficient Parallel DBSCAN Algorithm based on Parameter Server. Xu Hu, Jun Huang, Minghui Qiu, et al., ACM International Conference on Information and Knowledge Management.
CIKM 2017. AliMe Assist: An Intelligent Assistant for Creating an Innovative E-commerce Experience, Feng-Lin Li, Minghui Qiu, Haiqing Chen, Xiongwei Wang, Xing Gao, Jun Huang, et al., ACM International Conference on Information and Knowledge Management. Best Demo Award.
ACL 2017. AliMe Chat: a Sequence to Sequence and Rerank Based Chatbot Engine. Minghui Qiu, Feng-Lin Li, Siyu Wang, Xing Gao, Yan Chen, Weipeng Zhao, Haiqing Chen, Jun Huang, Wei Chu. Annual Meeting of the Association for Computational Linguistics.
TIST 2016. Personalized Microtopic Recommendation on Microblogs. Yang Li, Jing Jiang, Ting Liu, and Minghui Qiu. In the Transactions on Intelligent Systems and Technology.
SDM 2015. Modeling User Arguments, Interactions, and Attributes for Stance Prediction in Online Debate Forums. Minghui Qiu, Yanchuan Sim, Noah A. Smith, and Jing Jiang. 2015 SIAM International Conference on Data Mining. [Appendix]
ACL 2015. Semantic Analysis and Helpfulness Prediction of Text for Online Product Reviews. Yinfei Yang, Yaowei Yan, Minghui Qiu, Forrest S. Bao, short, Beijing, China. [Data]
KDD 2015. FaitCrowd: Fine Grained Truth Discovery for Crowdsourced Data Aggregation. Fenglong Ma, Yaliang Li, Qi Li, Minghui Qiu, Jing Gao, Shi Zhi, Lu Su, Bo Zhao, Heng Ji, and Jiawei Han. In the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining.
KDD 2014. Jointly Modeling Aspects, Ratings and Sentiments for Movie Recommendation (JMARS). Qiming Diao, Minghui Qiu, Chao-Yuan Wu, Alexander J. Smola, Jing Jiang, and Chong Wang. In the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.
Coling 2014. Generating supplementary travel guides from social media. Liu Yang, Jing Jiang, Lifu Huang, Minghui Qiu, and Lizi Liao. In the 25th International Conference on Computational Linguistics.
NAACL 2013. Mining User Relations from Online Discussions using Sentiment Analysis and Probabilistic Matrix Factorization. Minghui Qiu, Liu Yang, and Jing Jiang. In Proceedings of Annual Conference of the North American Chapter of the Association for Computational Linguistics. [Bibtex][Code][Data]
NAACL 2013. A Latent Variable Model for Viewpoint Discovery from Threaded Forum Posts. Minghui Qiu, and Jing Jiang. In Proceedings of Annual Conference of the North American Chapter of the Association for Computational Linguistics.
CIKM 2013. Modeling Interaction Features for Debate Side Clustering. Minghui Qiu, Liu Yang, and Jing Jiang. In Proceedings of the 22nd ACM International Conference on Information and Knowledge Management.
CIKM 2013. CQARank: Jointly Model Topics and Expertise in Community Question Answering. Liu Yang, Minghui Qiu, Swapna Gottipati, Feida Zhu, Jing Jiang, Huiping Sun, and Zhong Chen. In Proceedings of the 22nd ACM International Conference on Information and Knowledge Management. [Code] Top 3 Cited Papers in CIKM'13.
SDM 2013. It's Not What We Say But How We Say Them: LDA-based Behavior-Topic Model. Minghui Qiu, Feida Zhu, and Jing Jiang. 2013 SIAM International Conference on Data Mining (SDM'13), Austin, Texas, USA. [Appendix][B-LDA Code][Slides]
EMNLP 2013. Learning Topics and Positions from Debatepedia. Swapna Gottipati, Minghui Qiu, Yanchuan Sim, Jing Jiang, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. [Bibtex][Appendix][Data][Slides]
SocInfo 2013. Predicting User's Political Party using Ideological Stances. Swapna Gottipati, Minghui Qiu, Liu Yang, Feida Zhu, and Jing Jiang. In Proceedings of the 5th International Conference on Social Informatics. Best Paper Runner-up.
AIRS 2012. Query-Oriented Keyphrase Extraction. Minghui Qiu, Yaliang Li, and Jing Jiang. In Proceedings of the Asia Information Retrieval Societies Conference, pp. 64-75.
PRL 2010. A Fast Divisive Clustering Algorithm Using An Improved Discrete Particle Swarm Optimizer. Liang Feng, Minghui Qiu, Yu-Xuan Wang, Qiao-Liang Xiang, Yin-Fei Yang, and Kai Liu. Pattern Recognition Letters, Vol. 31, No. 11, pp. 1216 - 1225.
Dissertation. Mining User Viewpoints in Online Discussions, Minghui Qiu, Ph.D. thesis, School of Information Systems, Singapore Management University, to appear in Aug 2015.
See also: [Google Scholar] [DBLP]
Talks
Deep Learning with PAI -- a Case Study of AliMe, Deep Learning Summit, Singapore, 2017.
TensorFlow and Its Application in AliMe - An Intelligent Assistant, TensorFlow Symposium, Alibaba, 2016.
Latent Variable Models for Viewpoint Discovery in Online Discussions, ISP AI Seminar, University of Pittsburgh, 2014.
Modeling Interaction Features for Debate Side Clustering. CIKM, Burlingame, CA. 2013
Mining User Relations from Online Discussions using Sentiment Analysis and Probabilistic Matrix Factorization, NAACL, Atlanta, GA, 2013.
A Latent Variable Model for Viewpoint Discovery from Threaded Forum Posts, NAACL, Atlanta, GA, 2013.
A Matrix Factorization Technique with Trust Propagation for Recommendation in Social Networks, LARC Research Seminar, Singapore. 2012
Query-oriented Keyphrase Extraction, AIRS, Tianjin, China. 2012
Awards
AliStar, Alibaba Group, 2015
SocInfo Best Paper Runner-up, 2013.
CIKM Student Travel Award, 2013.
PhD Overseas Training Residency at CMU, LARC Scholarship.
Singapore Management University Postgraduate Research Scholarship, 2010.
Skills
Language Skills: Mandarin (mother tongue), English
Programming Languages: Python, Java, C++, Matlab
"Science is a differential equation. Religion is a boundary condition." -- Alan Turing