Chenguang Wang

I am an applied scientist at Amazon AI, working with Dr. Mu Li and Dr. Alex Smola. Before that, I was a research staff member at IBM Research-Almaden. I received my Ph.D. degree from Peking University, where my supervisor was Dr. Ming Zhang. I was also a joint Ph.D. student at the University of Illinois at Urbana-Champaign, where I was supervised by Dr. Jiawei Han in the Data Mining Research Group.

Address: 2100 University Ave, East Palo Alto, CA

Email: wangcg[dot]pku[at]gmail.com

What's New

  • Feb 2018: Joined Amazon AI!

Research Interests

Machine Learning, Natural Language Understanding, Knowledge Discovery. The goal of my research is to bring better intelligence to real-world applications in daily life. As an essential step towards this goal, I am working on MXNet and Gluon to make deep learning as simple as possible!

[NEW] We are working on GluonNLP, which aims to make text preprocessing, dataset loading, and neural model building easy, to help you speed up your Natural Language Processing (NLP) research. You are more than welcome to try it out [code][documentation]! We hope GluonNLP will be your deep learning toolkit of choice for NLP.

Publications

    • Chenguang Wang, Yangqiu Song, Haoran Li, Yizhou Sun, Ming Zhang, and Jiawei Han. Distant Meta-Path Similarities for Text-Based Heterogeneous Information Networks. Proc. 2017 ACM Int. Conf. on Information and Knowledge Management (CIKM'17).
    • Chenguang Wang, Alan Akbik, Laura Chiticariu, Yunyao Li, Fei Xia, and Anbang Xu. Crowd-in-the-loop: A hybrid approach for annotating semantic roles. Proc. 2017 Conf. on Empirical Methods in Natural Language Processing (EMNLP'17).
    • Chenguang Wang, Laura Chiticariu, and Yunyao Li. Active learning for black-box semantic role labeling with neural factors. Proc. 2017 Int. Joint Conf. on Artificial Intelligence (IJCAI'17).
    • He Jiang, Yangqiu Song, Chenguang Wang, Ming Zhang, and Yizhou Sun. Semi-supervised learning over heterogeneous information networks by ensemble of meta-graph guided random walks. Proc. 2017 Int. Joint Conf. on Artificial Intelligence (IJCAI'17).
    • Chenguang Wang, Doug Burdick, Laura Chiticariu, Rajasekar Krishnamurthy, Yunyao Li, and Huaiyu Zhu. Towards re-defining relation understanding in financial domain. Proc. of 2017 ACM SIGMOD Int. Conf. on Management of Data Workshop (SIGMOD'17 Workshop). [Video]
    • Yuxin Chen and Chenguang Wang. HINE: Heterogeneous Information Network Embedding. Proc. 2017 Int. Conf. on Database Systems for Advanced Applications (DASFAA'17).
    • Chenguang Wang, Yangqiu Song, Dan Roth, Ming Zhang, and Jiawei Han. World Knowledge as Indirect Supervision for Document Clustering. ACM Transactions on Knowledge Discovery from Data (TKDD'16).
    • Chenguang Wang, Yizhou Sun, Yanglei Song, Jiawei Han, Yangqiu Song, Lidan Wang, and Ming Zhang. RelSim: Relation Similarity Search in Schema-Rich Heterogeneous Information Networks. Proc. 2016 SIAM Int. Conf. on Data Mining (SDM'16).
    • Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang, and Jiawei Han. Text Classification with Heterogeneous Information Network Kernels. Proc. 2016 AAAI Conf. on Artificial Intelligence (AAAI'16).
    • Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang, and Jiawei Han. KnowSim: A Document Similarity Measure on Structured Heterogeneous Information Networks. Proc. 2015 IEEE Int. Conf. on Data Mining (ICDM'15).
    • Chenguang Wang, Yangqiu Song, Ahmed El-Kishky, Dan Roth, Ming Zhang, and Jiawei Han. Incorporating World Knowledge to Document Clustering via Heterogeneous Information Networks. Proc. 2015 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'15). [Video]
    • Chenguang Wang, Yangqiu Song, Dan Roth, Chi Wang, Jiawei Han, Heng Ji, and Ming Zhang. Constrained Information-Theoretic Tripartite Graph Clustering to Identify Semantically Similar Relations. Proc. 2015 Int. Joint Conf. on Artificial Intelligence (IJCAI'15).
    • Yangqiu Song, Chenguang Wang, Ming Zhang, and Hailong Sun. Spectral Label Refinement for Noisy and Missing Text Labels. Proc. 2015 AAAI Conf. on Artificial Intelligence (AAAI'15).
    • Quan Liu, Chenguang Wang, and Ming Zhang. Modeling Domain Influence in Heterogeneous Networks. Proc. 2014 ACM Int. Conf. on Web Search and Data Mining Workshop on Diffusion Networks and Cascade Analytics (WSDM'14 Workshop).
    • Chenguang Wang, Nan Duan, Ming Zhou, and Ming Zhang. Paraphrasing Adaptation for Web Search Ranking. Proc. 2013 Annual Meeting of the Association for Computational Linguistics (ACL'13).
    • Chi-Ho Li, Shujie Liu, Chenguang Wang, and Ming Zhou. ENGtube: an Integrated Subtitle Environment for ESL. MT Summit XIII: the Thirteenth Machine Translation Summit (MTSummit'11).
    • Chenguang Wang, Chongwen Wang, and Jie Bing. Future-related Future Prediction System by Query Subtopic Analysis Based on Chinese News Web Pages. Proc. 2010 Sciencepaper Online (in Chinese).
    • Chenguang Wang and Huaizhi Yan. Study of Cloud Computing Security Based on Private Face Recognition. Proc. 2010 Int. Conf. on IEEE Computational Intelligence and Software Engineering (CiSE'10). [Paper]

Research Experience

  • Applied Scientist, Amazon AI.
    • Feb 2018–Present.
    • Working on deep learning for NLP with MXNet and Gluon.
  • Research Staff Member, IBM Research-Almaden.
    • Oct 2016–Jan 2018.
    • Working on leveraging both expert and crowd intelligence to improve shallow semantic parsing.
    • Published results in EMNLP 2017 and IJCAI 2017.
  • Visiting Researcher, Machine Learning Group, Baidu Inc.
    • Jul 2016–Oct 2016.
    • Worked on deep-learning-based dialogue systems and natural language generation.
  • Research Assistant, Institute of Networking and Information Systems, Peking University.
    • Sept 2011–Jul 2016.
    • Worked on real-world problems including, but not limited to:
      • World knowledge specified machine learning.
      • Knowledge acquisition from structured and unstructured data.
      • Social network analysis and modeling.
    • Published results in AAAI 2016, KDD 2015, ICDM 2015, AAAI 2015, and WSDM 2014.
  • Joint Ph.D. Student, Database and Information Systems Group, University of Illinois at Urbana-Champaign.
    • Sept 2013–Mar 2015.
    • Researched clustering of semantically similar relations in both structured knowledge graphs and open information extraction.
    • Studied searching similar relations in schema-rich heterogeneous information networks.
    • Published results in IJCAI 2015 and SDM 2016.
  • Research Intern, Natural Language Computing Group, Microsoft Research Asia.
    • Nov 2010–Jul 2011 and Oct 2012–Apr 2013.
    • Improved the performance of web search by optimizing paraphrasing techniques.
    • The techniques were transferred into Microsoft's search engine, Bing.
    • Established an integrated subtitle environment for ESL.
    • Published results in ACL 2013 and MT Summit 2011.
  • Research Intern, State Key Laboratory of Intelligent Technology and System, Tsinghua University.
    • Mar 2010–Sept 2010.
    • Built a community-based question-answering system, QUANTA.
    • Proposed and implemented an efficient answer re-ranking algorithm.

Invited Talks

    • Text Mining with Heterogeneous Information Networks. At Disney Research, Pittsburgh, USA, May 2016.
    • Text Mining with Heterogeneous Information Networks. At Stanford University, California, USA, Feb 2016.
    • Text Mining with Heterogeneous Information Networks. At Yahoo! Research, California, USA, Feb 2016.
    • Text Mining with Heterogeneous Information Networks. At Carnegie Mellon University, Pittsburgh, USA, Nov 2015.
    • World Knowledge as Indirect Supervision for Machine Learning. At University of California, Santa Barbara, California, USA, Nov 2015.
    • World Knowledge as Indirect Supervision for Machine Learning. At Microsoft Research Asia, Beijing, China, Nov 2015.
    • Incorporating World Knowledge to Document Clustering via Heterogeneous Information Networks. In the 13th Chinese Workshop on Machine Learning and Applications, Nanjing, China, Nov 2015.

Leadership Experience

  • Co-founder, Singularity Hedge Fund Inc.
    • Sept 2011–Aug 2013; Co-founders: Ming Lei (one of the four founding team members of Baidu Inc.), Limin Zhou (Chief Architect at Baidu Inc.).
      • Baidu Inc. is a Chinese web services company that operates the largest Chinese-language search engine.
    • Worked as CTO, designing new trading strategies based on novel computational models and mining valuable market information from big data, targeting U.S. stock and futures markets.
    • The team included eight excellent students (top-10 students majoring in mathematics or computer science) from Peking University and Tsinghua University.
    • Singularity Hedge Fund Inc. owns at least $200 million in assets with an average 20% annual return, and captured 25% growth in 2014.
  • Main contributor, Peking University.
    • May 2013–Jul 2013; Project: Domain Influence-based Expert Recommendation in Microblog.
    • Designed the milestones and schedule, and developed a domain-dependent user influence model.
    • Supervised three undergraduate students, who are now either pursuing Ph.D. degrees or working in the United States.
    • The system is online on Weibo.
      • Weibo is the largest microblogging website in China.
  • Main contributor, Tsinghua University.
    • Apr 2010–Jun 2010; Project: Chatterbot using Community-based Question Answering.
    • Developed an MSN chatterbot that draws its answers from QUANTA, a Chinese community-based question-answering system.
    • Led the project as an undergraduate, working with two Ph.D. students.