Dongxu Zhang

Email: dongxuzhang@cs.umass.edu

I am a PhD student in Computer Science at UMass Amherst.

Previously, I received my bachelor's and master's degrees from Beijing University of Posts and Telecommunications.

News:

  • Internship at Amazon (2018.6-2018.8), where I worked with Subho and Luna.

Selected Publications

[3] Dongxu Zhang, Boliang Zhang, Xiaoman Pan, Xiaocheng Feng, Heng Ji and Weiran Xu. “Bitext Name Tagging for Cross-lingual Entity Annotation Projection”. In COLING 2016. [pdf]

[2] Dongxu Zhang and Dong Wang. “Relation Classification via Recurrent Neural Network”. arXiv:1508.01006. [pdf]

[1] Dongxu Zhang, Bin Yuan, Dong Wang and Rong Liu. “Joint Semantic Relevance Learning with Text Data and Graph Knowledge”. In the third Workshop of CVSC, ACL 2015. [pdf]

(Find more papers on Google Scholar)

Experience

  • Amazon.com, Inc. (6/2018-8/2018)

Mentors: Subhabrata Mukherjee and Xin Luna Dong

Advisor: Sujian Li

  • Naturali, Beijing, China (2/2017-4/2017)

Mentor: Dekang Lin

  • Blender Lab, Rensselaer Polytechnic Institute (4/2016-6/2016)

Advisor: Heng Ji

  • CSLT, Tsinghua University (10/2014-3/2016)

Advisor: Dong Wang

  • PRIS Lab, Beijing University of Posts and Telecommunications (9/2013-9/2014)

Advisor: Weiran Xu

  • Samsung Telecommunication R&D Center, Beijing, China (5/2013-9/2013)

Mentor: Xiaojie Yu

(Add me on LinkedIn)

Code & Data

KBP37

KBP37 is a dataset for the sentence-level relation classification task. Compared with SemEval-2010 Task 8, its relation mentions are longer on average. See my paper for more details. The dataset is a revision of the MIML-RE dataset. Please cite this paper and MIML-RE if you use it for research. On this dataset, an acceptable system should reach an F1 score of at least 50%. (Much harder than SemEval-2010 Task 8.)
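To make the 50% F1 threshold concrete, here is a minimal sketch of how a system's predictions could be scored. It assumes macro-averaged F1 over relation labels with the "no_relation" class excluded from the average, in the style of the SemEval-2010 Task 8 scorer; the exact scoring protocol for KBP37 may differ, so check the paper. The function name `macro_f1` and the `ignore` parameter are hypothetical, not part of any released tool.

```python
from collections import Counter


def macro_f1(gold, pred, ignore=("no_relation",)):
    """Macro-averaged F1 over all labels except those in `ignore`.

    Assumption: the negative class is left out of the average,
    as in the SemEval-2010 Task 8 scorer. This is an illustration,
    not the official KBP37 evaluation script.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed

    labels = sorted((set(gold) | set(pred)) - set(ignore))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the label never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0
```

A system would count as acceptable under the threshold above if `macro_f1(gold, pred) >= 0.5` on the test split.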

Animal-143

This dataset contains semantic relatedness scores for 143 pairs of animal names. Each pair was rated by 9 people at CSLT, and the average rating is used as the score. It can be regarded as a domain-specific evaluation dataset. (The well-known WordSimilarity-353 dataset can be regarded as its general-domain counterpart.) On this dataset, an acceptable system should reach a Spearman correlation coefficient of at least 0.7.
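As an illustration of the evaluation described above, the sketch below computes the Spearman coefficient between human relatedness scores and model scores as the Pearson correlation of their ranks, with ties given the average rank. The function names and the example scores are made up for illustration and do not come from the dataset.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def spearman(xs, ys):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical scores for four word pairs (not taken from Animal-143)
human_scores = [9.0, 7.5, 3.0, 1.5]
model_scores = [0.8, 0.9, 0.3, 0.1]
rho = spearman(human_scores, model_scores)
```

If SciPy is available, `scipy.stats.spearmanr` computes the same statistic and is the more practical choice.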

Sentence classification

Here you can find CNN+Pooling code for relation classification, sentiment classification and answer selection, implemented with TensorFlow.

(Download more resources from GitHub)

Misc.

NLP / DM / ML Conference Deadlines

AAAI 2019 August 30 / September 5, 2018

ICLR 2019 September 27, 2018

WWW 2019 October 29 / November 05, 2018

AKBC 2019 Nov 16, 2018

NAACL 2019 December 3 / December 10, 2018

ICML 2019 Jan 7th, 2019

KDD 2019 Feb 3, 2019

ACL 2019 March 4, 2019

IJCAI 2019 Feb 19th / 25th, 2019

EMNLP-IJCNLP 2019 May 21st

CIKM 2019

NIPS 2019