Dongxu Zhang
Email: dongxuzhang@cs.umass.edu
I am a PhD student in Computer Science at UMass Amherst.
Previously I received my bachelor's and master's degrees from Beijing University of Posts and Telecommunications.
Selected Publications
[3] Dongxu Zhang, Boliang Zhang, Xiaoman Pan, Xiaocheng Feng, Heng Ji and Weiran Xu. "Bitext Name Tagging for Cross-lingual Entity Annotation Projection". In COLING 2016. [pdf]
[2] Dongxu Zhang and Dong Wang. "Relation Classification via Recurrent Neural Network". arXiv:1508.01006. [pdf]
[1] Dongxu Zhang, Bin Yuan, Dong Wang and Rong Liu. "Joint Semantic Relevance Learning with Text Data and Graph Knowledge". In the third Workshop of CVSC, ACL 2015. [pdf]
(Find more papers at Google Scholar)
Experience
- Amazon.com, Inc. (6/2018-8/2018)
Mentors: Subhabrata Mukherjee and Xin Luna Dong
- Institute of Computational Linguistics, Peking University (5/2017-7/2017)
Advisor: Sujian Li
- Naturali, Beijing, China (2/2017-4/2017)
Mentor: Dekang Lin
- Blender Lab, Rensselaer Polytechnic Institute (4/2016-6/2016)
Advisor: Heng Ji
- CSLT, Tsinghua University (10/2014-3/2016)
Advisor: Dong Wang
- PRIS Lab, Beijing University of Posts and Telecommunications (9/2013-9/2014)
Advisor: Weiran Xu
- Samsung Telecommunication R&D Center, Beijing, China (5/2013-9/2013)
Mentor: Xiaojie Yu
(Add me on LinkedIn)
Code & Data
KBP37
KBP37 is a dataset for the sentence-level relation classification task. Compared with SemEval-2010 Task 8, its relation mentions are longer on average. See my paper for more details. The dataset is a revision of the MIML-RE dataset. Please cite this paper and MIML-RE if you use the dataset for research. On this dataset, an acceptable system should reach an F1 score of at least 50%. (This is much harder than SemEval-2010 Task 8.)
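As a rough illustration of how a system could be checked against the 50% F1 threshold, here is a minimal sketch in Python. It assumes macro-averaged F1 over the relation labels, which is an assumption and is not stated above; the labels "A" and "B" and the function name `macro_f1` are hypothetical placeholders, not actual KBP37 relation names or part of any released scorer.

```python
def macro_f1(gold, predicted):
    """Macro-averaged F1 over the labels that occur in the gold annotations.

    Assumption: every label is weighted equally; the official evaluation
    for the dataset may differ (for example, by excluding a "no relation" class).
    """
    scores = []
    for label in sorted(set(gold)):
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Hypothetical toy example with two relation labels.
gold_labels = ["A", "A", "B", "B"]
system_labels = ["A", "B", "B", "B"]
score = macro_f1(gold_labels, system_labels)
print(f"macro F1 = {score:.3f}, acceptable = {score >= 0.5}")
```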
Animal-143
This dataset contains semantic relatedness scores for 143 pairs of animal names. Each pair was rated by 9 people at CSLT, and the average of their ratings is used as the score. It can be regarded as a domain-specific evaluation dataset. (The well-known WordSimilarity-353 dataset, by contrast, can be regarded as a general-domain dataset.) On this dataset, an acceptable system should reach a Spearman correlation coefficient of at least 0.7.
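The evaluation described above (average the human ratings per pair, then compare with system scores by Spearman correlation) can be sketched as follows. The ratings and system scores are made-up toy values, not taken from Animal-143, and the function names are hypothetical; the sketch computes Spearman's coefficient as the Pearson correlation of average ranks.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: each row holds the ratings several annotators gave one word pair.
human_ratings = [
    [9, 8, 9],
    [2, 3, 1],
    [6, 5, 7],
    [4, 4, 5],
]
gold = [mean(r) for r in human_ratings]   # per-pair average, as in the dataset
system = [0.91, 0.15, 0.40, 0.55]         # hypothetical model relatedness scores
rho = spearman(gold, system)
print(f"Spearman = {rho:.2f}, acceptable = {rho >= 0.7}")
```

Only the ranking of the pairs matters here, so system scores do not need to be on the same scale as the human ratings.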
Sentence classification
Here you can find CNN+Pooling code for relation classification, sentiment classification and answer selection, implemented with TensorFlow.
(Download more resources from GitHub)
Misc.
NLP / DM / ML Conference Deadlines
AAAI 2019 August 30 / September 5, 2018
ICLR 2019 September 27, 2018
WWW 2019 October 29 / November 5, 2018
AKBC 2019 November 16, 2018
NAACL 2019 December 3 / December 10, 2018
ICML 2019 January 7, 2019
KDD 2019 February 3, 2019
ACL 2019 March 4, 2019
IJCAI 2019 February 19 / February 25, 2019
EMNLP-IJCNLP 2019 May 21
CIKM 2019
NIPS 2019