DeSR is a dependency parser for natural language sentences.
Among its notable features:
- accuracy: state-of-the-art parsing accuracy
- efficiency: it can parse up to 200 sentences/sec
- multilingual: it can be trained on annotated corpora for different languages
- customizable: features used in training can be customized.
DeSR is part of the Tanl framework, which provides the tools required to analyze sentences completely, starting from raw text.
DeSR is a shift-reduce dependency parser that uses a variant of the approach of Yamada and Matsumoto (2003).
The parser builds dependency structures greedily by scanning input sentences in a single left-to-right or right-to-left pass and choosing at each step whether to perform a shift or to create a dependency between two adjacent tokens. Which transition to perform is learned from annotated corpora, based on features of the current parser state.
DeSR, however, uses a different set of rules from Yamada and Matsumoto's and includes additional rules for handling non-projective dependencies, so that parsing can be performed deterministically in a single pass.
The algorithm also produces fully labeled dependency trees. A classifier is used for learning and predicting the proper parsing action.
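The greedy single-pass loop described above can be sketched in a few lines of Python. This is a generic arc-standard-style illustration, not DeSR's actual rule set: it is unlabeled, handles only projective trees, and the names `parse`, `make_oracle`, `SHIFT`, `LEFT` and `RIGHT` are hypothetical. The `choose` callback stands in for the trained classifier; here it is replaced by an oracle that reads the gold tree, which is how training examples are typically derived from an annotated corpus.

```python
from collections import deque
from typing import Callable, List, Optional

SHIFT, LEFT, RIGHT = "shift", "left", "right"  # hypothetical action names


def parse(n_tokens: int, choose: Callable) -> List[Optional[int]]:
    """Greedy left-to-right parse of tokens 1..n_tokens (0 is an artificial root).

    Returns a list `heads` where heads[i] is the head of token i.
    """
    heads: List[Optional[int]] = [None] * (n_tokens + 1)
    stack = [0]
    buffer = deque(range(1, n_tokens + 1))
    while buffer or len(stack) > 1:
        # With fewer than two items on the stack the only option is to shift.
        action = choose(stack, buffer, heads) if len(stack) >= 2 else SHIFT
        if action == LEFT and stack[-2] != 0:
            # Top of stack becomes the head of the token below it.
            dep = stack.pop(-2)
            heads[dep] = stack[-1]
        elif action == RIGHT or not buffer:
            # Token below the top becomes the head of the top
            # (also used as a fallback when nothing is left to shift).
            dep = stack.pop()
            heads[dep] = stack[-1]
        else:
            stack.append(buffer.popleft())
    return heads


def make_oracle(gold: List[Optional[int]]) -> Callable:
    """Build a stand-in for the classifier that picks actions from gold heads."""
    def choose(stack, buffer, heads):
        s1, s0 = stack[-2], stack[-1]
        if s1 != 0 and gold[s1] == s0:
            return LEFT
        # Attach s0 to s1 only after s0 has collected all of its dependents.
        complete = all(heads[d] is not None
                       for d in range(1, len(gold)) if gold[d] == s0)
        if gold[s0] == s1 and complete:
            return RIGHT
        return SHIFT
    return choose


# "She reads books": She <- reads -> books, with "reads" attached to the root.
gold = [None, 2, 0, 2]
print(parse(3, make_oracle(gold)))  # [None, 2, 0, 2]
```

In a real parser the oracle would be used only at training time, and `choose` would be a classifier applied to features extracted from the stack, the buffer and the partial tree.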
The parser can be configured by selecting among several learning algorithms (Multi-Layer Perceptron, Averaged Perceptron, Maximum Entropy, SVM), providing user-defined feature models, and selecting input/output formats (including the CoNLL-X shared task format). The MLP classifier works best for languages with sufficiently large training corpora and is fast both in training and parsing.
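For readers unfamiliar with the CoNLL-X shared task format mentioned above: each token occupies one line of ten tab-separated columns (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL), and sentences are separated by blank lines. The following is a minimal reader sketch, not DeSR's own input code; the names `Token` and `read_conllx` are hypothetical, and the sample sentence and its tags are made up for illustration.

```python
from typing import Iterator, List, NamedTuple, Optional


class Token(NamedTuple):
    id: int
    form: str
    lemma: str
    cpostag: str
    postag: str
    feats: str
    head: Optional[int]  # None when the column holds "_" (unannotated input)
    deprel: str


def read_conllx(text: str) -> Iterator[List[Token]]:
    """Yield one list of tokens per sentence from CoNLL-X formatted text."""
    sentence: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        cols = line.split("\t")
        head = None if cols[6] == "_" else int(cols[6])
        # The last two columns (PHEAD, PDEPREL) are ignored in this sketch.
        sentence.append(Token(int(cols[0]), cols[1], cols[2], cols[3],
                              cols[4], cols[5], head, cols[7]))
    if sentence:
        yield sentence


sample = (
    "1\tShe\tshe\tPRP\tPRP\t_\t2\tSBJ\t_\t_\n"
    "2\treads\tread\tVB\tVBZ\t_\t0\tROOT\t_\t_\n"
    "3\tbooks\tbook\tNN\tNNS\t_\t2\tOBJ\t_\t_\n"
    "\n"
)
for sent in read_conllx(sample):
    print([(tok.form, tok.head, tok.deprel) for tok in sent])
```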
The parser is available from SourceForge.
- G. Attardi. 2006. Experiments with a Multilanguage Non-Projective Dependency Parser. In Proc. of the Tenth Conference on Natural Language Learning, New York, NY.
- G. Attardi, M. Ciaramita. 2007. Tree Revision Learning for Dependency Parsing. In Proc. of the Human Language Technology Conference 2007.
- G. Attardi, F. Dell'Orletta, M. Simi, A. Chanev and M. Ciaramita. 2007. Multilingual Dependency Parsing and Domain Adaptation using DeSR. In Proc. of the CoNLL Shared Task Session of EMNLP-CoNLL 2007, Prague.
- G. Attardi, M. Ciaramita. 2007. Dependency Parsing with Second-Order Feature Maps and Annotated Semantic Information. In Proc. of the 10th International Conference on Parsing Technology, Prague.
- G. Attardi, F. Dell'Orletta. 2009. Reverse Revision and Linear Tree Combination for Dependency Parsing. In Proc. of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies short papers (NAACL HLT) 2009 conference, Boulder, Colorado.
- H. Yamada and Y. Matsumoto. 2003. Statistical Dependency Analysis with Support Vector Machines. In Proc. of the 8th International Workshop on Parsing Technologies (IWPT).