Special Issue on Parsing Morphologically Rich Languages

In the context of computational linguistics, parsing is the task of automatically analyzing the syntactic structure of sentences in natural language, providing information that is crucial for further semantic processing and downstream applications. Although the performance of parsing systems has in general improved tremendously in recent years, there is increasing evidence that performance is highly sensitive to typological differences between languages. Thus, statistical models for phrase structure parsing developed for English often exhibit a drastic drop in performance when applied to languages such as German, Arabic, French and Hebrew. Similarly, multilingual evaluation campaigns for statistical dependency parsers have shown considerable variation in accuracy across languages, which seems to be related at least partly to typological characteristics. In both cases, it appears that the greatest challenges are posed by morphologically rich languages (MRLs), where significant information concerning syntactic structure is expressed at the word level, where each word can have a very high number of possible forms, and where word order is weakly constrained by syntactic structure.

The challenges exhibited by MRLs transcend language boundaries, and emerging insights are often relevant across different theoretical frameworks and methodological traditions. Considering parsing research from the point of view of MRLs therefore sheds light on the generality and adequacy of currently available state-of-the-art parsing methods for dealing with complex linguistic phenomena, vis à vis morphosyntactic interactions. This special issue aims to provide the focal point for studies of large-scale, broad-coverage parsing models that can successfully cope with the challenges exhibited by MRLs, from both the formal and the statistical points of view. It sets out to provide an overview of the state-of-the-art solutions, shared insights across languages and frameworks, and lessons relevant to downstream applications (such as machine translation of MRLs).


We solicit novel contributions describing completed work on broad-coverage parsing of morphologically rich languages, from formal or statistical points of view, in a single or multiple frameworks. We encourage contributions that emphasize how particular methods respond to the challenges associated with parsing MRLs and morphosyntactic phenomena, and go beyond the idiosyncrasies associated with individual languages.

The range of topics to be covered in the special issue includes, but is not limited to:

  • Parsing models and architectures that explicitly integrate morphological information into models for syntactic analysis.

  • Cross-language and/or cross-model comparison of models’ strengths and weaknesses in the face of linguistic phenomena associated with MRLs (e.g. rich inflectional paradigms and different degrees of word-order freedom).

  • Comprehensive analyses of the strengths and weaknesses of parsing models with respect to variation in tagsets, annotation schemes and additional data transformations that help to accommodate rich morphosyntactic interactions.

  • Evaluation of parsers involving different parsing frameworks (e.g. grammar-based or data-driven approaches) or different syntactic theories (e.g. constituency-based or dependency-based) for MRLs.

  • Parsing models and architectures that can successfully cope with high variation in word forms and improve the handling of out-of-vocabulary (OOV) words, either by incorporating linguistic knowledge or through the use of unsupervised/semi-supervised learning techniques.


In order to provide wide exposure to the state of the art in the field, allowing us to cover multiple frameworks as well as multiple languages that exhibit different structures and characteristics, the extended editorial board of this special issue will use a new format with multiple short papers of up to 25 pages each (excluding references).

Submitted papers must follow the CL formatting guidelines available at

Potential contributors are invited to send an expression of interest (EOI) to the guest editors. The EOIs should consist of a title, the language(s), and a brief indication of the topic. This will help the editorial board determine the typological reach of the issue and the required language-specific expertise for the reviews. 

EOIs and inquiries should be directed to the guest editors via cl



Call for papers: December 20, 2010
Deadline for expression of interest: February 20, 2011 
Deadline for submissions: September 30, 2011
Notification: February 20, 2012
Revisions: June 2012
Final notification: July 2012
Final version: September 2012
Publication Date: TBD


Reut Tsarfaty (Uppsala University, Sweden)
Djamé Seddah (Alpage & Université Paris Sorbonne, France)
Sandra Kübler (Indiana University, USA)
Joakim Nivre (Uppsala University, Sweden)


1° Is the expression of interest mandatory to submit a paper for this special issue? 
No, but we would really appreciate a quick e-mail stating that you are thinking about it.

2° Can I still submit an EOI after the suggested date?
Yes, of course you can. Please do, and do not hesitate to contact us if you have any questions.

3° How much supplementary work is required compared to a workshop (e.g. SPMRL 2010) or conference paper?
This issue is not a post-workshop or post-conference proceedings volume. A submitted paper should provide
significant insights into the points being studied, and of course it should match the usual
quality expected by the Computational Linguistics journal.

4° Why did you extend the deadline?
Due to a clash of deadlines and reviewing periods with other events and CL initiatives, the CL editorial board and guest editors
have decided to postpone the submission deadline for this special issue until after the summer.

5° How many EOIs have you received?
So far we have received 25 EOIs covering 14 languages and different techniques. Since the deadline has
been extended, it is of course possible to send more EOIs for new papers. As a general recommendation,
we encourage contributions that emphasize how particular methods respond to the challenges associated
with parsing MRLs and coping with morphosyntactic phenomena. Specifically, we are interested in papers
that go beyond reporting incremental improvements for a single language, either by generalizing to several
languages or by providing a deeper analysis that can pave the way for such generalizations.

6° Problem between CL latex style and arabtex.sty
The currently available arabtex latex package has some incompatibilities with the CL latex style.
To fix this problem, please copy the file alocal.sty into the same directory as the article source.
(Thanks to Prof. Klaus Lagally, Arabtex's author, for his help.)

7° How rigid is the paper length restriction?
As with most electronic journals, and in line with the CL board guidelines, the 25-page limit is a recommendation.
Thus, a paper that does not strictly respect this length restriction will still be reviewed. However, any divergence from
the recommended length has to be justified by the scientific content of the contribution.

8° Problem with Hebrew mode, CL latex style v2 and pdflatex
Serious issues have been reported when using the current CL latex style with Hebrew mode. The problem is currently
being worked on. Please contact us as soon as possible if you are having trouble getting your paper
to compile. For the initial submission, a general style can be used. If you really want to try a temporary workaround
on Mac OS X systems, it involves:
  • installing a complete, fresh distribution of MacTexLive-2011 with xelatex as the main latex frontend;
  • installing David's Hebrew fonts (both in the system fonts library and in the texmf root directory);
  • installing the culmus-latex-0.7r1 Hebrew package;
  • placing the bibtex style fullname.bst in your latex paper directory.
Moreover, the package bidi must be loaded after xltxtra in your latex file header.
Arabtex's Hebrew mode does not work.

9° Are submissions to the CLPMRL special issue supposed to be anonymous?
No. CL's policy states: "Computational Linguistics does not do double-blind review: authorship of submissions is
known to the editorial board and the reviewers. As such, it is not necessary to anonymise the manuscript".

10° When are notifications going to be sent?
Notifications will be sent within a few weeks at the most (in all cases, before February 20, 2012). The review process took
longer than we expected. We apologize for any inconvenience this may have caused.