Annotation Guidelines

The training data that will be provided as a gold standard have been manually tokenized and tagged according to the following guidelines:

When citing these documents, please use the bibliographic information given above and refer to the URL http://sites.google.com/site/empirist2015/.

Note that our guidelines for POS tagging extend and modify the standard STTS (1999) tagset and should be read in combination with the STTS guidelines:

Overview of the part-of-speech tagset used for annotation:

Extensions to STTS (1999) are highlighted with a blue background colour: