Evaluation

The evaluation will be done using the micro-averaged F1-measure for the relevance, document-level sentiment and aspect-based sentiment sub-tasks.
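
Micro-averaged F1 pools the true positives, false positives and false negatives over all documents before computing precision and recall. The following Python sketch shows this computation; the function and the example labels are illustrative and are not taken from the official scorer:

    def micro_f1(gold, predicted):
        # gold, predicted: parallel lists of label sets, one set per document.
        tp = fp = fn = 0
        for g, p in zip(gold, predicted):
            tp += len(g & p)  # predicted and correct
            fp += len(p - g)  # predicted but wrong
            fn += len(g - p)  # missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall == 0.0:
            return 0.0
        return 2 * precision * recall / (precision + recall)

    # Document-level sentiment example: one label per document.
    gold = [{"negative"}, {"neutral"}, {"positive"}]
    pred = [{"negative"}, {"positive"}, {"positive"}]
    print(micro_f1(gold, pred))  # 0.67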

For the OTE sub-task, we will compute micro-averaged F1 based on both exact and partial overlap.
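
The difference between the two overlap criteria can be sketched as follows, assuming OTEs are represented as character-offset spans (start, end); this span representation is an illustrative assumption, not the official format:

    def exact_overlap(gold_span, pred_span):
        # Exact match: both start and end offsets must agree.
        return gold_span == pred_span

    def partial_overlap(gold_span, pred_span):
        # Partial match: the two spans share at least one character.
        g_start, g_end = gold_span
        p_start, p_end = pred_span
        return max(g_start, p_start) < min(g_end, p_end)

    # A prediction covering only part of a gold expression counts as a
    # hit under partial overlap, but not under exact overlap.
    print(exact_overlap((10, 19), (10, 14)))    # False
    print(partial_overlap((10, 19), (10, 14)))  # True

Spans matched under the chosen criterion can then be counted as true positives in the same micro-averaged F1 computation as above.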


An executable jar, usage instructions and the sources can be found at https://github.com/muchafel/GermEval2017.

The test data will be published in both XML and TSV format. We will prepare the data in the following manner:

  • The fields that are present in every document (relevance and document-level sentiment, from Subtasks A and B) are given the label unknown, which must be replaced by the participants.
  • The optional fields (aspects and OTEs, from Subtasks C and D) are removed entirely. A test document then looks as follows:
<Document id= XXX">
        <relevance>unknown</relevance>
        <sentiment>unknown</sentiment>
        <text>Wie schön es ist wenn man sich nach nem nervigen Arbeitstag auch noch über die Bahn ärgern muss.</text> </Document>
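
To illustrate how a submission could be produced from this format, the following sketch parses the XML test file and replaces the unknown placeholders with a system's predictions; the file names and the two predict functions are hypothetical stand-ins:

    import xml.etree.ElementTree as ET

    def predict_relevance(text):
        return "true"      # hypothetical stub; plug in your system here

    def predict_sentiment(text):
        return "neutral"   # hypothetical stub; plug in your system here

    tree = ET.parse("test_data.xml")  # hypothetical file name
    for doc in tree.iter("Document"):
        text = doc.findtext("text")
        doc.find("relevance").text = predict_relevance(text)
        doc.find("sentiment").text = predict_sentiment(text)
    tree.write("submission.xml", encoding="utf-8")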

To further illustrate the test data, a transformed version of the development set can be found here.