Evaluation
The evaluation will be done using the micro-averaged F1-measure for the relevance, document-level sentiment and aspect-based sentiment subtasks.
For the OTE subtask we will compute micro-averaged F1 based on both exact and partial overlap.
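To illustrate the metric, here is a minimal sketch of micro-averaged F1 over per-document label sets. It is not the official scorer (that is the jar linked below); the example labels are hypothetical, and exact overlap is assumed (labels either match or they do not).

```python
def micro_f1(gold, pred):
    """Micro-averaged F1: pool true/false positives and false negatives
    over all documents, then compute precision, recall and F1 once."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g, p = set(g), set(p)
        tp += len(g & p)   # labels predicted and present in the gold standard
        fp += len(p - g)   # labels predicted but not in the gold standard
        fn += len(g - p)   # gold labels the system missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

# Hypothetical gold and predicted aspect labels for three documents
gold = [{"Zugfahrt:negative"}, {"Ticketkauf:neutral"}, {"Allgemein:negative"}]
pred = [{"Zugfahrt:negative"}, {"Ticketkauf:positive"}, {"Allgemein:negative"}]
print(round(micro_f1(gold, pred), 3))  # tp=2, fp=1, fn=1 -> 0.667
```

Because counts are pooled before averaging, frequent labels dominate the score, which is the intended behaviour of the micro average.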
An executable jar, usage instructions and sources can be found at https://github.com/muchafel/GermEval2017.
The test data will be published in both XML and TSV format. We will publish the data in the following manner:
- The fields that are present in every document (relevance and document sentiment, from Subtasks A and B) are given the label unknown, which must be replaced by the participants.
- The optional fields (aspects and OTEs, from Subtasks C and D) are removed entirely.
<Document id="XXX">
<relevance>unknown</relevance>
<sentiment>unknown</sentiment>
<text>Wie schön es ist wenn man sich nach nem nervigen Arbeitstag auch noch über die Bahn ärgern muss.</text>
</Document>
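A participant's system essentially has to read each such document and overwrite the unknown fields with its predictions. The following is a minimal sketch of that step using Python's standard `xml.etree.ElementTree`; the document id and the `predict` stub are hypothetical placeholders, not part of the task description.

```python
import xml.etree.ElementTree as ET

# A test document in the format shown above (id value illustrative)
doc_xml = """<Document id="42">
<relevance>unknown</relevance>
<sentiment>unknown</sentiment>
<text>Wie schön es ist wenn man sich nach nem nervigen Arbeitstag auch noch über die Bahn ärgern muss.</text>
</Document>"""

def predict(text):
    # Hypothetical stub; a real classifier for Subtasks A and B goes here
    return "true", "negative"

doc = ET.fromstring(doc_xml)
relevance, sentiment = predict(doc.findtext("text"))
doc.find("relevance").text = relevance   # replace the unknown label
doc.find("sentiment").text = sentiment   # replace the unknown label
print(ET.tostring(doc, encoding="unicode"))
```

The filled-in document can then be serialized back to XML (or converted to TSV) for submission.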
To further illustrate the test data, a transformed version of the development set can be found here.