Results
Official ranking of competing task submissions to the tokenization subtask:
SoMaJo (F1 = 99.57%)
AIPHES (F1 = 99.36%)
COW (F1 = 98.98%)
LTL-UDE (F1 = 98.90%)
For comparison, the tokenizer of the Stanford tagger achieves F1 = 98.38% on these data.
Official ranking of competing task submissions to the PoS tagging subtask:
UdS-distributional (acc = 90.44%)
LTL-UDE (acc = 89.09%)
AIPHES (acc = 88.75%)
bot.zen (acc = 88.03%) [late submission]
For comparison, TreeTagger achieves an accuracy of 82.48% on these data.
Detailed results can be found in the task description paper:
Michael Beißwenger, Sabine Bartsch, Stefan Evert and Kay-Michael Würzner (2016). EmpiriST 2015: A shared task on the automatic linguistic annotation of computer-mediated communication and web corpora. In Proceedings of the 10th Web as Corpus Workshop (WAC-X) and the EmpiriST Shared Task, pages 44–56. Berlin, Germany.
as well as in the presentation slides from the final workshop and in an online spreadsheet.
Detailed performance figures for each individual text sample are provided as tab-delimited text tables, which can easily be analyzed with statistical software such as R or loaded into a spreadsheet editor.
Download evaluation results for individual files: empirist_results.zip
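Since the per-file results are tab-delimited text, they can be processed with a few lines of code in any language. The sketch below uses Python's standard csv module; the column names (file, precision, recall, f1) are hypothetical and only illustrate the idea, since the actual headers in empirist_results.zip may differ.

```python
import csv
import io

# Hypothetical excerpt of a tab-delimited per-file results table.
# The real column names in empirist_results.zip may differ.
tsv_data = (
    "file\tprecision\trecall\tf1\n"
    "sample01.txt\t99.60\t99.54\t99.57\n"
    "sample02.txt\t98.90\t99.06\t98.98\n"
)

# DictReader splits each row on the tab delimiter and keys fields by header.
rows = list(csv.DictReader(io.StringIO(tsv_data), delimiter="\t"))

# Example analysis: average F1 across the individual text samples.
mean_f1 = sum(float(r["f1"]) for r in rows) / len(rows)
print(f"{mean_f1:.3f}")  # prints 99.275 for this toy excerpt
```

For a downloaded file, replace the io.StringIO wrapper with open("results.txt", newline="") and the same reader works unchanged.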