Four teams participated in the SeeDev-binary task and submitted 6 runs:
You may check and evaluate your predictions:
The evaluation is described on the BioNLP-ST 2016 SeeDev page.
You can download all the charts and tables shown below.
Here are the global results for each run expressed in F1, Recall and Precision.
The confidence interval has been obtained by bootstrap resampling (n=100).
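As an illustration of how such an interval can be computed, here is a minimal sketch of a percentile bootstrap over F1. It is not the official evaluation code: the resampling unit (documents), the 95% level, the percentile method, and the names `prf` and `bootstrap_f1_interval` are all assumptions made for this example. Only the number of resamples (100) comes from the text above.

```python
import random


def prf(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def bootstrap_f1_interval(doc_counts, n_resamples=100, alpha=0.05, seed=0):
    """Percentile bootstrap interval for F1.

    doc_counts: list of (tp, fp, fn) tuples, one per document (assumed resampling unit).
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_resamples):
        # Draw as many documents as in the original set, with replacement.
        sample = [rng.choice(doc_counts) for _ in doc_counts]
        tp = sum(c[0] for c in sample)
        fp = sum(c[1] for c in sample)
        fn = sum(c[2] for c in sample)
        scores.append(prf(tp, fp, fn)[2])
    scores.sort()
    low = scores[int((alpha / 2) * n_resamples)]
    high = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return low, high


if __name__ == "__main__":
    # Made-up counts for illustration only; not data from the task.
    counts = [(3, 1, 2), (5, 2, 1), (0, 1, 3), (4, 0, 2)]
    print(bootstrap_f1_interval(counts))
```

With only 100 resamples the interval bounds are fairly coarse; a larger number of resamples would give smoother estimates at a higher cost.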
Each axis represents a different type of relation.
Relation types have been grouped into clusters of similar types, which reduces the number of relation categories.
The tick on each bar indicates the gain compared to the global results.
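The per-cluster view can be sketched as follows. This is a hedged illustration, not the organizers' procedure: it assumes that "gain" means the cluster's F1 minus the global F1, and the relation type names, the cluster names, and the function `cluster_gains` are hypothetical placeholders.

```python
from collections import defaultdict


def f1(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cluster_gains(counts_by_type, cluster_of):
    """Return {cluster: cluster F1 - global F1}.

    counts_by_type: {relation_type: (tp, fp, fn)}
    cluster_of:     {relation_type: cluster_name} (hypothetical mapping)
    """
    by_cluster = defaultdict(lambda: [0, 0, 0])
    total = [0, 0, 0]
    for rel_type, counts in counts_by_type.items():
        cluster = cluster_of[rel_type]
        for i, value in enumerate(counts):
            by_cluster[cluster][i] += value
            total[i] += value
    global_f1 = f1(*total)
    # Assumption: gain is the plain difference between cluster and global F1.
    return {c: f1(*v) - global_f1 for c, v in by_cluster.items()}


if __name__ == "__main__":
    # Placeholder types, clusters, and counts; not taken from the task data.
    cluster_of = {"Type_A": "cluster_1", "Type_B": "cluster_1", "Type_C": "cluster_2"}
    counts = {"Type_A": (2, 3, 3), "Type_B": (3, 2, 2), "Type_C": (10, 0, 0)}
    print(cluster_gains(counts, cluster_of))
```

Counts are pooled within each cluster before scoring (a micro-average), so frequent relation types weigh more than rare ones; averaging per-type F1 instead would be an equally plausible reading.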