We encourage participants to re-use the SemEval 2015 Task 4: Timelines dataset to provide their own annotations, interpretations, and system results. The data will be collected before the workshop and summarized to facilitate an insightful comparison. The results of the combined manual and automatic annotation of the common dataset will be used to drive the discussion around three themes:
Definitions: what is a storyline? how can it be formally and computationally formulated?
Resources: what are the core markables of a storyline? how should the annotation of storylines be performed? can existing annotation schemes be re-used and adapted for storyline annotation? how should we annotate cross-document information concerning events and character perspectives? is it feasible to develop a StoryBank for evaluation?
Evaluation: how do we determine whether an extracted storyline is “good enough”? can standard measures, such as Precision, Recall, and F-measure, be applied to evaluate storyline extraction, or do we need different measures (see the sketch below)? should evaluation take place at a global level, or must it be conducted separately on the different components of a storyline system?
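To make the measurement question more concrete, here is a minimal, illustrative sketch (in Python) of how standard Precision, Recall, and F-measure could be applied if a storyline is simplified to a flat set of event mentions. The set-based view and the event identifiers are assumptions for illustration only; a real storyline also involves ordering and relations between events that such measures do not capture, which is exactly what the discussion theme asks about.

```python
# Illustrative sketch only: scoring a system storyline against a gold storyline
# by treating each storyline as a flat set of event mentions. Event IDs below
# are hypothetical; ordering and event relations are deliberately ignored.

def precision_recall_f1(gold_events, system_events):
    """Return (precision, recall, F1) for two collections of event mentions."""
    gold, system = set(gold_events), set(system_events)
    true_positives = len(gold & system)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Example with made-up event identifiers:
gold = ["e1", "e2", "e3", "e4"]    # events in the gold-standard storyline
system = ["e1", "e3", "e5"]        # events extracted by a system
print(precision_recall_f1(gold, system))  # -> (0.666..., 0.5, 0.571...)
```

Whether such set-level scores are meaningful for storylines, or whether component-wise evaluation (event detection, linking, ordering, perspective attribution) is needed instead, is one of the questions we hope participants will address.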
The manually annotated data of the SemEval 2015 Task 4: Timelines task can be requested and downloaded at this link. In case of problems, please contact t.caselli@vu.nl.
The data can be uploaded into the CAT tool for easier exploration and for adding new annotations.
To get access to the CAT tool, please fill in this form and specify "NewsStory15" as the "Intended use of the Project". For problems with CAT, please contact CAT.support@fbk.eu.
Enjoy!!