CRAFT Shared Tasks 2019
Integrated Structure, Semantics, and Coreference
Hosted as part of the BioNLP Open Shared Tasks, the CRAFT Shared Tasks of 2019 will be the first shared tasks to use the diverse annotation types made available by CRAFT, which span structure, semantics, and coreference.
Three separate tasks will be offered as part of CRAFT-ST 2019.
The Colorado Richly Annotated Full Text (CRAFT) corpus consists of 97 full-text journal articles selected from the Mouse Genome Informatics curation pipeline. These articles were manually annotated for a wide variety of language phenomena. Structural markup includes sentence segmentation, tokenization, part-of-speech tags, grammatical dependencies, treebanking, document section boundaries, and typography (e.g., italics, boldface, subscript, superscript). For coreference, the corpus has been annotated with nominal coreference relations, including identity and appositive relations, for all coreferring base noun phrases. For semantic markup, all mentions of concepts explicitly represented in ten Open Biomedical Ontologies (OBOs) have been identified in these articles and mapped (“normalized”) to specific classes from the three Gene Ontology hierarchies, the Chemical Entities of Biological Interest ontology, the Cell Ontology, the Molecular Process Ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, and the Uberon anatomical ontology, as well as to a set of extension classes defined as logical combinations of proper OBO classes.
Development data: CRAFT v3.1.3 -- annotations for 67 articles that have been released publicly.
Evaluation data: annotations for 30 articles that have not been previously released.