The CoNLL-2010 Shared Task

[Farkas et al.]
Mild bladder wall thickening {raises the question of cystitis}
- When marking the keywords, the minimal unit that expresses hedging and determines the actual strength of hedging was marked as a keyword.
- On the other hand, the scope was extended to the largest syntactic unit possible.
- The scopes of the BioScope corpus are regarded as consecutive text spans and their annotation was based on constituency grammar.
- True positives were scopes which exactly matched the gold standard cue phrases and gold standard scope boundaries assigned to the cue word.
- Nested scopes have overlapping text spans which may contain cues for multiple scopes (there were 1058 occurrences in the training and evaluation datasets together). The XML format utilizes id-references to determine the scope of a given cue. Nested constructions are rather complicated to represent in the standard IOB format; moreover, we did not want to enforce a uniform tokenization.
- Task 1: 85.0/87.7/86.4
- Task 2: 59.6/55.2/57.3
- Each Task2 system was built upon a Task1 system, i.e. they attempted to recognize the scopes for the predicted cue phrases (however, Zhang et al. have argued that the objective functions of the Task1 and Task2 cue detection problems are different because of sentences containing multiple hedge spans). [CHECK]
- Most systems regarded multiple cues in a sentence as independent from each other and formed different classification instances from them. There were 3 systems which incorporated information about other hedge cues of the sentence into the feature space, and Zhang et al. constructed a cascade system which directly utilized the predicted scopes when predicting other scopes in the same sentence.
- They use FIRST/LAST/NONE labels.
- Sequence labeling vs. token classification.
- The feature sets are the same in Task2, with added features that relate the cue phrase to the token (for example, the dependency path).

[Vlachos et al.]
Our discussion of Task2 focuses on identifying the scope of a given cue. The features used by the classifier to predict whether a token belongs to the scope of a particular cue are based on the shortest syntactic dependency path connecting them.

[Ji et al.]
We consider this problem as a word-cue pair classification problem, where word is any word in a sentence and cue is the identified hedge cue word. A word-level linear classifier is trained to predict whether each word-cue pair in a sentence is in the scope of the hedge cue.

[Morante]
"scope tags separated by a space, for as many cues as there are in the sentence"
An instance represents a pair of a predicted hedge cue and a token. All tokens in a sentence are paired with all hedge cues that occur in the sentence.

[Rei & Briscoe]
We find a scope for each cue predicted in the previous step. If a cue contains multiple words, they are each processed separately and the predictions are later combined by postprocessing rules. The tagging sequence from the manual rules is used as input to a second CRF, along with other features. The output of the classifier is a modified sequence of FILO tags. Next, scopes are constructed from tag sequences using the following rules: Scope start point is the firs... Finally, scopes are checked for partial overlap and any instances are corrected.
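
To make the token-cue pairing idea in these notes concrete, here is a minimal Python sketch. It is not taken from any of the cited systems: the function names `make_instances` and `scope_from_labels`, the choice of token index 4 as the cue position, and the handling of malformed label sequences (return no scope; if several LAST labels follow FIRST, take the last one) are all assumptions made for illustration. It only shows how "all tokens paired with all cues" yields one classification instance per pair, and how a FIRST/LAST/NONE label sequence for one cue could be turned back into a consecutive span.

```python
from typing import List, Optional, Tuple


def make_instances(tokens: List[str], cue_positions: List[int]) -> List[Tuple[int, int]]:
    """Pair every token with every cue: one (cue_index, token_index) instance per pair."""
    return [(cue, tok) for cue in cue_positions for tok in range(len(tokens))]


def scope_from_labels(labels: List[str]) -> Optional[Tuple[int, int]]:
    """Convert one cue's FIRST/LAST/NONE labels into an inclusive (start, end) span.

    Assumption (hypothetical policy): the scope starts at the first FIRST label
    and ends at the last LAST label at or after it; if either is missing,
    no scope is returned.
    """
    if "FIRST" not in labels:
        return None
    start = labels.index("FIRST")
    ends = [i for i in range(start, len(labels)) if labels[i] == "LAST"]
    if not ends:
        return None
    return (start, ends[-1])


if __name__ == "__main__":
    # Example sentence from the notes; the cue position (4) is assumed for illustration.
    tokens = "Mild bladder wall thickening raises the question of cystitis".split()
    instances = make_instances(tokens, [4])
    print(len(instances), "instances for one cue")

    # Hypothetical classifier output for that cue, one label per token.
    labels = ["NONE"] * 4 + ["FIRST"] + ["NONE"] * 3 + ["LAST"]
    span = scope_from_labels(labels)
    if span is not None:
        start, end = span
        print(" ".join(tokens[start:end + 1]))
```

Because each cue gets its own label sequence, nested scopes need no special encoding in this representation; the cost is that predictions for different cues in the same sentence are made independently unless extra features link them.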