Linguistic Phenomena
Each sentence pair (t1 and t2) in part of the System Validation subtask dataset has a category label that identifies a linguistic phenomenon.
The two tables on this page list these category labels for the Japanese subtask. (See also: category labels for the Chinese subtask.)
Only the linguistic phenomenon indicated by the category label is involved in deciding whether t1 entails t2 in the pair.
A single linguistic phenomenon involved in the entailment decision can affect more than one part of t1.
Insertion of a comma (,) is currently not regarded as a linguistic phenomenon.
Therefore, an entailment decision may involve comma insertion in addition to the labeled linguistic phenomenon.
We compiled the list of category labels on the basis of those proposed in the following related work.
- Bentivogli et al. (2010) Building Textual Entailment Specialized Data Sets: a Methodology for Isolating Linguistic Phenomena Relevant to Inference.
- Sammons et al. (2010) “Ask not what Textual Entailment can do for You…”
Category labels of the form "*:phrase", such as "entailment:phrase" and "disagree:phrase", cover miscellaneous linguistic phenomena.
These are so-called "other" labels: a sentence pair in which expressions in t1 correspond to expressions in t2 through a complicated alignment is annotated with one of them.
These labels may be subdivided into several sub-categories in the future.
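As a rough illustration of how the "*:phrase" labels might be picked out when processing the dataset, the following Python sketch splits a label at its colon. It assumes every category label has the shape "<group>:<phenomenon>", which is inferred only from the two examples above. The names CategoryLabel, parse_label, and is_other_label are hypothetical and not part of any official tool.

```python
from typing import NamedTuple


class CategoryLabel(NamedTuple):
    group: str       # part before the colon, e.g. "entailment" (hypothetical field name)
    phenomenon: str  # part after the colon, e.g. "phrase" (hypothetical field name)


def parse_label(label: str) -> CategoryLabel:
    """Split a label at its first colon; assumes the "<group>:<phenomenon>" shape."""
    group, sep, phenomenon = label.partition(":")
    if not sep or not group or not phenomenon:
        raise ValueError(f"unexpected label format: {label!r}")
    return CategoryLabel(group, phenomenon)


def is_other_label(label: str) -> bool:
    """Return True for the miscellaneous "*:phrase" labels."""
    return parse_label(label).phenomenon == "phrase"
```

For example, is_other_label("disagree:phrase") is true, while a label whose second part is anything other than "phrase" is treated as a specific phenomenon.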