In the TEI version of the DISCO corpus, each line element carries its enjambment tags, assigned according to our typology, in an enjamb attribute. A certitude value for each enjambment tag is given in a cert attribute.
We explain here the content of the enjamb and cert attributes based on the following example:
<l n="1" ... enjamb="B-pb_noun_prep" cert="medium"> ...</l>
<l n="2" ... enjamb="B-cc_crossclause I-pb_noun_prep" cert="high medium"> ...</l>
<l n="3" ... enjamb="I-cc_crossclause" cert="high"> ...</l>
The enjambment tags correspond to our typology, described [here].
The B or I prefix in the tag indicates whether the line is the first (B) in the line-pair under enjambment, or the second line (I). B stands for Beginning and I stands for Inside.
In the example above, tag B-pb_noun_prep on line 1 indicates that an enjambment of type pb_noun_prep starts on line 1. On line 2, the I-pb_noun_prep tag indicates that this is the second line in the enjambed-line pair starting on line 1.
Some lines (like line 2 in the example) can be the second line in a given enjambment, and the first line in another one. So they bear enjambment tags prefixed with both B and I.
When both a B-prefixed and an I-prefixed tag appear in the enjamb attribute, they are separated by a space, and the B-prefixed tag always precedes the I-prefixed tag.
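The B/I convention described above can be read back into enjambed line pairs. The following is a minimal sketch, not part of the corpus tooling: the function names `split_tag` and `enjambed_pairs` are hypothetical, and it assumes that the input list holds adjacent lines of a single poem in document order.

```python
def split_tag(tag):
    """Split a tag such as 'B-pb_noun_prep' into ('B', 'pb_noun_prep')."""
    prefix, _, enjamb_type = tag.partition("-")
    return prefix, enjamb_type


def enjambed_pairs(lines):
    """Pair each B tag with the matching I tag on the following line.

    `lines` is a list of (line_number, enjamb_value) tuples; lines without
    enjambment can be given an empty string (assumption of this sketch).
    Returns a list of (first_line, second_line, enjambment_type) tuples.
    """
    pairs = []
    for (n1, enjamb1), (n2, enjamb2) in zip(lines, lines[1:]):
        # Types that start on the first line of the candidate pair
        started = {t for p, t in map(split_tag, enjamb1.split()) if p == "B"}
        # Types that continue on the second line of the candidate pair
        continued = {t for p, t in map(split_tag, enjamb2.split()) if p == "I"}
        for enjamb_type in sorted(started & continued):
            pairs.append((n1, n2, enjamb_type))
    return pairs


# Values taken from the example above
example = [
    (1, "B-pb_noun_prep"),
    (2, "B-cc_crossclause I-pb_noun_prep"),
    (3, "I-cc_crossclause"),
]
pairs = enjambed_pairs(example)
print(pairs)
```

On the example, this yields one pair for lines 1 and 2 (pb_noun_prep) and one for lines 2 and 3 (cc_crossclause).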
Certitude values are given in the cert attribute of the line elements. When a line has both a B tag and an I tag (see above), cert contains two values separated by a space, corresponding to the B tag and the I tag respectively (i.e. the values in cert follow the same order as the tags in enjamb that they refer to).
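Because the values in cert follow the order of the tags in enjamb, the two attributes can be aligned position by position. The sketch below illustrates this with Python's standard `xml.etree.ElementTree`. The `SAMPLE` fragment is a simplified stand-in for the corpus files: it omits the TEI namespace and the other attributes elided in the example, and the wrapping `lg` element and the untagged line 4 are added for illustration only. The helper name `tags_with_cert` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a corpus fragment (no TEI namespace, assumption)
SAMPLE = """<lg>
  <l n="1" enjamb="B-pb_noun_prep" cert="medium">...</l>
  <l n="2" enjamb="B-cc_crossclause I-pb_noun_prep" cert="high medium">...</l>
  <l n="3" enjamb="I-cc_crossclause" cert="high">...</l>
  <l n="4">...</l>
</lg>"""


def tags_with_cert(line):
    """Return a list of (enjambment_tag, cert_value) pairs for one line element."""
    tags = line.get("enjamb", "").split()
    certs = line.get("cert", "").split()
    if len(tags) != len(certs):
        raise ValueError(f"line {line.get('n')}: enjamb and cert differ in length")
    # cert values are listed in the same order as the tags they refer to
    return list(zip(tags, certs))


root = ET.fromstring(SAMPLE)
annotations = {line.get("n"): tags_with_cert(line) for line in root.iter("l")}

for n, pairs in annotations.items():
    print(n, pairs)
```

When reading the actual TEI files, the element lookup would most likely need the TEI namespace (`http://www.tei-c.org/ns/1.0`), for instance through the `namespaces` argument of `findall`.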
The certitude values for enjambment tags were defined based on the F1 scores that our automatic enjambment detection tool obtains for each enjambment type. The evaluation used 100 sonnets from between the 15th and 19th centuries; see the per-type table for the SonnetEvol corpus [here] for more details.
The F1 ranges for each cert value are as follows:
Note: F1 is expressed on a 0 to 100 scale here.
Based on the above ranges, each enjambment tag in our typology receives the following cert values: