Formal Run Results


"System-Run" identifies a run by, in order, the system name, the target language, the target subtask, and the run number.

Accuracy is the ratio of the number of correct answers to the number of pairs of t1 and t2.

Macro-F1 is the F1 measure averaged over labels (Y/N, E/C/U, or F/B/C/I). The F1 measure is the harmonic mean of precision and recall.

x-F1, x-Prec. and x-Rec. represent the F1 measure, precision and recall for the label "x", respectively.
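
The definitions above can be sketched in a few lines of Python. This is only an illustration, not the official evaluation tool: the function name `evaluate`, the choice to take the label set from the gold answers, and the convention of scoring 0 when a denominator is empty are all assumptions, and the sample labels are made up.

```python
from collections import Counter


def evaluate(gold, pred):
    """Return (accuracy, macro_f1, per_label) for parallel label lists.

    per_label maps each label x to (x-Prec., x-Rec., x-F1).
    Assumption: the label set is taken from the gold answers, and any
    ratio with an empty denominator is scored as 0.0.
    """
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")

    n_gold = Counter(gold)   # number of pairs whose gold label is x
    n_pred = Counter(pred)   # number of pairs the system labelled x
    n_hit = Counter(g for g, p in zip(gold, pred) if g == p)  # correct per label

    # Accuracy: correct answers divided by the number of (t1, t2) pairs.
    accuracy = sum(n_hit.values()) / len(gold)

    per_label = {}
    for x in sorted(n_gold):
        prec = n_hit[x] / n_pred[x] if n_pred[x] else 0.0
        rec = n_hit[x] / n_gold[x]
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        per_label[x] = (prec, rec, f1)

    # Macro-F1: unweighted average of the per-label F1 values.
    macro_f1 = sum(f1 for _, _, f1 in per_label.values()) / len(per_label)
    return accuracy, macro_f1, per_label


# Made-up example with the binary labels Y/N.
gold = ["Y", "Y", "N", "N"]
pred = ["Y", "N", "N", "N"]
accuracy, macro_f1, per_label = evaluate(gold, pred)
print(accuracy, macro_f1, per_label)
```

Because Macro-F1 averages over labels without weighting, a rare label counts as much as a frequent one, which is why it can differ noticeably from Accuracy on unbalanced data.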

The columns of JA, JB, MS, PE, WA and WB show correct answer ratios for multiple-choice questions of the National Center Test.