Constituency Parsing Results:
| | Domain A (answers) | | | | Domain B (newsgroups) | | | | Domain C (reviews) | | | | Domain D (wsj) | | | | Average (A-C) | | | |
| Team | LP | LR | F1 | POS | LP | LR | F1 | POS | LP | LR | F1 | POS | LP | LR | F1 | POS | LP | LR | F1 | POS |
| BerkeleyParser* | 75.86 | 75.98 | 75.92 | 90.20 | 77.87 | 78.42 | 78.14 | 91.24 | 77.65 | 76.68 | 77.16 | 89.33 | 88.34 | 88.08 | 88.21 | 97.08 | 77.13 | 77.03 | 77.07 | 90.26 |
| OHSU | 73.21 | 74.60 | 73.90 | 90.15 | 73.22 | 75.48 | 74.33 | 91.14 | 76.31 | 76.22 | 76.27 | 90.05 | 83.17 | 83.79 | 83.48 | 96.84 | 74.25 | 75.43 | 74.83 | 90.45 |
| Vanderbilt | 75.09 | 76.78 | 75.93 | 91.76 | 78.10 | 79.05 | 78.57 | 92.91 | 77.74 | 78.18 | 77.96 | 91.94 | 87.82 | 88.00 | 87.91 | 97.49 | 76.98 | 78.00 | 77.49 | 92.20 |
| IMS | 79.46 | 78.10 | 78.78 | 90.22 | 80.85 | 80.12 | 80.48 | 91.09 | 81.31 | 78.61 | 79.94 | 89.93 | 89.83 | 88.96 | 89.39 | 97.31 | 80.54 | 78.94 | 79.73 | 90.41 |
| Stanford | 78.79 | 77.91 | 78.35 | 91.21 | 81.41 | 80.49 | 80.95 | 91.62 | 81.95 | 80.32 | 81.13 | 92.45 | 90.00 | 88.93 | 89.46 | 97.01 | 80.72 | 79.57 | 80.14 | 91.76 |
| Alpage-1 | 80.67 | 80.36 | 80.52 | 91.17 | 84.22 | 83.12 | 83.67 | 93.22 | 82.01 | 81.04 | 81.52 | 91.58 | 90.20 | 89.62 | 89.91 | 97.20 | 82.30 | 81.51 | 81.90 | 91.99 |
| Alpage-2 | 80.77 | 80.43 | 80.60 | 91.14 | 84.71 | 83.36 | 84.03 | 92.58 | 82.28 | 81.24 | 81.76 | 91.63 | 90.19 | 89.56 | 89.87 | 97.22 | 82.59 | 81.68 | 82.13 | 91.78 |
| DCU-Paris13-2 | 80.02 | 79.22 | 79.62 | 91.61 | 83.13 | 82.18 | 82.65 | 93.60 | 82.92 | 82.12 | 82.52 | 92.96 | 88.43 | 88.29 | 88.36 | 97.29 | 82.02 | 81.17 | 81.60 | 92.72 |
| DCU-Paris13-1 | 82.96 | 81.43 | 82.19 | 91.63 | 85.01 | 83.65 | 84.33 | 93.39 | 84.79 | 83.29 | 84.03 | 92.89 | 90.75 | 90.32 | 90.53 | 97.53 | 84.25 | 82.79 | 83.52 | 92.64 |
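LP and LR are labeled bracketing precision and recall as scored by evalb, F1 is their harmonic mean, and the Average (A-C) columns are plain macro-averages over the three web domains. A minimal sketch of these derivations (illustrative Python; the helper names are ours, not part of the official scoring scripts), checked against the DCU-Paris13-1 row:

```python
def f1(lp: float, lr: float) -> float:
    """Bracketing F1: harmonic mean of labeled precision (LP) and recall (LR)."""
    return 2 * lp * lr / (lp + lr)

def avg_a_to_c(a: float, b: float, c: float) -> float:
    """Macro-average over the three web domains (answers, newsgroups, reviews)."""
    return (a + b + c) / 3

# DCU-Paris13-1 on Domain A: LP=82.96, LR=81.43 -> F1 = 82.19, as in the table.
print(round(f1(82.96, 81.43), 2))                 # 82.19

# Average (A-C) F1 for DCU-Paris13-1: mean of 82.19, 84.33, 84.03 -> 83.52.
print(round(avg_a_to_c(82.19, 84.33, 84.03), 2))  # 83.52
```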
Dependency Parsing Results:

| | Domain A (answers) | | | Domain B (newsgroups) | | | Domain C (reviews) | | | Domain D (wsj) | | | Average (A-C) | | |
| Team | LAS | UAS | POS | LAS | UAS | POS | LAS | UAS | POS | LAS | UAS | POS | LAS | UAS | POS |
| Zhang&Nivre* | 76.60 | 81.59 | 89.74 | 81.62 | 85.19 | 91.17 | 78.10 | 83.32 | 89.60 | 89.37 | 91.46 | 96.84 | 78.77 | 83.37 | 90.17 |
| UPenn | 68.54 | 82.28 | 89.65 | 74.41 | 86.10 | 90.99 | 70.17 | 82.88 | 89.02 | 81.74 | 91.99 | 96.93 | 71.04 | 83.75 | 89.89 |
| UMass | 72.51 | 78.36 | 89.42 | 77.23 | 81.61 | 91.28 | 74.89 | 80.34 | 89.90 | 81.15 | 83.97 | 94.71 | 74.88 | 80.10 | 90.20 |
| NAIST | 73.54 | 79.89 | 89.92 | 79.83 | 84.59 | 91.39 | 75.72 | 81.99 | 90.47 | 87.95 | 90.99 | 97.40 | 76.36 | 82.16 | 90.59 |
| IMS-2 | 74.43 | 80.77 | 89.50 | 79.63 | 84.29 | 90.72 | 76.55 | 82.18 | 89.41 | 86.88 | 89.90 | 97.02 | 76.87 | 82.41 | 89.88 |
| IMS-3 | 75.90 | 81.30 | 88.24 | 79.77 | 83.96 | 89.70 | 77.61 | 82.38 | 88.15 | 86.02 | 88.89 | 95.14 | 77.76 | 82.55 | 88.70 |
| IMS-1 | 78.33 | 83.20 | 91.07 | 83.16 | 86.86 | 91.70 | 79.02 | 83.82 | 90.01 | 90.82 | 92.73 | 97.57 | 80.17 | 84.63 | 90.93 |
| Copenhagen | 78.12 | 82.91 | 90.42 | 82.90 | 86.59 | 91.15 | 79.58 | 84.13 | 89.83 | 90.47 | 92.42 | 97.25 | 80.20 | 84.54 | 90.47 |
| Stanford-2 | 77.50 | 82.57 | 90.30 | 83.56 | 87.18 | 91.49 | 79.70 | 84.37 | 90.46 | 89.87 | 91.95 | 95.00 | 80.25 | 84.71 | 90.75 |
| HIT-Baseline | 80.75 | 85.84 | 90.99 | 85.26 | 88.90 | 92.32 | 81.60 | 86.60 | 90.65 | 91.88 | 93.88 | 97.76 | 82.54 | 87.11 | 91.32 |
| HIT-Domain | 80.79 | 85.86 | 90.99 | 85.18 | 88.81 | 92.32 | 81.92 | 86.80 | 90.65 | 91.82 | 93.83 | 97.76 | 82.63 | 87.16 | 91.32 |
| Stanford-1 | 81.01 | 85.70 | 90.30 | 85.85 | 89.10 | 91.49 | 82.54 | 86.73 | 90.46 | 91.50 | 93.38 | 95.00 | 83.13 | 87.18 | 90.75 |
| DCU-Paris13 | 81.15 | 85.80 | 91.79 | 85.38 | 88.74 | 93.81 | 83.86 | 88.31 | 93.11 | 89.67 | 91.79 | 97.29 | 83.46 | 87.62 | 92.90 |
* Baseline models trained only on the OntoNotes WSJ training corpus. For constituents this is the publicly available BerkeleyParser (Petrov et al., ACL 2006); for dependencies it is a reimplementation of the transition-based parser of Zhang & Nivre (ACL 2011) with the TnT part-of-speech tagger (Brants, ANLP 2000).
POS tag accuracies differ slightly across the two evaluations due to rounding and tiny discrepancies between evalb and eval.pl.
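LAS and UAS are the standard labeled and unlabeled attachment scores: the percentage of tokens assigned the correct head and dependency label (LAS), or the correct head regardless of label (UAS). A toy sketch of the computation, assuming per-token (head, label) pairs; the official eval.pl handles details such as punctuation treatment that this version omits, and the function name is ours, not the script's:

```python
def attachment_scores(gold, pred):
    """gold, pred: lists of (head_index, dep_label) pairs, one per token."""
    assert len(gold) == len(pred)
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))  # head correct
    las_hits = sum(g == p for g, p in zip(gold, pred))        # head and label correct
    n = len(gold)
    return 100.0 * las_hits / n, 100.0 * uas_hits / n

# Toy 4-token sentence: one wrong label (token 3), one wrong head (token 4).
gold = [(2, "nsubj"), (0, "root"), (2, "dobj"), (3, "amod")]
pred = [(2, "nsubj"), (0, "root"), (2, "iobj"), (2, "amod")]
las, uas = attachment_scores(gold, pred)
print(f"LAS={las:.1f} UAS={uas:.1f}")  # LAS=50.0 UAS=75.0
```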