Data

Trial Data - Released July 31, 2020

The trial data is available here.

The trial set includes 99 MWEs from 3 domains (29 Bible, 33 biomedical, and 37 Europarl) and 421 single words (143 Bible, 135 biomedical, and 143 Europarl).


Training Data - Released September 4, 2020

The training data is available here.

The training set includes 1,517 MWEs from 3 domains (505 Bible, 514 biomedical, and 498 Europarl) and 7,662 single words (2,574 Bible, 2,576 biomedical, and 2,512 Europarl).


Test Data - Released January 11, 2021

The test data is available here.