Collocates of Gender-based violence are extracted using a statistical score and then curated manually. Here are three visualisations showing the distribution of collocations by organisation type and document type.
The collocate overview below lets you explore all the extracted collocates. It features a logDice filter and a filter that darkens collocates appearing in one or more subcorpora by organisation type. Clicking on a collocate highlights every other occurrence of it in the other subcorpora, provided such occurrences exist and have not been filtered out.
The measure used here to identify collocates is known as logDice. It is a statistical score that expresses how typical a given collocation is. It is based solely on three frequencies: the frequency of Gender-based violence, the frequency of the collocate, and the frequency of the collocation as a whole. The size of the corpus does not affect the score, so scores can be compared across different corpora. logDice is the preferred statistical measure for large corpora.
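For readers who want to see how such a score is computed, here is a minimal sketch of the standard logDice formula, 14 + log2(2 × f(node, collocate) / (f(node) + f(collocate))). The function name and parameter names are illustrative assumptions and are not taken from the tool behind these visualisations. Note that the corpus size does not appear anywhere in the calculation.

```python
import math


def log_dice(f_node: int, f_collocate: int, f_pair: int) -> float:
    """Return the logDice score for a node word and one of its collocates.

    f_node      -- frequency of the node (here, "Gender-based violence")
    f_collocate -- frequency of the collocate
    f_pair      -- frequency of the two occurring together as a collocation
    All names are hypothetical and chosen for this illustration only.
    """
    if f_node <= 0 or f_collocate <= 0 or f_pair <= 0:
        raise ValueError("all frequencies must be positive")
    # Dice coefficient: share of the two words' occurrences that are joint.
    dice = 2 * f_pair / (f_node + f_collocate)
    # The constant 14 shifts the score so that its theoretical maximum is 14.
    return 14 + math.log2(dice)


# A collocate that only ever occurs together with the node reaches the maximum.
print(log_dice(100, 100, 100))
# A less exclusive pairing scores lower.
print(round(log_dice(1000, 200, 50), 2))
```

Because the score is on a base-2 logarithmic scale, a difference of one point corresponds to a collocation that is twice as typical.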
Below is a visualisation that lets you look at all unique collocates as well as those shared between subcorpora. It features three filters. The logDice filter increases or decreases the number of collocates shown based on their logDice score. The Found In filter lets you select collocates by the number of subcorpora that share them. You can also remove organisation types by unticking their boxes in the Org Type filter. Clicking on a collocate highlights every other occurrence of it in the other subcorpora, provided such occurrences exist and have not been filtered out.
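The way the three filters combine can be sketched roughly as follows. This is an illustration only: the records, organisation type names, scores, and function name are all made up, and the sketch assumes that the Found In count is taken after the other two filters have been applied, which may not match how the visualisation actually behaves.

```python
from collections import defaultdict

# Hypothetical records: (collocate, organisation type, logDice score).
# These values are invented for illustration and are not real results.
records = [
    ("survivors", "NGO", 9.1),
    ("survivors", "IGO", 8.4),
    ("prevention", "NGO", 7.2),
    ("prevention", "IGO", 6.0),
    ("prevention", "Government", 7.9),
    ("risk", "Government", 5.5),
]


def filter_collocates(records, min_log_dice, found_in, org_types):
    """Apply the logDice, Org Type and Found In filters, in that order."""
    # logDice filter and Org Type filter: keep rows above the threshold
    # whose organisation type is still ticked.
    kept = [
        (collocate, org, score)
        for collocate, org, score in records
        if score >= min_log_dice and org in org_types
    ]
    # Count in how many remaining subcorpora each collocate is found.
    subcorpora = defaultdict(set)
    for collocate, org, _ in kept:
        subcorpora[collocate].add(org)
    # Found In filter: keep collocates shared by an allowed number of subcorpora.
    return sorted(row for row in kept if len(subcorpora[row[0]]) in found_in)


# Collocates with logDice >= 6.0 that are found in exactly two subcorpora.
shared_by_two = filter_collocates(
    records, 6.0, {2}, {"NGO", "IGO", "Government"}
)
print(shared_by_two)
```

In this sketch, unticking an organisation type changes how many subcorpora a collocate is counted in, so a collocate may move from one Found In group to another.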