CasualConc - Beta features - Collocation/Cooccurrence

Regular Features

Collocation Span

You can specify the collocation span to display in the table, anywhere from L5 (five words to the left of the keyword) to R5 (five words to the right).
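To make the idea of a span concrete, here is a minimal Python sketch that counts the words found at each position around a keyword. It is only an illustration, not CasualConc's own code: the function name `collocates_by_position`, its parameters, and the toy sentence are all made up for this example.

```python
from collections import Counter

def collocates_by_position(tokens, keyword, left=3, right=3):
    """Count the words at each position around every hit of `keyword`.

    Returns a dict that maps position labels ("L3" ... "L1", "R1" ... "R3")
    to Counter objects. The key position itself is not counted.
    """
    labels = [f"L{i}" for i in range(left, 0, -1)]
    labels += [f"R{i}" for i in range(1, right + 1)]
    table = {label: Counter() for label in labels}

    for i, token in enumerate(tokens):
        if token != keyword:
            continue
        # Words to the left of the hit (skipped when the text starts too early).
        for offset in range(1, left + 1):
            if i - offset >= 0:
                table[f"L{offset}"][tokens[i - offset]] += 1
        # Words to the right of the hit (skipped when the text ends too early).
        for offset in range(1, right + 1):
            if i + offset < len(tokens):
                table[f"R{offset}"][tokens[i + offset]] += 1
    return table

# Toy data, invented for illustration only.
tokens = "the new government said the government will cut taxes".split()
table = collocates_by_position(tokens, "government", left=3, right=3)

for label, counts in table.items():
    print(label, dict(counts))
```

Narrowing or widening the span corresponds to changing `left` and `right` in this sketch.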

If you select L3 ~ R3, the result will look like this:

Ignoring Keywords

As of the next version, frequencies in the key position are counted and displayed but are not included in the total (LR Total).

If you check Ignore Keywords in Preferences -> Others -> Collocation, you can completely ignore frequencies in the key position.

With this corpus, 'government' occurs 596 times in the key position and 606 times in total, but does not appear on the table.

Sorting Cooccurrence Table

You can sort the Cooccurrence results by the following criteria.

The result of sorting by frequency.

The result of sorting by MI.
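The difference between the two sort orders can be sketched in Python. The formula CasualConc uses for MI is not stated here, so this example assumes a common definition, MI = log2(O × N / (f_node × f_collocate)), where O is the co-occurrence frequency and N is the corpus size. The function name and all the figures are invented for illustration.

```python
import math

def mutual_information(cooccur, node_freq, colloc_freq, corpus_size):
    """A common MI definition (an assumption, not necessarily CasualConc's)."""
    return math.log2(cooccur * corpus_size / (node_freq * colloc_freq))

# (word, co-occurrence frequency, corpus frequency), made-up figures
rows = [
    ("the", 400, 60000),
    ("federal", 50, 200),
    ("local", 30, 500),
]
NODE_FREQ = 600          # hypothetical frequency of the keyword
CORPUS_SIZE = 1_000_000  # hypothetical corpus size in tokens

by_freq = sorted(rows, key=lambda r: r[1], reverse=True)
by_mi = sorted(
    rows,
    key=lambda r: mutual_information(r[1], NODE_FREQ, r[2], CORPUS_SIZE),
    reverse=True,
)

print([word for word, _, _ in by_freq])
print([word for word, _, _ in by_mi])
```

With figures like these, a very frequent function word tops the frequency ranking, while MI favors a rarer word that occurs near the keyword far more often than its overall frequency would predict.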

Including frequency info in the Cooccurrence export

You can now include the frequency of each word in the exported Cooccurrence table. To enable this, go to Preferences -> Others -> Collocation and check Include frequency info in the Cooccurrence export.

The exported result will look like this:

Experimental Feature

This feature is entirely experimental. If you find any errors or strange results, please report them to me.

Visualization of collocation information - BASIC

You can visualize collocation information. You need to run Word Count before you use this feature.

First, search for a word in Collocation, and then click Visual.

The Visualizer panel appears.

Basic Settings

You need to specify which collocation information to use for visualization. The following choices are available.

Then, select a position of collocation or a span.

Position

Span

Finally, select the number of words to use. That many words, counted from the top of the table, will be used for visualization. You can also select a display font.

Click Process to visualize the collocation information.

The following are the results using Frequency and MI.

Frequency

MI

Visualization of collocation information - ADVANCED

Ignore zero occurrence

You can remove words that do not occur at the specified position or within the specified span. To do so, check Ignore zero occurrence.

If zero-frequency words are excluded from the above MI sample, the result will look like this:
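In terms of data, this option amounts to a simple filter. The Python sketch below is only an illustration: the row layout and the function name `visible_words` are hypothetical, and it assumes the top words are taken from the table first and the zero-occurrence words are dropped afterwards, which may not match the order CasualConc actually uses.

```python
def visible_words(rows, positions, top_n, ignore_zero=True):
    """Pick the words to visualize from a collocation table.

    rows:      table rows in display order, each with a word and per-position counts
    positions: one position (e.g. ["L1"]) or a span (e.g. ["L1", "R1"])
    top_n:     number of words taken from the top of the table
    """
    selected = rows[:top_n]
    if ignore_zero:
        # Keep only words that occur at least once in the chosen position(s).
        selected = [
            row for row in selected
            if sum(row["counts"].get(p, 0) for p in positions) > 0
        ]
    return [row["word"] for row in selected]

# Made-up table rows for illustration.
rows = [
    {"word": "federal", "counts": {"L1": 12, "R1": 0}},
    {"word": "said", "counts": {"L1": 0, "R1": 9}},
    {"word": "policy", "counts": {"L1": 0, "R1": 4}},
]

single_position = visible_words(rows, ["L1"], top_n=3)
span = visible_words(rows, ["L1", "R1"], top_n=2)
unfiltered = visible_words(rows, ["L1"], top_n=3, ignore_zero=False)
print(single_position, span, unfiltered)
```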

Include frequency information

You can include frequency information in the visualization along with the values of another statistic. To do so, check Include Freq Info.

The above MI result with this option on will look like this:

If both Ignore zero occurrence and Include Freq Info options are on, the above MI result will look like this:

Converting LL values to log values

Because Log-Likelihood (LL) values follow a Zipf-like distribution, with a few very large values and many small ones, you can convert them to a log scale.

Normal

With log option
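The effect of the log option can be sketched numerically in Python. Which logarithm CasualConc applies is not stated here, so this example assumes log(1 + x), which keeps every converted value non-negative. The function name `to_log_scale` and the LL values are invented for illustration.

```python
import math

def to_log_scale(ll_values):
    """Compress LL values with log(1 + x). The exact transform is an assumption."""
    return [math.log1p(v) for v in ll_values]

# Made-up LL values with a Zipf-like shape: one huge value, then a long tail.
ll = [2500.0, 300.0, 40.0, 12.0, 5.0]
logged = to_log_scale(ll)

ratio_before = max(ll) / min(ll)          # spread of the raw values
ratio_after = max(logged) / min(logged)   # spread after conversion

print(ratio_before, round(ratio_after, 2))
```

The ranking of the words stays the same, but the largest value no longer dwarfs the rest, so differences among the smaller values remain visible.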

Using multiple collocation indexes

You can combine multiple collocation indexes in one visualization. To enable this feature, check the Use Multiple Info checkbox.

You can assign three indexes to three colors.

In this example, bluish or greenish font colors mean that the relative values of z-score and Log-log are low compared to the relative value of Log-Likelihood. However, since the actual values of each statistic can vary a lot, the displayed color scheme may not reflect the true relationships among the statistics. I still need to figure out the best way to visualize the relationships among the statistics' values. If you have any suggestions, I'd greatly appreciate them.
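One possible way to put three statistics on a comparable footing is to rescale each one to the 0-1 range before mixing colors. The Python sketch below shows that idea only. It is not how CasualConc computes its colors: the channel assignment, the function names, and the values are all assumptions made for illustration.

```python
def min_max(values):
    """Rescale values to the 0-1 range (all zeros if they are identical)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def rgb_colors(index_a, index_b, index_c):
    """Map three statistics to the red, green, and blue channels of each word."""
    channels = [min_max(index_a), min_max(index_b), min_max(index_c)]
    return [
        tuple(round(255 * channel[i]) for channel in channels)
        for i in range(len(index_a))
    ]

# Made-up values for three words.
ll = [120.0, 30.0, 0.0]     # Log-Likelihood -> red (assumed assignment)
z = [2.0, 10.0, 4.0]        # z-score        -> green
loglog = [1.0, 2.0, 5.0]    # Log-log        -> blue

colors = rgb_colors(ll, z, loglog)
print(colors)
```

Because each statistic is rescaled separately, a color only says how a word ranks within each statistic, not how the raw values compare across statistics.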

Statistics Table

You can check the collocation index values by clicking the Stats button.

The Stats Values table will appear.

If you enable multiple indexes, all of them will be on the table.

You can sort the values by clicking the header of any column.