CasualConc

© 2008-2009 Yasu Imao

Finding Collocation Information

To search for collocations, select the Collocation tab in the window.




Basic functions

Collocation search

Enter the word(s)/phrase(s) for which you want to find collocation information and click Search. Wildcards are supported; see Wildcard Search for more information. The current version keeps a search history from which you can choose previously searched words/phrases.


The result will look like this. In the latest version, the number in the most frequent position(s) (L5-R5) is shown in red.


The table lists the words that appear within five words to the left and right of the search word(s). For each word it shows the frequency at each position from L5 to R5, the total frequency from L5 to L1, the total frequency from R1 to R5, and the overall total from L5 to R5.

If you search for multiple words, or run a wildcard search that returns more than one keyword/phrase, collocations are calculated separately for each keyword/phrase.
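To make the layout of the table concrete, here is a minimal Python sketch of how such positional counts could be computed. It is not CasualConc's own code: the function names, the whitespace tokenization, and the decision to ignore sentence boundaries are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

SPAN = 5  # five words on each side of the search word, as in the table


def positional_collocates(tokens, node, span=SPAN):
    """Count how often each word occurs at L5..L1 and R1..R5 around `node`."""
    table = defaultdict(Counter)  # context word -> Counter keyed by position label
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        for offset in range(-span, span + 1):
            j = i + offset
            if offset == 0 or j < 0 or j >= len(tokens):
                continue  # skip the node itself and positions outside the text
            label = f"L{-offset}" if offset < 0 else f"R{offset}"
            table[tokens[j]][label] += 1
    return table


def row_totals(counts, span=SPAN):
    """Return (L5-L1 total, R1-R5 total, L5-R5 total) for one context word."""
    left = sum(counts[f"L{k}"] for k in range(1, span + 1))
    right = sum(counts[f"R{k}"] for k in range(1, span + 1))
    return left, right, left + right


# Toy text; a real corpus would need proper tokenization (assumption).
tokens = "the second language class and the first language course".split()
table = positional_collocates(tokens, "language")
print(table["the"]["L2"])        # 2
print(row_totals(table["the"]))  # (2, 1, 3)
```

Each entry of `table` corresponds to one row of the result: the per-position counts are the L5 to R5 columns, and `row_totals` gives the three total columns.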


Sorting results

You can specify the column to sort by, or re-sort the results after your search.


Setting minimum frequency

You can set the minimum frequency a word needs in order to be displayed in the table. Go to Preferences -> Others and type a number or use the stepper to change the value.

Words that occur fewer times than the Min. Freq. value will not appear in the list. This is useful when you want to calculate collocation statistics.


Cooccurrence

If you click the Cooccurrence tab, you can see a list of words for each position from L5 to R5, ordered by frequency (the most frequent word comes first).



Frequency counts are not displayed, but I might add the feature in the future.
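The Cooccurrence view can be thought of as the same positional counts, ranked within each position and shown without the numbers. Below is a minimal sketch under that reading; the data is made up, and breaking ties alphabetically is my assumption, not documented CasualConc behavior.

```python
from collections import Counter

# Hypothetical positional counts: position label -> Counter of context words.
by_position = {
    "L1": Counter({"second": 7, "first": 4, "foreign": 4}),
    "R1": Counter({"learning": 9, "acquisition": 6}),
}


def ranked_words(by_position):
    """For each position, list words from most to least frequent.

    Ties are broken alphabetically (an assumption for this sketch).
    """
    return {
        pos: [w for w, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
        for pos, counts in by_position.items()
    }


columns = ranked_words(by_position)
print(columns["L1"])  # ['second', 'first', 'foreign']
print(columns["R1"])  # ['learning', 'acquisition']
```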


Additional features

Calculating collocation statistics

If you have created a word list in Word Count from the same set of files, corpus, or database file, you can calculate collocation statistics. Go to Menu -> Stats -> Calculate Collocation Stats. You can choose one of the statistics on the list.


The calculated scores will appear in the column next to the context words column. Because collocations with words that appear only once in the context do not provide much information, you might want to specify a minimum frequency. You might also want to (re)sort the results by Stats. The collocation statistics calculated here are all based on frequencies within five words to the right and left of the keyword.
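As an illustration of what such a score involves, here is a minimal sketch of one widely used collocation statistic, Mutual Information (MI), computed as log2 of observed over expected co-occurrence. The text above does not say which formulas CasualConc uses, so the function names, the formula variant, and the window of 10 (five words on each side) are assumptions.

```python
import math


def expected_cooccurrence(node_freq, collocate_freq, corpus_size, window=10):
    """Expected co-occurrences if the collocate were spread evenly over the corpus."""
    return node_freq * collocate_freq * window / corpus_size


def mutual_information(observed, node_freq, collocate_freq, corpus_size, window=10):
    """MI score: log2 of observed over expected co-occurrence frequency."""
    expected = expected_cooccurrence(node_freq, collocate_freq, corpus_size, window)
    return math.log2(observed / expected)


# Made-up numbers for illustration only.
score = mutual_information(
    observed=40, node_freq=500, collocate_freq=200, corpus_size=1_000_000
)
print(round(score, 2))  # 5.32
```

With a statistic of this kind, a rare word that co-occurs with the keyword just once can receive a very high score, which is one reason to set a minimum frequency before calculating.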



You can also enter the numbers manually and calculate collocation statistics for a particular context word with the Collocation Stats Calculator. Go to Menu -> Stats -> Collocation Stats Calculator. These statistics are mostly based on the formulas on the BNC Corpus site.


If a word list has already been created, the total number of words in the corpus is filled in automatically. You need to enter the other values manually, though CasualConc can help you find the frequency of the node word or context word: enter the word in the box and click the NW Search (node word) or CW Search (context word) button. The number is taken from the word list in Word Count. The span of the context words can also be specified here. The calculator can now include Fisher's Exact Test as well; to enable it, go to Preferences -> Others and check Include Fisher's Exact Test.


You can also calculate collocation statistics for a selected word in the table. First select a word in the table by clicking its row. Then go to Menu -> Stats -> Calculate Collocation Stats of Selected Items.


A panel appears with the calculated statistics. This function is designed to work with multiple context words, but its accuracy has not been verified. If you are able to check the results, I'd like to know whether it works correctly. This also optionally includes Fisher's Exact Test.



CasualConc also has a Contingency Table Calculator.  Go to Menu -> Stats -> Contingency Table Calculator.



Type values in the four boxes and click Calculate.  To include Fisher's Exact Test results, turn it on in Preferences -> Others.
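For readers who want to cross-check a result, here is a minimal sketch of a two-sided Fisher's Exact Test on a 2x2 table, using only the Python standard library. The cell layout `[[a, b], [c, d]]` and the two-sided definition (summing all tables no more probable than the observed one) are assumptions; the page does not say how CasualConc labels the four boxes or which variant it reports.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact Test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every possible table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


p = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p, 4))  # 0.035
```

This direct summation is fine for modest counts; with corpus-sized totals the binomial coefficients become very large and a dedicated statistics library is a better choice.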


Treat keywords as one word

This experimental option is mainly intended to adjust collocation statistics for a multi-word search, such as language/languages. By default, CasualConc treats the collocations of each keyword separately. For example, second language and second languages are treated as two separate collocations. This option forces CasualConc to treat the keywords as one word, so that second language/languages is treated as a single collocation.

To activate this option, go to Preferences -> Others and check Treat Keywords as One Word.


The results should look like this:

Not Checked (default):


Checked:


As you can see, when the option is activated, the frequencies of the context words for both keywords are aggregated (although the two keywords are still listed separately).
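The aggregation can be pictured with a minimal sketch like the one below. The counts are invented, and the code only mirrors the behavior described here, not how CasualConc implements it.

```python
from collections import Counter

# Hypothetical per-keyword counts: context word -> total L5-R5 frequency.
per_keyword = {
    "language": Counter({"second": 12, "foreign": 5}),
    "languages": Counter({"second": 3, "foreign": 8, "many": 4}),
}


def aggregate(per_keyword):
    """Sum context-word frequencies over all keywords, as if they were one word."""
    total = Counter()
    for counts in per_keyword.values():
        total.update(counts)  # Counter.update adds counts rather than replacing
    return total


combined = aggregate(per_keyword)
print(combined["second"])   # 15
print(combined["foreign"])  # 13
print(combined["many"])     # 4
```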