To create a word list, click the Word Count tab in the window.
Basic functions
Creating a word/n-gram list
Select the type of list to create and click the Count button.
With Word Count, you can create a word list or a 2- to 5-gram list. The scope of n-gram lists is the same as the Scope of Context in the Preferences. If the scope is Paragraph, n-grams are counted within a paragraph, across sentence boundaries. For example, "the Preferences. If the" is counted as one 4-gram, "the preferences if the", in Paragraph mode.
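As a rough illustration of the scope rule (not CasualConc's actual code), the sketch below counts 4-grams once with the whole paragraph as the unit and once with each sentence as the unit. The tokenizer, the sentence splitter, and the sentence-level scope used for contrast are all simplifying assumptions.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumed, simplified tokenizer: lowercase, keep letters, digits, apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

def count_ngrams(units, n):
    # Count n-grams inside each unit; an n-gram never spans two units.
    counts = Counter()
    for unit in units:
        tokens = tokenize(unit)
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts

paragraph = "Open the Preferences. If the scope is Paragraph, counting crosses sentences."

# Paragraph scope: the whole paragraph is one unit.
paragraph_counts = count_ngrams([paragraph], 4)

# Hypothetical sentence scope: split on sentence-final punctuation first.
sentences = re.split(r"(?<=[.!?])\s+", paragraph)
sentence_counts = count_ngrams(sentences, 4)
```

With the paragraph as the unit, "the preferences if the" is counted; with sentences as the unit it is not, because the first sentence ends after "Preferences".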
You can have two lists in one window in any combination: two word lists from different corpora/database files, a word list and an n-gram list, or two different n-gram lists.
Word List
A word list contains each word's frequency, the number of files in which the word appears, and the word's proportion of the corpus.
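The three columns can be sketched as follows. This is only an illustration: the input format (a dict of file name to token list) is hypothetical, and the proportion is shown as a fraction of all tokens, which may differ from how CasualConc displays it.

```python
from collections import Counter

def build_word_list(files):
    # files: dict mapping a file name to its list of tokens (hypothetical format).
    freq = Counter()
    file_count = Counter()
    for tokens in files.values():
        freq.update(tokens)
        file_count.update(set(tokens))  # each file counts once per word
    total = sum(freq.values())
    # word -> (frequency, number of files, proportion of all tokens)
    return {w: (freq[w], file_count[w], freq[w] / total) for w in freq}

corpus = {
    "a.txt": ["the", "cat", "saw", "the", "dog"],
    "b.txt": ["the", "dog", "ran"],
}
word_list = build_word_list(corpus)
```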
n-gram list
An n-gram list contains only frequency because n-gram lists consume more memory. If I find a way to add other information without taxing processing time and memory, I will add it.
Setting minimum frequency
Just as you can with Cluster and Collocation, you can set a minimum frequency for words/n-grams to be displayed in the table. Go to Preferences -> Others. A minimum of at least 2 is recommended for n-gram searches because n-grams with a frequency of one do not give you any information but consume more processing time and memory.
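The effect of the setting on what is displayed can be sketched like this. The function name and the sample counts are made up for illustration, and the sketch filters after counting, so it does not model the time and memory savings inside CasualConc.

```python
from collections import Counter

def apply_min_frequency(counts, min_freq=2):
    # Keep only the items that occur at least min_freq times.
    return {item: n for item, n in counts.items() if n >= min_freq}

bigrams = Counter({"of the": 5, "in the": 3, "the scope": 1, "scope is": 1})
filtered = apply_min_frequency(bigrams, min_freq=2)
```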
Sorting word/n-gram list
You can sort or re-sort the list.
- Frequency - most frequent to least frequent
- Alphabetical - a to z
- Reverse Alphabetical - a to z from the last letter of each word/n-gram
- Word Length - number of letters in a word/n-gram from longest to shortest
- Reverse Word Length - reverse order of Word Length
- Stats - highest to lowest
- Stats Reverse - lowest to highest
The Stats and Stats Reverse options are available only after you calculate keyword statistics.
Here is a sample of Reverse Alphabetical.
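The sort orders that do not depend on keyword statistics can be sketched as sort keys. This is an illustration only: the tie-breaking rules and the choice to count every character (including spaces in an n-gram) toward word length are assumptions, not documented behavior.

```python
def sort_rows(rows, order):
    # rows: list of (item, frequency) tuples.
    keys = {
        "Frequency": lambda r: (-r[1], r[0]),            # most to least frequent
        "Alphabetical": lambda r: r[0],                  # a to z
        "Reverse Alphabetical": lambda r: r[0][::-1],    # a to z from the last letter
        "Word Length": lambda r: (-len(r[0]), r[0]),     # longest to shortest
        "Reverse Word Length": lambda r: (len(r[0]), r[0]),
    }
    return sorted(rows, key=keys[order])

rows = [("running", 4), ("sing", 9), ("cat", 7), ("ring", 2)]
by_reverse_alpha = [item for item, _ in sort_rows(rows, "Reverse Alphabetical")]
```

Reverse Alphabetical groups items that share an ending, so "running", "ring", and "sing" sort next to each other ahead of "cat".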
Additional features
Calculating keyword statistics
If you have two word lists from two different corpora, you can calculate keyword statistics. First, create two word lists. You can do this by switching corpora and running Word Count on both tables, or you can run Word Count on the right table and import a saved word list into the left table.
Then go to Menu -> Stats -> Calculate Keyword Stats and select either Log-Likelihood or Chi-Square. Log-Likelihood is calculated with the formula on the UCREL site. Chi-square is calculated with the formula in Oakes (1998, p. 25).
The calculated statistic will appear in the Stats column. You might want to sort the results by Stats or Stats Reverse.
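For readers who want to check a value by hand, the sketch below computes both statistics for one word. The log-likelihood follows the two-corpus formula as commonly given on the UCREL site; the chi-square is the plain Pearson statistic for a 2x2 table without continuity correction, which is an assumption and may not match the exact variant CasualConc uses. The function and parameter names are illustrative.

```python
import math

def log_likelihood(a, b, c, d):
    # a, b: frequency of the word in corpus 1 and corpus 2
    # c, d: total number of words in corpus 1 and corpus 2
    e1 = c * (a + b) / (c + d)  # expected frequency in corpus 1
    e2 = d * (a + b) / (c + d)  # expected frequency in corpus 2
    ll = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:  # a zero count contributes nothing
            ll += observed * math.log(observed / expected)
    return 2 * ll

def chi_square(a, b, c, d):
    # Pearson chi-square for the table [[a, b], [c - a, d - b]],
    # assumed here without continuity correction.
    n = c + d
    numerator = n * (a * (d - b) - b * (c - a)) ** 2
    denominator = (a + b) * (n - a - b) * c * d
    return numerator / denominator if denominator else 0.0

# A word occurring 20 times in a 1,000-word corpus and 10 times in another.
ll = log_likelihood(20, 10, 1000, 1000)
chi = chi_square(20, 10, 1000, 1000)
```

Both statistics are zero when the word has the same relative frequency in the two corpora and grow as the relative frequencies diverge.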
The statistic for words that tend to appear more frequently in the corpus on the left is displayed in black, and the statistic for words that tend to appear more frequently in the corpus on the right is displayed in red.
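One plausible way to decide the direction is to compare relative frequencies, as in the sketch below. This is an assumption about how "tend to appear more frequently" is determined; the function name and the handling of ties (shown as black) are hypothetical.

```python
def keyword_colour(freq_left, total_left, freq_right, total_right):
    # Compare relative frequencies (occurrences per token) in the two corpora.
    left_rate = freq_left / total_left
    right_rate = freq_right / total_right
    # Ties are treated as "left" here; that choice is arbitrary.
    return "black" if left_rate >= right_rate else "red"

colour_a = keyword_colour(20, 1000, 10, 1000)  # more frequent on the left
colour_b = keyword_colour(5, 1000, 30, 2000)   # more frequent on the right
```

Comparing rates rather than raw counts matters when the corpora differ in size: in the second call the right corpus is twice as large, but the word is still three times as frequent there per token.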
Searching for a word/n-gram in the list
You can search for a particular word or n-gram in a table. Simply enter the word/n-gram you want to find and click Search.
If CasualConc finds the word/n-gram you are looking for, it will be highlighted.