CasualConc

© 2008-2009 Yasu Imao

Exploring Word Lists

To create a word list, click the Word Count tab in the window.



Basic functions

Creating word/n-gram list

Select the type of list to create and click the Count button.


With Word Count, you can create a word list or a 2- to 5-gram list. The scope of n-gram lists is the same as the Scope of Context in the Preferences.  If the scope is Paragraph, n-grams are counted within a paragraph (across sentences).  For example, "the Preferences. If the" is counted as one 4-gram, "the preferences if the", in Paragraph mode.
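The effect of the scope on n-gram counting can be sketched in Python. This is only an illustration, not CasualConc's actual code: the tokenizer, the sentence-splitting rule, and the function names (tokenize, split_units, count_ngrams) are all assumptions made for the example.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumed tokenizer: lowercase, keep runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

def split_units(text, scope):
    # "paragraph": units are separated by blank lines.
    # "sentence": each paragraph is further split after ., !, or ? (a rough rule).
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    if scope == "paragraph":
        return paragraphs
    units = []
    for p in paragraphs:
        units.extend(s for s in re.split(r"(?<=[.!?])\s+", p) if s.strip())
    return units

def count_ngrams(text, n, scope="paragraph"):
    # n-grams never cross a unit boundary, so the scope decides what is counted.
    counts = Counter()
    for unit in split_units(text, scope):
        tokens = tokenize(unit)
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts

sample = "The scope is set in the Preferences.  If the scope is Paragraph, sentences are joined."
paragraph_counts = count_ngrams(sample, 4, "paragraph")
sentence_counts = count_ngrams(sample, 4, "sentence")
```

In this sketch, "the preferences if the" appears in paragraph_counts but not in sentence_counts, because the sentence scope stops n-grams at the period.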


You can display two lists side by side in one window: two word lists from different corpora/database files, a word list and an n-gram list, two different n-gram lists, or any combination of these.


Word List

A word list contains each word's frequency, the number of files in which the word appears, and the proportion of the word in the corpus.
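As a rough sketch of what these three columns mean, the following Python computes them for a tiny corpus. The input shape (a dict from file name to token list) and the name build_word_list are hypothetical, and the proportion is shown here as a fraction of all tokens; CasualConc may display it differently (for example, as a percentage).

```python
from collections import Counter

def build_word_list(files):
    # files: dict mapping a file name to its list of tokens (hypothetical input shape).
    freq = Counter()
    file_count = Counter()
    for tokens in files.values():
        freq.update(tokens)
        file_count.update(set(tokens))  # count each word at most once per file
    total = sum(freq.values())
    return {
        word: {"freq": f, "files": file_count[word], "proportion": f / total}
        for word, f in freq.items()
    }

corpus = {
    "a.txt": ["the", "cat", "saw", "the", "dog"],
    "b.txt": ["the", "dog", "ran"],
}
word_list = build_word_list(corpus)
```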


n-gram list

An n-gram list contains only frequency because n-gram lists consume more memory.  If I find a way to add other information without taxing processing time and memory, I will add it.



Setting minimum frequency

Just as you can with Cluster and Collocation, you can set a minimum frequency for the words/n-grams displayed in the table.  Go to Preferences -> Others.  Setting a minimum of at least 2 is recommended for n-gram searches because n-grams with a frequency of one give you little information but take more time to process and consume more memory.
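The idea of a minimum frequency can be illustrated with a small Python filter. This is a sketch only: it filters after counting, whereas CasualConc may apply the threshold at a different stage, and the name apply_min_frequency is made up for the example.

```python
from collections import Counter

def apply_min_frequency(counts, min_freq=2):
    # Keep only the items whose frequency is at least min_freq.
    return {item: n for item, n in counts.items() if n >= min_freq}

bigrams = Counter({"of the": 5, "in the": 3, "the cat": 1, "cat sat": 1})
filtered = apply_min_frequency(bigrams, min_freq=2)
```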


Sorting word/n-gram list

You can sort or re-sort the list in the following orders.


  • Frequency - most frequent to least frequent
  • Alphabetical - a to z
  • Reverse Alphabetical - a to z from the last letter of each word/n-gram
  • Word Length - number of letters in a word/n-gram from longest to shortest
  • Reverse Word Length - reverse order of Word Length
  • Stats - highest to lowest
  • Stats Reverse - lowest to highest

Stats and Stats Reverse are only available after you calculate keyword statistics.

Here's the sample of Reverse Alphabetical.
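The sort orders listed above can be approximated with ordinary sort keys. The sketch below is an assumption about how each order behaves, not CasualConc's implementation: ties are broken alphabetically, and word length is taken as the number of characters (which would include spaces in an n-gram).

```python
def sort_word_list(counts, order):
    # counts: dict mapping a word/n-gram to its frequency.
    if order == "frequency":
        key = lambda kv: (-kv[1], kv[0])        # most frequent first
    elif order == "alphabetical":
        key = lambda kv: kv[0]                  # a to z
    elif order == "reverse_alphabetical":
        key = lambda kv: kv[0][::-1]            # a to z from the last letter
    elif order == "word_length":
        key = lambda kv: (-len(kv[0]), kv[0])   # longest first
    elif order == "reverse_word_length":
        key = lambda kv: (len(kv[0]), kv[0])    # shortest first
    else:
        raise ValueError(f"unknown order: {order}")
    return [word for word, _ in sorted(counts.items(), key=key)]

counts = {"walking": 2, "the": 10, "king": 3, "bath": 1, "sing": 4}
by_ending = sort_word_list(counts, "reverse_alphabetical")
```

Reverse Alphabetical groups words that share an ending, so "king" and "walking" end up next to each other in by_ending.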



Additional features

Calculating keyword statistics

If you have two word lists from two different corpora, you can calculate keyword statistics.  First, create two word lists.  You can do this by switching corpora and running Word Count on both tables, or you can run Word Count on the right table and import a saved word list into the left table.


Then go to Menu -> Stats -> Calculate Keyword Stats and select either Log-Likelihood or Chi-Square.  Log-Likelihood is calculated based on the formula on the UCREL site.  Chi-Square is calculated based on the formula in Oakes (1998, p. 25).
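For readers who want to see the arithmetic, here is a minimal Python sketch of the two statistics. The log-likelihood follows the two-corpus formula as commonly given on the UCREL log-likelihood page. The chi-square shown is the standard Pearson statistic for a 2x2 table without continuity correction; that is an assumption, and the formula in Oakes (1998) that CasualConc follows may differ in detail. The function and parameter names are made up for the example.

```python
import math

def log_likelihood(a, b, c, d):
    # a, b: frequency of the word in corpus 1 and corpus 2
    # c, d: total number of words in corpus 1 and corpus 2
    e1 = c * (a + b) / (c + d)  # expected frequency in corpus 1
    e2 = d * (a + b) / (c + d)  # expected frequency in corpus 2
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def chi_square(a, b, c, d):
    # Pearson chi-square for the 2x2 table [[a, b], [c - a, d - b]],
    # with no continuity correction (an assumption, see the note above).
    n = c + d
    numerator = n * (a * (d - b) - b * (c - a)) ** 2
    denominator = (a + b) * (n - a - b) * c * d
    return numerator / denominator if denominator else 0.0

# A word occurring 20 times in a 1,000-word corpus and 10 times in another.
ll = log_likelihood(20, 10, 1000, 1000)
chi = chi_square(20, 10, 1000, 1000)
```

Both statistics are zero when the word has the same relative frequency in the two corpora and grow as the relative frequencies diverge.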


The calculated statistic will show up in the Stats column.  You might want to sort the results by Stats or Stats Reverse.


The statistics for words that tend to appear more frequently in the corpus on the left are displayed in black, and those for words that tend to appear more frequently in the corpus on the right are displayed in red.
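One way to decide the direction is to compare relative frequencies, as in the sketch below. This is an assumed rule for illustration: the function name is hypothetical, and how CasualConc treats a word with equal relative frequency in both corpora is not stated here (the sketch arbitrarily returns black).

```python
def keyness_color(freq_left, total_left, freq_right, total_right):
    # Compare relative frequencies so that corpus size does not matter.
    rel_left = freq_left / total_left
    rel_right = freq_right / total_right
    # Ties are shown as black here, which is an arbitrary choice.
    return "black" if rel_left >= rel_right else "red"

color = keyness_color(20, 1000, 10, 1000)
```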


Searching for a word/n-gram in the list

You can search for a particular word or n-gram in a table.  Simply enter the word/n-gram you want to find and click Search.


If CasualConc finds the word/n-gram you are looking for, it will be highlighted.



Import Word/n-gram List

You can import a word list into the left table.  Go to Menu -> File -> Import WordList.



You can also click the Import Icon.


You can choose File Format and File Type.  The encoding of the file should be ASCII or UTF-8.


File Format refers to the two formats that CasualConc can export.

You can import a CasualConc Word List CSV file or other simple word list files.

Simple Word List (w/ Title Line) has a title line at the top, followed by a word and its frequency on each line.  For example:

Word,Frequency
the,1394
a,1284
I,804


Simple Word List (w/o Title Line) contains only a word and its frequency on each line.  For example:

the,1394
a,1284
I,804


This allows you to use word lists created by other programs to calculate keyness with CasualConc.
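If you want to check a simple word list before importing it, or prepare one from another program, a few lines of Python can read both layouts. This is a minimal sketch using Python's standard csv module; the function name read_simple_word_list is made up for the example and has nothing to do with CasualConc's own importer.

```python
import csv
import io

def read_simple_word_list(text, has_title_line):
    # Each non-empty row is expected to be: word,frequency
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if has_title_line:
        rows = rows[1:]  # skip the "Word,Frequency" title line
    return {word: int(freq) for word, freq in rows}

with_title = "Word,Frequency\nthe,1394\na,1284\nI,804\n"
without_title = "the,1394\na,1284\nI,804\n"

list_a = read_simple_word_list(with_title, has_title_line=True)
list_b = read_simple_word_list(without_title, has_title_line=False)
```

To read from a file instead of a string, open it with encoding="utf-8", which also handles plain ASCII files, and pass its contents to the function.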