CasualConc

© 2008-2009 Yasu Imao

Known Issues

There are some known issues (including bugs and technical limitations) for CasualConc.

General

  • The end-of-sentence character in Scope of Context can only be a single character (e.g. a period or another specific character) because of the current implementation of the sentence separator.
  • CasualConc cannot pick up certain strings of repeated word(s).  For example, if the search string is "a a" and the text includes "a a a", CasualConc returns only one instance of "a a" instead of two: "a a" (first, second) and "a a" (second, third).  If you want both instances of "a a" in Concord in this case, search for "a" with "a" as a context word.  For Cluster/Collocation, I can't think of any easy workaround.
  • Any search that returns too many hits (more than 10,000) may make CasualConc use a lot of memory (and take a lot of time).  If your computer slows down after the results are displayed, please save the results, quit CasualConc, and restart it.
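
The repeated-word limitation above is typical of non-overlapping pattern matching. The following Python sketch is not CasualConc's code; it only illustrates the behavior with the standard re module, and both function names are made up for this example.

```python
import re

def count_non_overlapping(phrase, text):
    """Count matches found by a plain left-to-right scan (matches cannot share words)."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text))

def count_overlapping(phrase, text):
    """Count every start position of the phrase, using a zero-width lookahead
    so that one match does not consume the words needed by the next."""
    pattern = r"\b(?=" + re.escape(phrase) + r"\b)"
    return len(re.findall(pattern, text))

print(count_non_overlapping("a a", "a a a"))  # 1
print(count_overlapping("a a", "a a a"))      # 2
```

The first function reports one hit for "a a" in "a a a", and the second reports both the (first, second) and (second, third) instances.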

Concord

  • In the context view (the lower text box), the search word might not be colored properly.  This typically happens with .doc, .pdf, or other files that contain special multi-byte symbol characters.
  • The KWIC output does not align if the text contains combining characters (i.e. two letters that are combined into one character on the screen).  I don't know if there is an easy way to fix this because I'm not familiar with the languages that use them.
  • If the concordance line you select to view in the context view has identical concordance lines in the same file, CasualConc cannot determine which one is selected (except when 'File' is selected as Scope of Context).  So if your corpus is a spoken one, frequent phrases such as "I see." would cause a problem.  This is less likely with written corpora.  I will try to fix this in the future.
  • The function to open concordance results in a new window is experimental and might not work properly.

Cluster

  • Keyword coloring does not work if you enable lemmatization.

Collocation/Cooccurrence 

  • If you search for collocations of multiple words and then search for concordances of multiple keywords, the results in the concordance table will not be accurate.  This is due to a limitation of Concord: it cannot distinguish between different keyword-context word combinations.  So if you have keywords A and B and context words C and D, and you select the A-C and B-D combinations in Collocation and search for concordances, you will also get the A-D and B-C combinations.
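
As a rough illustration of the A/B and C/D case above, the Python sketch below assumes that Concord pools the selected keywords and the selected context words separately and then matches every keyword against every context word. That assumption is only inferred from the description; the function name is hypothetical and this is not CasualConc's code.

```python
from itertools import product

def combinations_returned(selected_pairs):
    """Return every keyword-context pair that a search over the pooled
    keywords and pooled context words can match (assumed behavior)."""
    keywords = sorted({keyword for keyword, _ in selected_pairs})
    contexts = sorted({context for _, context in selected_pairs})
    return list(product(keywords, contexts))

selected = [("A", "C"), ("B", "D")]
returned = combinations_returned(selected)
# Pairs that show up in the results although they were not selected.
extra = [pair for pair in returned if pair not in selected]
print(extra)  # [('A', 'D'), ('B', 'C')]
```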

Word Count

  • n-gram search requires a large amount of memory and processing time.  If your corpus is large, you might want to save the result and quit CasualConc after each n-gram search.

Statistical Analysis

  • Statistical analysis tools (keyness/collocation) are implemented experimentally at this moment.  I haven't checked the accuracy of the results thoroughly, and I would appreciate it if someone could test the functions and their results.  Except for Fisher's Exact Test, the stats calculations seem to be accurate (Thanks, Sebastian!).  Fisher's Exact Test is not fully tested, though I checked the results against some online calculators and it seems to be working.

 

Misc

  • If you created corpus database files (used in Database Mode) with a version of CasualConc prior to 0.9.9.1, you need to update the files before you can add more text files to them.  Please use Database Updater (included in the dmg) to update the database files.
  • If you are upgrading from 0.9.8 or earlier, your user settings will be lost.  If you want to keep your settings, rename com.apple.rubycocoa.CasualConcApp.plist in your home -> Library -> Preferences folder to CasualConcApp.plist.  Your Corpus/Database settings in Advanced Mode should be safe without doing this.
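
The settings file can be renamed in Finder. For those who prefer a script, here is a minimal Python sketch of the same step. The function name is hypothetical, and refusing to overwrite an existing CasualConcApp.plist is a cautious choice made for this example, not something CasualConc requires.

```python
from pathlib import Path

OLD_NAME = "com.apple.rubycocoa.CasualConcApp.plist"
NEW_NAME = "CasualConcApp.plist"

def rename_settings(prefs_dir):
    """Rename the old settings file inside prefs_dir.

    Returns True if the file was renamed, and False if the old file is
    missing or a file with the new name already exists (nothing is overwritten).
    """
    old_path = Path(prefs_dir) / OLD_NAME
    new_path = Path(prefs_dir) / NEW_NAME
    if not old_path.exists() or new_path.exists():
        return False
    old_path.rename(new_path)
    return True
```

Calling rename_settings(Path.home() / "Library" / "Preferences") would apply it to the real preferences folder. It is probably safest to do this while CasualConc is not running.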