CasualConc
is a set of tools for analyzing your own collection of texts. The text files need to be
on your local hard drive or on a network drive (slower). CasualConc
works best with plain text files (.txt). Other file formats (MS Word Doc, PDF, HTML, etc.) are supported, but they take more time to process, so you might want to convert them to plain text files before using them with CasualConc. CasualTextractor
(a utility program) has a function to extract text data from other file
formats and save it as plain text files. The
Concord was originally designed to work with single-byte character
languages with no accent marks. The current version is Unicode compatible and can handle most European languages (with appropriate
settings). Double-byte character languages (East Asian languages) are also supported, but processing is
much slower and the available functions are limited. Unfortunately, right-to-left languages are not supported.

The basic unit of analysis in CasualConc is
paragraphs, which are separated by line break characters (\n or \r\n). This means that context words, sort words, clusters, etc. are
handled within this unit. There is an option to force the analysis by
sentence or by file, but processing in these modes is much
slower. If you want to analyze your corpus by sentence, preprocess your
text files before the analysis (insert a line break character \n after
each sentence). If your texts have carriage returns in the middle of
sentences/paragraphs (like Brown Corpus or extracts from PDF files),
you need to delete them before using CasualConc or set the scope of
analysis to File. I want CasualConc to be able to handle such files in
the future (i.e. the current File scope mode plus deleting characters
from each line, etc.), but it cannot for now.

When you run CasualConc for the first time, it creates a folder named CasualConc in the ~/Library/Application Support folder, and in that folder a file named CasualConcData.ccdb. CasualConc stores the corpus/database information used in Advanced Corpus Handling Mode in this file. If you use Advanced Corpus Handling Mode, do not delete this file unless you have a problem. If you delete it, all the corpus/database information will be gone (the database files themselves will not be deleted).

If you want to completely delete CasualConc
from your HDD, delete CasualConc.app, CasualConcApp.plist in ~/Library/Preferences, and the
CasualConc folder in ~/Library/Application Support. If you don't
delete the latter two, they won't do any harm to your Mac (the files are just an
SQLite data file and a standard property list file).
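
The text preprocessing mentioned earlier (removing line breaks in the middle of paragraphs and putting each sentence on its own line) can be done outside CasualConc before loading the files. The following is only a rough Python sketch and not part of CasualConc: the function names are made up for illustration, it assumes that blank lines mark real paragraph boundaries, and its sentence rule (., ! or ? followed by whitespace and a capital letter or quotation mark) is naive, so it will wrongly split after abbreviations such as "Dr." and should be adjusted for your corpus.

```python
import re

def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines so that each paragraph ends up on one line.

    Assumption: blank lines separate paragraphs in the input.
    """
    # Normalize Windows (\r\n) and old Mac (\r) line endings to \n.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    paragraphs = re.split(r"\n\s*\n", text)
    # Collapse every run of whitespace (including line breaks) to one space.
    joined = [" ".join(p.split()) for p in paragraphs]
    return "\n".join(p for p in joined if p)

# Naive sentence boundary: ., ! or ? followed by whitespace and then
# an uppercase ASCII letter or a quotation mark.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-Z\"'])")

def one_sentence_per_line(text: str) -> str:
    """Return the text with a \\n inserted after each detected sentence."""
    lines = []
    for paragraph in unwrap_paragraphs(text).split("\n"):
        lines.extend(_SENTENCE_END.split(paragraph))
    return "\n".join(lines)

if __name__ == "__main__":
    sample = "The cat sat on\nthe mat. It was\nhappy.\n\nA new paragraph starts here."
    print(one_sentence_per_line(sample))
```

Use unwrap_paragraphs alone if you only want to remove line breaks inside paragraphs and keep the paragraph as the unit of analysis. Save the result as a plain text file (.txt) and keep the original files, since the sentence splitting cannot be undone reliably.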