CasualConc

© 2008-2009 Yasu Imao

How to Use


Here is the manual and step-by-step guide to CasualConc.


Managing Corpus Files
    1. File Mode - Simple
      This page explains how to handle corpus files in CasualConc.

    2. File Mode - Advanced
      In advanced mode, you can manage corpus files in groups.

    3. Database Mode - Simple
      You can create a SQLite (database) file for faster search.  This is suitable for repeated searches.

    4. Database Mode - Advanced
      You can manage database files in advanced mode.

    5. XML Information Tag Handling
      This is still an experimental feature.  If your corpus files contain XML information tags, you can use them to pre-select files for your search.

Tools
    1. Concordancing (Concord)
      This is the main feature of CasualConc.  You can create KWIC (keyword in context) concordance lines and sort them by context words.

    2. Searching Word Clusters (Cluster)
      You can search for word clusters using search word(s).

    3. Finding Collocation Information (Collocation/Cooccurrence)
      This tool provides information about words that co-occur in context with your search word(s).  You can calculate simple collocation statistics.

    4. Exploring Word Lists (Word Count)
      You can create a simple word list and n-gram lists.  Keyword statistics are available for word lists.

    5. Getting File Information (File Information)
      This provides basic information about corpus files, such as the number of types and tokens, the type-token ratio, and the number of n-letter words.
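
As an illustration of what the Concord and File Information tools report, here is a minimal Python sketch of KWIC concordance lines sorted by context word, together with a type-token ratio.  It is not CasualConc's own code: the function names, the lowercase regex tokenizer, and the sample text are all assumptions made for this example.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())

def kwic(tokens, keyword, span=3):
    """Return (left context, keyword, right context) for each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            lines.append((left, tok, right))
    return lines

def sort_by_right_context(lines):
    """Sort concordance lines by the first word to the right of the keyword."""
    return sorted(lines, key=lambda line: line[2][:1])

def type_token_ratio(tokens):
    """Number of distinct words (types) divided by number of words (tokens)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

text = "The cat sat on the mat. The dog saw the cat on the rug."
tokens = tokenize(text)

# Print each concordance line with the keyword aligned in the middle.
for left, key, right in sort_by_right_context(kwic(tokens, "cat", span=2)):
    print(f"{' '.join(left):>12}  [{key}]  {' '.join(right)}")

print("type-token ratio:", round(type_token_ratio(tokens), 3))
```

Sorting on the first right-hand context word groups lines that share the same following word, which is what makes recurring patterns around a search word easy to spot.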

Preferences
  1. General
    You can set search word/context word settings and switch between simple and advanced corpus handling modes.

  2. Files
    You can set file types and text encodings, file viewing/editing settings, and default folders.

  3. Tags
    You can specify tagged areas to be excluded from the analysis.

  4. Lemma
    You can specify a lemma file for lemmatization and a keyword grouping file.

  5. Concord
    You can specify some features of Concordance, such as language settings.

  6. Others
    You can specify settings of other tools (currently only one setting for collocation is under Others).

  7. XML
    You can specify the types of XML information tags and the tag titles used to filter files for the analysis.

Common Features
  1. Wildcard Search
    You can use wildcard characters when searching for KWIC concordance lines, word clusters, and collocation information.

  2. Other common features
    • Saving/Exporting/Opening results
      You can save the results shown in a table to a CasualConc format file, open them again later, and export results in CSV format.
    • Search in Concord
      You can search for a word/cluster/collocation listed in a table using Concord.
    • Moving results to the other table
      You can move results between tables in Cluster and Word Count.
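
To give a rough idea of how a wildcard search can work, the following Python sketch converts a wildcard pattern into a regular expression and matches it against whole words.  The wildcard syntax here is an assumption made for illustration (`*` for any run of word characters, `?` for exactly one); CasualConc's actual wildcard characters are described on the Wildcard Search page and may differ.  The function names are hypothetical.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regular expression.

    Assumed syntax (not taken from CasualConc): '*' matches zero or more
    word characters, '?' matches exactly one word character.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            # Escape everything else so it is matched literally.
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def match_words(pattern, words):
    """Return the words that match the wildcard pattern in full."""
    rx = wildcard_to_regex(pattern)
    return [w for w in words if rx.fullmatch(w)]

words = ["analyze", "analysis", "analyst", "banal", "analyzed"]
print(match_words("analy*", words))   # words starting with "analy"
print(match_words("analy?e", words))  # one character between "analy" and "e"
```

Matching against whole words with `fullmatch` keeps a pattern such as `analy*` from also hitting words that merely contain the string somewhere in the middle.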

Miscellaneous Features
  1. Lemma Handling
    You can apply lemmatization to results.  You need to prepare a lemma file (such as Prof. Someya's e-lemma) to use this feature.

  2. Keyword Grouping
    You can create groups of words to search for together in Concord/Cluster/Collocation.

  3. East Asian Language Support
    You can analyze East Asian Languages (2-byte character languages) with some limitations.

  4. Concordance Plot