CasualPConc is my first attempt at creating a parallel concordancer. Its feature set is very simple, and it can handle only two corpora at a time. This is mainly because I have little experience with parallel concordancers and little need for one myself. I started this project only because a couple of people asked whether I was going to make a parallel concordancer, so whether it proceeds depends on people's feedback. If enough people use it and give me feedback (or tell me what should be included or how things should be handled), I might spend more time developing it further. I got one request to add support for more than two corpora, and that's why I started the CasualMultiPconc project.
To use CasualPConc, you need corpus files that are aligned. This means you need a separate file for each of the two corpora, and the two matched files must be aligned with each other. "Aligned" here means that both files have the same number of paragraphs (stretches of text separated by a line break character) and that the paragraphs match one to one. In the current implementation, CasualPConc ignores blank lines, so as long as matched sentences/paragraphs appear in the same position (the nth sentence/paragraph in each file), the number of blank lines does not matter. CasualPConc can also read a single file that contains aligned text. This can be a CSV or tab-delimited file holding the two parallel corpora, or a file in which the two aligned sentences of each pair sit next to each other, in the same order throughout the file, and each sentence/paragraph pair is separated from the others by a blank line. Other formats may be supported upon request. You can also open unaligned files and check the alignment in CasualPConc.
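If you want to check your files before loading them, the alignment rule can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not CasualPConc's own code. The function names (nonblank_lines, pair_aligned, pairs_from_interleaved) are made up for this example, and the single-file reader assumes each pair occupies exactly two adjacent lines.

```python
def nonblank_lines(text):
    """Return the non-blank lines of a text, with surrounding whitespace removed."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def pair_aligned(source_text, target_text):
    """Pair the nth non-blank line of one text with the nth of the other.

    Blank lines are ignored, so only the number of non-blank lines must match.
    """
    source = nonblank_lines(source_text)
    target = nonblank_lines(target_text)
    if len(source) != len(target):
        raise ValueError(
            f"not aligned: {len(source)} vs {len(target)} paragraphs"
        )
    return list(zip(source, target))


def pairs_from_interleaved(text):
    """Read pairs from one text in which each pair is two adjacent lines
    and pairs are separated from each other by one or more blank lines.

    Assumption for this sketch: a block with any other number of lines
    is treated as an alignment error.
    """
    pairs = []
    block = []
    # The extra empty string flushes the last block at end of input.
    for line in text.splitlines() + [""]:
        if line.strip():
            block.append(line.strip())
        elif block:
            if len(block) != 2:
                raise ValueError(f"expected 2 lines in a pair, got {len(block)}")
            pairs.append((block[0], block[1]))
            block = []
    return pairs
```

For example, pair_aligned("Hello.\n\n\nThanks.\n", "Bonjour.\nMerci.\n\n") pairs "Hello." with "Bonjour." and "Thanks." with "Merci.", even though the two texts contain different numbers of blank lines.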
The supported file formats are plain text (.txt), Rich Text (.rtf), MS Word (.doc/.docx), and OpenOffice (.odt/.sxw). To be safe, however, plain text files (.txt) are recommended. For plain text files, various text encodings are supported, but UTF-8 is the default and the recommended choice.
Once you read your corpus files into CasualPConc, you will create a text database for analysis. You can also create a database by importing CSV/tab-delimited text files or text files in a certain format. The database file can be saved for future use.
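To give a rough idea of what building a database from tab-delimited text involves, here is a minimal Python sketch that loads sentence pairs into a SQLite table and runs a simple keyword search. The internal database format of CasualPConc is not described here, so everything in this example (SQLite itself, the table name pairs, the column names, and the functions build_database and search) is an assumption made purely for illustration.

```python
import csv
import io
import sqlite3


def build_database(tsv_text, path=":memory:"):
    """Load tab-delimited sentence pairs (source<TAB>target) into SQLite.

    Hypothetical schema for illustration only; pass a file path instead of
    ":memory:" to keep the database on disk for later use.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pairs ("
        "id INTEGER PRIMARY KEY, source TEXT, target TEXT)"
    )
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    # Skip rows that do not contain both a source and a target column.
    rows = [(row[0], row[1]) for row in reader if len(row) >= 2]
    conn.executemany("INSERT INTO pairs (source, target) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def search(conn, keyword):
    """Return (source, target) pairs whose source text contains keyword.

    Note: '%' and '_' in keyword act as LIKE wildcards in this sketch.
    """
    cursor = conn.execute(
        "SELECT source, target FROM pairs WHERE source LIKE ? ORDER BY id",
        (f"%{keyword}%",),
    )
    return cursor.fetchall()
```

With the input "The cat sleeps.\tLe chat dort.\nThe dog barks.\tLe chien aboie.\n", calling search(conn, "cat") returns only the first pair, which is the basic idea behind a parallel concordance line: a hit in one corpus shown together with its aligned counterpart.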