You can open a single PDF file and extract embedded text from it.
Open a PDF file
Go to Menu -> File -> Open... and select the PDF file you want to open. You can also press Command + O.
When the selected file opens, its embedded text is extracted and appears in the text area on the right.
If you check Keep info, the font information of the PDF text is preserved.
To extract the text from the original PDF file again while editing, click the Extract Again button.
Zoom in/out
You can zoom in or out on the PDF image by clicking the + or - button.
You can also use the context menu by right-clicking the PDF image.
Automatically Resize: resize a PDF image to fit width
Actual Size: resize a PDF image to the actual size
PDF text search
You can search for text in the displayed PDF file. Type the text in the box above the PDF image and click the Search button.
If the text is found in the PDF file, the match is highlighted and shown on the PDF image.
Click Search again to find the next match. Click Previous to find the previous match.
Search selected text on the PDF file in the extracted text
You can look up text selected on the PDF file in the extracted text.
Select text on the PDF file and click the Go to Selected button.
The matched text in the extracted text will be highlighted on the right.
Delete selected text on the PDF file from the extracted text
You can delete selected text on the PDF file from the extracted text.
Select text on the PDF file.
Click the Delete Selected button.
The selected text will be deleted from the extracted text and struck out on the PDF file.
Currently, you cannot undo strike-out lines once they are drawn.
Saving extracted text
You can save the extracted text as plain text (.txt) or rich text (.rtf).
Go to Menu -> File -> Save Text (Command + S).
Select the File Type and, if Plain Text is selected, the Encoding.
Saving PDF file
If you have used Delete Selected on the PDF file, you can save the file with the strike-out lines included.
Go to Menu -> File -> Save Content As... and select a folder to save the file in.
Formatting extracted text
You can apply the format settings specified in Preferences by clicking the Format Text button.
You can select which formatting options are applied in Preferences. If you check Apply Formatting when extracted, they are applied whenever you open a PDF file.
Replacing characters
You can replace specified characters in the extracted text by clicking the Replace Chars button.
To specify which characters are replaced, go to Menu -> Window -> Replace Char Panel.
The Replace Chars panel appears.
Click the + button and select the added line, then type the character(s) to be replaced in the left box and the replacement character(s) in the right box. To delete a line, select it and click the - button.
If these are applied to the sample text, the result will look like this (screenshots: Before, After).
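The effect of a replacement table can be sketched in Ruby as follows. This is only an illustration, not CasualTextractor's own code: the table entries (ligatures and a curly apostrophe) and the name replace_chars are hypothetical, and replacing everything in a single pass is an assumption about how the panel behaves.

```ruby
# Hypothetical replacement table: left box => right box.
REPLACEMENTS = {
  "\uFB01" => "fi",  # "fi" ligature
  "\uFB02" => "fl",  # "fl" ligature
  "\u2019" => "'"    # right single quotation mark
}.freeze

# Replace every listed character sequence in one pass over the text.
def replace_chars(text, table = REPLACEMENTS)
  return text if table.empty?
  pattern = Regexp.union(table.keys)  # matches any left-box entry literally
  text.gsub(pattern, table)           # looks each match up in the table
end

puts replace_chars("The \uFB01rst \uFB02oor")
```

Ligatures left over from PDF extraction are a typical use for such a table, since they otherwise show up later as misspelled words.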
Detecting misspelled words
CasualTextractor can detect misspelled words using OS X's built-in spelling dictionary. You may want to run this after replacing characters and processing split words.
Click the Misspells button.
The Misspelled Word List panel appears. All the words not in OS X's spelling dictionary are listed in the table. If no word appears in the table, click the Update button.
To replace a misspelled word, check the box next to it, then double-click its Correct Spell cell and type the correct spelling.
Once you are ready, click the Replace button to replace all the checked words with their correct spellings.
You can copy a misspelled word to its Correct Spell cell. Check the box next to the misspelled word you want to copy, select the line, and right-click. Then select Copy Misspelled Word.
The selected misspelled word is copied to the Correct Spell cell, where you can correct it.
Thanks to the built-in spell checker, if you select a misspelled word in the Correct Spell cell and right-click it, candidate spellings are shown.
You can add words to the spelling dictionary. First, select the words you want to add.
Then click the Add to Spell Dic button.
You can also select a word and right-click to add it to the spelling dictionary.
You can manage the spelling dictionary by clicking the Manage Spell Dic button.
The Spelling Dictionary panel appears. If no word appears, click the Read Dic button to display stored spellings (if there are any).
You can add a new spelling by clicking the Add button. To remove a spelling, select it and click the Remove button.
To save the changes you made to the list, click the Save Dic button. You can roll back to the saved list by clicking the Read Dic button.
Detecting split words
When you extract text from a PDF file, some words are split in the middle by hyphenation. CasualTextractor can detect these words.
Click the Search Split Words button.
The Split Words panel appears.
You can sort the list by clicking the Sort button. If the check box next to the Sort button is checked, the list is sorted by the second part of each split word (screenshots: Unchecked, Checked).
Check the split words you DO NOT want to process.
You have two processing choices.
Make a word - select this to remove hyphenation
Remove space - select this to remove space only (for originally hyphenated words)
This is the result of selecting Make a word. The checked split words remain as they were; select Remove space to process these.
After you run Replace Chars and process the split words, the misspelled word list of the same file looks like this. (Compare with the one above.)
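The two choices can be sketched in Ruby as follows. This is a minimal sketch, not the application's actual detection logic: it assumes a split word looks like letters, a hyphen, whitespace, then more letters, and the names SPLIT_WORD, find_split_words, make_a_word and remove_space are hypothetical. The skip argument stands in for the words checked in the panel.

```ruby
# Assumed shape of a split word: letters, hyphen, whitespace, letters.
SPLIT_WORD = /(\p{L}+)-\s+(\p{L}+)/

# List the [first part, second part] pairs found in the text.
def find_split_words(text)
  text.scan(SPLIT_WORD)
end

# "Make a word": drop the hyphen and the space, except for skipped pairs.
def make_a_word(text, skip: [])
  text.gsub(SPLIT_WORD) do
    whole, first, second = $&, $1, $2
    skip.include?([first, second]) ? whole : "#{first}#{second}"
  end
end

# "Remove space": keep the hyphen and drop only the whitespace.
def remove_space(text)
  text.gsub(SPLIT_WORD) { "#{$1}-#{$2}" }
end

text = "An exam- ple of a well- known prob-\nlem."
joined = make_a_word(text, skip: [["well", "known"]])
puts joined
puts remove_space(joined)
```

Running Make a word first and Remove space second mirrors the order described above: words that were hyphenated in the original are skipped in the first step and keep their hyphen in the second.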
Search in PDF
You can search the PDF file for text selected in the extracted text.
Select text in the right text area, right-click, and select Search in PDF.
This simply uses the search function (shown above), so you might need to search again to find the exact match.
Regular expression search/replace
In addition to the built-in Find panel, CasualTextractor has a regular expression find/replace function.
Regular Expression Find panel
To use the Regular Expression Find panel, go to Menu -> Edit -> Regular Expression Search (Command + Shift + F).
The panel appears.
Type any word or phrase and click the Search button to search for it in the text. CasualTextractor uses Ruby regular expressions.
If you enter text in the second text box, you can use the replace functions.
Replace - replace selected text (if the selected text matches the searched text)
Replace/Find - replace selected text and then search for the next match
Replace in Selected - replace all the matched text in the selected text
Replace all - replace all the matched text in the entire text
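Since the panel uses Ruby regular expressions, the difference between these functions can be illustrated in plain Ruby. This is a sketch under assumptions: the sample text, the pattern and the method names are hypothetical, and the \1 back-reference in the replacement is Ruby's own gsub syntax, which the panel's replace box may or may not accept.

```ruby
# Search / search again: offset of the next match at or after `from`, or nil.
def next_match(text, pattern, from = 0)
  text.index(pattern, from)
end

# Replace all: substitute every match in the entire text.
def replace_all(text, pattern, replacement)
  text.gsub(pattern, replacement)
end

# Replace in Selected: substitute matches only inside the selected range.
def replace_in_selection(text, selection, pattern, replacement)
  result = text.dup
  result[selection] = text[selection].gsub(pattern, replacement)
  result
end

text = "Fig. 1 shows x. Fig. 12 shows y."
pattern = /Fig\.\s+(\d+)/  # "Fig." followed by a figure number

first = next_match(text, pattern)
second = next_match(text, pattern, first + 1)
puts replace_all(text, pattern, 'Figure \1')
puts replace_in_selection(text, second..-1, pattern, 'Figure \1')
```

Here the selection covers only the second sentence, so the first "Fig." is left untouched by the last call.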
If you want to search for characters that have a special meaning in regular expressions, you can check how they need to be escaped. Click the Escape button.
A drawer appears. Type the search text, and the escaped version of the text appears in the bottom text box.
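In Ruby itself, the same kind of escaping is available through Regexp.escape. The sketch below only illustrates the idea; whether the drawer produces exactly the same output is an assumption, and escape_for_search is a hypothetical helper name.

```ruby
# Escape every character that is special in a Ruby regular expression,
# so the text can be searched for literally.
def escape_for_search(literal)
  Regexp.escape(literal)
end

literal = "(a+b)*2.5?"
escaped = escape_for_search(literal)
puts escaped

# The escaped pattern matches the original text exactly.
puts Regexp.new("\\A#{escaped}\\z").match?(literal)
```

Without escaping, characters such as ( ) + * . ? would be read as grouping, repetition or wildcard operators rather than as literal text.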
You can also batch replace.
Click Add to create a new entry. The left column is the search text (a regular expression) and the right column is the replacement text.
Click Replace to run the batch replace.
Click Clear to clear the table. To remove a single entry, select a line and click Remove.
You can import and export the list. Currently only tab-delimited text files are accepted (.txt, ASCII/UTF-8).
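A batch replace driven by a tab-delimited list can be sketched in Ruby as follows. The layout assumed here (one entry per line, search pattern, a tab, then the replacement) follows the column order of the table but is not confirmed for the import file; applying the entries in order from top to bottom is also an assumption, and parse_rules and batch_replace are hypothetical names.

```ruby
# Parse tab-delimited lines into [Regexp, replacement] pairs.
# Assumed layout per line: search pattern, TAB, replacement text.
def parse_rules(tsv)
  tsv.each_line.map(&:chomp).reject(&:empty?).map do |line|
    search, replacement = line.split("\t", 2)
    [Regexp.new(search), replacement.to_s]  # a missing replacement means "delete"
  end
end

# Apply each rule to the whole text, in the order given.
def batch_replace(text, rules)
  rules.reduce(text) { |acc, (pattern, replacement)| acc.gsub(pattern, replacement) }
end

rules = parse_rules("\\bteh\\b\tthe\ncolou?r\tcolor\n")
puts batch_replace("teh colour of teh sky", rules)
```

Because the rules run one after another, a later entry sees the output of the earlier ones, so the order of the lines in the file can change the result.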