Scanning and Translating

So you have a German or French article and don't want to slog through hours of laborious translating. Here is how to move from the printed page to a (very) rough translation.

There are three basic steps, which are detailed below. I am doing this from memory, so if I miss something feel free to edit this page. It looks long and difficult, but after you have done it once it is pretty intuitive.

    1. Scan the document.
      • Use the scanner on the main level in Buswell.
      • Bring a USB memory device and plug it in.
      • Instead of choosing your email address choose Store Copy and then Memory Device. Then press OK.
      • For best results, set the scan resolution to 600 DPI.
      • Scan your document.
      • When it is done scanning press Exit. Remove the memory device when the scanner says it can be removed.
    2. Run Optical Character Recognition (OCR) on the document.
      • Go to one of the Windows computers in the back of the computer lab. They are situated behind the three Macs with Accordance.
      • Log in, plug in your memory device, and open the PDF file you just scanned using Adobe Acrobat Pro.
      • If you need to, rotate all the pages in the document counterclockwise 90 degrees from the Document menu.
      • Optional (but it saves a lot of effort): if each scanned page holds two columns (or two facing pages, as when you scan a journal or book), the OCR will come out better if you separate them so that each PDF page holds only one.
        • Export the file to TIFF image files. You might export them to a new folder, such as "temp1". Each page becomes one image, named by page number.
        • Copy all the image files and paste them into a new folder, such as "temp2". At this point you should have two identical sets of image files. To make your life easier at this stage, you might put this folder in the root of the computer's C: drive or in the root of your memory device. If you put it on the computer, just be sure to remove the folder before you log out.
        • Click Start, then Run..., type "cmd", and click OK. Don't freak out about using the command line; you will be okay.
        • The prompt opens at C:\. If you named your image folder "temp2", type "cd temp2" and hit Enter. Note: if you put the folder on your memory device, it is probably on the F: drive. In that case type "F:" and hit Enter first, then follow the previous directions.
        • Type "ren *.TIFF *.1.TIFF" and hit Enter. This renames (thus "ren") every file with the TIFF extension, so that page1.TIFF becomes page1.1.TIFF, for example.
        • Close the command prompt, then copy the renamed files from your "temp2" folder and paste them into "temp1", the folder you exported to from Adobe Acrobat Pro.
        • Sort the files by name by clicking the "Name" column header, and make sure they are in the right order (something like page1.1.TIFF, page1.TIFF, etc.).
        • Select all the TIFF files, right-click, and choose Combine Supported Files (or something like that). It is a command that Adobe Acrobat Pro adds to the right-click menu.
        • Open the resulting PDF file and check to make sure that you have two page 1's, two page 2's, etc.
        • From the Document menu in Adobe Acrobat Pro, crop the right side off the odd pages and the left side off the even pages. Be sure to click the "All" option so that the whole document is cropped. Now you should have one column/page from the scanned document per page in the PDF. Check to make sure that everything is there.
      • From the Document menu select OCR Text Recognition > Recognize Text Using OCR and select for all pages.
      • If it is a long document, check your favorite news website while you wait, because the OCR will take some time.
      • When the OCR is finished you should be able to select text in your document. Be sure to save this document.
      • From the File menu export the PDF file to a Word document.
    3. Run the document through Google Translate.
      • There are several ways to use Google Translate, and you might choose different ways for different documents.
      • Method 1:
        • Save your Word file as a text file.
        • Navigate to http://translate.google.com/ and choose Translate a Document. Choose your file, set the From language and the To language (likely English), and click Translate.
        • You can edit the resulting web page or copy and paste it into a new document on your computer.
      • Method 2:
        • Use the Google Translator Toolkit (http://translate.google.com/toolkit).
        • Upload your Word document, name your resulting translated file, select the proper languages for Translate from and Translate to, and click Upload for translation.
        • Modify your translation and download it if you like.
      • Method 3:
        • Upload your Word file to Google Docs.
        • Open the uploaded document and click Tools and Translate Document... Then choose your target language (likely English) and click OK.
        • Edit the resulting translated file and download it if you wish.
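
If the command-line renaming in step 2 is not your thing, the copy-and-rename trick (two copies of every page image, the second one named like page1.1.TIFF) can also be sketched in a few lines of Python. This is only an illustration, not part of the original procedure: it assumes Python is installed (the lab computers may not have it), and the function name duplicate_pages and the folder path in the usage line are made up for the example.

```python
import shutil
from pathlib import Path


def duplicate_pages(folder, ext=".TIFF", marker=".1"):
    """Give every page image in `folder` a twin named <name><marker><ext>.

    Hypothetical helper: it does the job of the "temp2" copy plus the
    "ren *.TIFF *.1.TIFF" command, but inside a single folder.
    """
    folder = Path(folder)

    def images():
        # Every file in the folder with the wanted extension (case-insensitive).
        return sorted(p for p in folder.iterdir()
                      if p.is_file() and p.suffix.lower() == ext.lower())

    # List the originals before copying, and skip names that already end in
    # the marker, so running the function twice creates no extra copies.
    originals = [p for p in images() if not p.stem.endswith(marker)]
    for page in originals:
        twin = page.with_name(page.stem + marker + page.suffix)
        if not twin.exists():
            shutil.copy2(page, twin)

    # Return the file names in plain alphabetical order for a quick check.
    return [p.name for p in images()]
```

You would call it on the folder you exported to, for example duplicate_pages(r"C:\temp1"), and then carry on with the Combine Supported Files step. Note that the returned list is sorted as plain text, which can differ from the order Windows shows once you have ten or more pages, so still check the order in the folder window.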