Old OCRopus Wiki

Languages

Through the Tesseract plugin, OCRopus currently supports all the languages and scripts that Tesseract supports; its native recognizers support Latin script and English. Some of the additional languages and scripts that people have expressed interest in are listed below.

If you're interested in helping with creating support for a new script or language, please do the following:
  • First, sign up under OCRopus Contributors on the left, then send an email to the Gmail account tmbdev+ocropus to get access to this wiki.
  • Start collecting resources about the language, such as existing commercial and open source OCR systems and collections of scanned documents, and add them to the OCR Resources page.
  • When there is sufficient interest to start a project, create a wiki page for it, following the New Language Template.
Here's a list of languages people have been interested in and created pages for: