Alfresco includes integrated scanning and OCR:
It can be used to implement an end-to-end solution: collecting paper documents and forms, transforming them into accurate, retrievable information, and delivering that information to an organization's business applications. For example:
Automated forms processing with ICR (Intelligent Character Recognition) captures data from forms that are filled in manually using handwriting, machine print, and checkboxes.
Alfresco supports a content transformation framework into which you can plug a third-party content transformation engine to convert a document from one format to another.
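
The idea of a pluggable transformation framework can be sketched roughly as follows. This is a generic illustration only, not Alfresco's actual API: the names TransformerRegistry, register, transform, and fake_ocr_engine are all hypothetical, and the "engine" is a stand-in for a real third-party OCR product. It assumes engines are looked up by a (source MIME type, target MIME type) pair.

```python
from typing import Callable, Dict, Tuple

# Hypothetical names throughout; this is NOT Alfresco's API.
# An "engine" is any callable that converts content bytes to content bytes.
Transformer = Callable[[bytes], bytes]


class TransformerRegistry:
    """Maps (source MIME type, target MIME type) to a pluggable engine."""

    def __init__(self) -> None:
        self._engines: Dict[Tuple[str, str], Transformer] = {}

    def register(self, source: str, target: str, engine: Transformer) -> None:
        # Plug in an engine for one source -> target conversion.
        self._engines[(source, target)] = engine

    def transform(self, content: bytes, source: str, target: str) -> bytes:
        # Look up the engine for this conversion and run it.
        try:
            engine = self._engines[(source, target)]
        except KeyError:
            raise LookupError(
                f"no engine registered for {source} -> {target}"
            ) from None
        return engine(content)


def fake_ocr_engine(image_bytes: bytes) -> bytes:
    # Stand-in for a third-party OCR engine (assumption for illustration);
    # a real engine would return the recognized text of the scanned image.
    return b"recognized text (" + str(len(image_bytes)).encode() + b" bytes scanned)"


registry = TransformerRegistry()
registry.register("image/tiff", "text/plain", fake_ocr_engine)

# Convert a (dummy) scanned TIFF into plain text via the plugged-in engine.
result = registry.transform(b"\x49\x49\x2a\x00", "image/tiff", "text/plain")
print(result.decode())
```

Keying the registry on the format pair keeps the framework itself unaware of any particular engine, so an OCR engine can be swapped out without touching the calling code.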