Please contact Atalasoft if you are interested in acquiring an interface for other OCR engines.
- Fully extensible file and stream export
- OCR Engine neutral, open API
- Built-in image preprocessing
- Fully overridable image preprocessing
- Easy event model for tracking progress and reporting/modifying document layout
- Fully extensible document and page model
- Font abstraction
- Confidence provided at region, line, word, and glyph levels
- OCR any image that can be read by DotImage
- Easy integration with TWAIN capture
- Images can come from any source, not just files
- Output formats specified by MIME standard
- Built-in text translator for formatted text output
- Searchable PDF module for outputting results as highly compressed JBIG2 Adobe PDF, either as text only or as hidden text underneath the image
- Supports engines that automatically locate regions (or zones) of an image, as well as manual zoning of images
- Support for Tesseract OCR Engine
- Support for GlyphReader OCR Engine
- Support for ABBYY OCR Engine
Object Model Design
Atalasoft DotImage OCR is designed to interface easily with other parts of your application and to be extensible through an event-driven, object-oriented object model. In just a few lines of code, a developer can recognize an image and write the output to a file, or enumerate the lines, words, and characters along with their confidence values.
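To make the idea of a page broken into regions, lines, words, and glyphs with confidence values more concrete, here is a minimal, language-neutral sketch in Python. It is not the DotImage API: every class and function name below (Glyph, Word, Line, Region, Page, walk, make_word) is hypothetical, and averaging glyph confidences to get a word confidence is an assumption made only for illustration. The callback passed to walk stands in for an event handler.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Glyph:
    char: str
    confidence: float  # hypothetical scale: 0.0 (no confidence) to 1.0 (certain)


@dataclass
class Word:
    glyphs: List[Glyph] = field(default_factory=list)

    @property
    def text(self) -> str:
        return "".join(g.char for g in self.glyphs)

    @property
    def confidence(self) -> float:
        # Assumption for this sketch: word confidence is the mean glyph confidence.
        if not self.glyphs:
            return 0.0
        return sum(g.confidence for g in self.glyphs) / len(self.glyphs)


@dataclass
class Line:
    words: List[Word] = field(default_factory=list)

    @property
    def text(self) -> str:
        return " ".join(w.text for w in self.words)


@dataclass
class Region:
    lines: List[Line] = field(default_factory=list)


@dataclass
class Page:
    regions: List[Region] = field(default_factory=list)


def walk(page: Page, on_word: Callable[[Word], None]) -> None:
    """Visit every word in reading order and call the handler, event-style."""
    for region in page.regions:
        for line in region.lines:
            for word in line.words:
                on_word(word)


def make_word(text: str, confidences: List[float]) -> Word:
    """Build a Word from a string and one confidence value per character."""
    return Word([Glyph(c, p) for c, p in zip(text, confidences)])


# A one-region, one-line page with two recognized words.
page = Page([Region([Line([
    make_word("OCR", [0.9, 0.8, 1.0]),
    make_word("ok", [0.5, 0.7]),
])])])

# Collect words whose confidence falls below a review threshold.
low_confidence: List[str] = []
walk(page, lambda w: low_confidence.append(w.text) if w.confidence < 0.75 else None)
print(low_confidence)
```

A handler like the one passed to walk is where an application might flag uncertain words for manual review; the 0.75 threshold is arbitrary.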
This diagram represents the high-level design of the OCR module.


For more information about the design and how to use OCR, please download an evaluation.