Atalasoft provides Text Recognition SDKs for developers that can be integrated into your desktop or web applications for both manual and automated batch processing of images. Our industry-proven document transformation engines can save countless hours and provide a familiar, customizable experience for your development team. If you have any questions, please let us know at sales@atalasoft.com.


DotImage® OCR Engines and Features

OCR By Atalasoft

Atalasoft's Document Transformation Engines

Today's digital document libraries need to be searchable, and workers need to be able to index and pull data from within these documents. Done by hand, this is a very slow and expensive process compared to an automated application. Our toolkit allows OCR engines to be implemented by extending our base OcrEngine class; the Recognize() method starts the recognition process. Additionally, we have partnerships with the following OCR engines:

  • GlyphReader OCR Engine
  • ABBYY OCR/ICR Engine
  • Tesseract OCR Engine
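
The extension pattern described above (a base engine class whose recognition entry point is shared, with each engine supplying its own per-page logic) can be sketched roughly as follows. This is a minimal illustration in Python, not the DotImage API: the `Page`, `OcrEngine`, `recognize`, `recognize_page`, and `EchoEngine` names here are all hypothetical stand-ins chosen for the example, and the toy engine simply decodes bytes instead of reading an image.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a scanned page image."""
    name: str
    pixels: bytes


class OcrEngine(ABC):
    """Hypothetical base class for illustration; NOT the DotImage OcrEngine API."""

    def recognize(self, pages):
        """Shared entry point: run recognition on every page, return text per page."""
        return [self.recognize_page(page) for page in pages]

    @abstractmethod
    def recognize_page(self, page):
        """Engine-specific recognition for a single page."""


class EchoEngine(OcrEngine):
    """Toy engine that 'recognizes' text by decoding the page bytes as ASCII."""

    def recognize_page(self, page):
        return page.pixels.decode("ascii", errors="replace")


pages = [Page("p1", b"INVOICE 42"), Page("p2", b"TOTAL 9.99")]
results = EchoEngine().recognize(pages)
print(results)
```

Under this design, swapping one engine for another only changes which subclass is instantiated; the calling code keeps using the same entry point.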

Optical Character Recognition (OCR) is a method by which software "reads" the text characters in an otherwise flat, scanned image to perform text recognition. The resulting text can be used anywhere programmatically, which is essential in larger document workflows and for discoverability.

Intelligent Character Recognition (ICR) follows the same software concept but is tuned to recognize hand-printed rather than computer-printed text. To perform ICR, you need to clearly define the areas to be recognized, and the text should be in block capitals only, with framing.

OCR is an add-on to our DotImage SDK

Searchable PDF Creation

Creating Searchable PDFs

Atalasoft offers several OCR engines that can be used to OCR documents or as part of a process to create searchable PDFs.

If you would like to create searchable PDFs using Atalasoft SDKs, you need our DotImage SDK, an OCR SDK, and our Searchable PDF SDK (PDF Translator) add-on to OCR. If you also need to view, search, and highlight the PDFs after they are created (or if you already have existing searchable PDFs), you need our PDF Reader with Text Extraction SDK.

PDFTranslator (Searchable PDF SDK): this module automatically translates an image into a searchable PDF file. Simply call Translate() on any of our OcrEngines.

GlyphReader OCR Engine

GlyphReader has two main strengths: it is highly accurate and very cost effective. Because you can rely on the quality of the output, you can process more jobs with less time spent correcting mistakes. This accuracy has been developed through years of comprehensive testing, analysis and improvement.

  • Closed-source OCR engine that vectorizes glyphs, then determines all the possible letters each glyph could be
  • Supports the European Character Set
  • Reports individual character position and size
  • Reports character confidence
  • Properly OCRs rotated pages, reporting the rotation angle
  • Has Auto-Rotate functionality, rotating documents to the correct orientation
  • Can automatically break merged characters, or merge broken characters
  • Can disable recognition of specific characters
  • Can optionally reject low confidence characters
  • Can optionally reject low confidence lines
  • Full Page color OCR can be generated when combined with the Searchable PDF Module
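
To make the confidence-related features in the list above more concrete, here is a rough sketch of what rejecting low-confidence characters and lines can look like once an engine has reported a confidence per character. It is written in Python purely for illustration and does not use the GlyphReader or DotImage API: the `RecognizedChar` type, both function names, the 0.0 to 1.0 confidence scale, the `?` placeholder, and the use of a mean per line are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class RecognizedChar:
    """Hypothetical result record: one character plus its reported confidence."""
    text: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


def reject_low_confidence_chars(chars, threshold, placeholder="?"):
    """Replace every character whose confidence is below the threshold."""
    return "".join(
        c.text if c.confidence >= threshold else placeholder for c in chars
    )


def keep_confident_lines(lines, threshold):
    """Drop whole lines whose mean character confidence is below the threshold."""
    kept = []
    for line in lines:
        if not line:
            continue  # nothing was recognized on this line
        mean = sum(c.confidence for c in line) / len(line)
        if mean >= threshold:
            kept.append("".join(c.text for c in line))
    return kept


line_a = [RecognizedChar("C", 0.98), RecognizedChar("a", 0.95), RecognizedChar("t", 0.40)]
line_b = [RecognizedChar("x", 0.20), RecognizedChar("y", 0.30)]

filtered_text = reject_low_confidence_chars(line_a, threshold=0.5)
confident_lines = keep_confident_lines([line_a, line_b], threshold=0.6)
print(filtered_text, confident_lines)
```

In a workflow like this, rejected characters or lines would typically be routed to a person for review rather than silently accepted, which is where the time savings from a high-accuracy engine come from.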

To develop and deploy using GlyphReader OCR:

GlyphReader can be deployed to the desktop or as a server-based application.

  • Each developer needs a copy of our DotImage SDK and GlyphReader SDK
  • For desktop deployment: each desktop requires a 1-thread runtime license for GlyphReader
  • For server deployment: each server requires a DotImage server license as well as a GlyphReader OCR server license, which can be purchased in groups of 5, 10, 15 or 20 threads

ABBYY OCR/ICR Engine

ABBYY OCR Engine

ABBYY is a global leader in the development of document recognition, content capture and language-based technologies and solutions that integrate across the entire information lifecycle.

  • A fast, closed-source engine for OCR and ICR
  • Supports 201 languages with a high accuracy rate
  • Reports individual character position and size
  • Reports character confidence
  • Can optionally reject low confidence characters
  • Can optionally reject low confidence lines
  • Supports E13B and CMC-7 MICR fonts
  • Multiple recognition culture support
  • Parallel processing to improve performance (single document)
  • Auto-rotate

Interested in downloading ABBYY OCR? See this Knowledgebase article for ABBYY instructions and the full list of supported OCR/ICR languages.

To develop and deploy using ABBYY OCR/ICR:

ABBYY OCR can be deployed to the desktop or as a server-based application.

  • Each developer needs a copy of our DotImage SDK and ABBYY SDK
  • For desktop deployment: each desktop requires a runtime license for ABBYY OCR
  • For server deployment: each server requires a DotImage server license as well as an ABBYY OCR server license

Tesseract OCR Engine

Tesseract OCR is an intelligent, learning, open-source OCR engine with many extended language options, including integrated support for Dutch, English, French, German, Italian, Portuguese, and Spanish.

  • Atalasoft tests additional language add-on packs for: Chinese (Simplified), Chinese (Traditional), Danish, Finnish, Greek, Hebrew, Japanese, Korean, Norwegian, Russian, Swedish, Turkish
  • Tesseract provides additional language add-on packs here: http://code.google.com/p/tesseract-ocr/downloads/list
  • Ability to determine character, word, and line size and location
  • Reports confidence of each recognized character
  • Output to Text or Searchable PDF
  • Tesseract3 Engine

To develop and deploy using Tesseract OCR:

Tesseract OCR can be deployed to the desktop or as a server-based application.

  • Each developer needs a copy of our DotImage SDK and Tesseract SDK
  • For desktop deployment: royalty free
  • For server deployment: each server requires a DotImage server license as well as a Tesseract OCR server license

Atalasoft’s DotImage has enabled us to offer the browser-based scanning solution that our customers were looking for

Larry Oliver
President - FileHold

Try DotImage free for 30 days with full support

Download Now