Note: This information is specific to DotImage at the time it was written and may change slightly in future versions.
Every OcrEngine has different deployment requirements. Atalasoft has tried to formalize this process as much as possible and to provide guidelines on the mechanism for deployment. Licensing is covered in another topic. This topic covers how to ensure that an OcrEngine can start and can find its own resources.
In your SDK installation, you will find a folder named "OcrResources". This folder is the general folder for all supported OCR engines. Within it you will see a structure like this:
In general, most of the loading and locating of resources is handled by Atalasoft or by the engine itself and requires no work by the client. In custom situations, however, the client may need to handle some of it.
To sort this out, let's start with a few definitions:
engine resources folder - this is the folder which contains the OCR Engine's resource files
OCR resources folder - the top level folder of all OCR Engine resources, called "OcrResources"
application folder - the folder where your application is installed
assembly folder - the folder which contains the DotImage assembly files (for example, Atalasoft.dotImage.Ocr.dll); this may be the same as the application folder
sdk folder - the folder which contains all the DotImage assembly files installed as part of the DotImage SDK. This folder is typically C:\Program Files (x86)\Atalasoft\DotImage <version>\bin
engine module - an engine-supplied DLL that provides engine functionality
In order to run, some engines have two requirements: an engine module for some portion of OCR functionality, and resource files that configure the engine or otherwise provide necessary data or services. These may include such things as dictionaries, grammar rules, glyph shapes, neural networks and so on.
Engines that require engine modules typically need those modules loaded before any class that depends on them is constructed. This creates an ordering problem: the assembly that uses the engine module is the natural place to hold the knowledge of how to find it, yet the engine module needs to be loaded before that assembly is loaded.
Atalasoft tries to handle this for you when possible so you don't have to worry about it, but there are some cases where this simply isn't possible.
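For the cases where the module has to be loaded by hand, the following is a minimal sketch of one way to do it on Windows, using the Win32 LoadLibrary function through P/Invoke. The class name EngineModulePreloader and the method name Preload are hypothetical and are not part of DotImage; the actual file name of the engine module depends on the engine and is not given here.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of DotImage.
public static class EngineModulePreloader
{
    // Win32 LoadLibrary: maps a native DLL into the current process.
    [DllImport("kernel32", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr LoadLibrary(string lpFileName);

    // Loads the native module at modulePath and returns its handle.
    public static IntPtr Preload(string modulePath)
    {
        // Fail early with a clear message if the file is missing.
        if (!File.Exists(modulePath))
            throw new FileNotFoundException("Engine module not found.", modulePath);

        IntPtr handle = LoadLibrary(modulePath);
        if (handle == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());
        return handle;
    }
}
```

To respect the ordering constraint described above, a call such as EngineModulePreloader.Preload(pathToEngineModule) would go in the static constructor of a class that runs before any type depending on the engine is touched. Keeping that class free of any reference to engine types is what lets it run first.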
Options for the Developer
The developer can choose to leave the engine module in the OCR resources folder as shipped. In this case, the developer must put the OCR resources folder within the assembly folder. Alternatively, the developer can put the OcrResources folder in any location, but it is then the developer's responsibility to load the DLL. If the OCR resources folder is not in the assembly folder, the developer is required to pass its location to the ExperVisionEngine constructor.
The developer can choose to move the engine module out of the OCR resources folder. In this case, if the engine module is put into the application folder or the assembly folder, it should be located automatically. If the engine module is located somewhere else, it is the developer's responsibility to locate it and load it. If the OCR resources folder is within the assembly folder, the developer can pass null to the engine constructor for the path; otherwise the developer must pass the location in.
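The path rule shared by both options can be sketched as a small helper that decides what to hand to the engine constructor. This is only an illustration: OcrResourceLocator and ResolveConstructorPath are hypothetical names, and the sketch assumes that the location the constructor expects is the OcrResources folder itself.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of DotImage.
public static class OcrResourceLocator
{
    // Returns null when "OcrResources" sits inside assemblyFolder, since the
    // text above says null may be passed in that case. Otherwise returns the
    // full path of customFolder, which must be passed to the constructor.
    public static string ResolveConstructorPath(string assemblyFolder, string customFolder)
    {
        string defaultFolder = Path.Combine(assemblyFolder, "OcrResources");
        if (Directory.Exists(defaultFolder))
            return null;

        if (customFolder == null || !Directory.Exists(customFolder))
            throw new DirectoryNotFoundException(
                "OcrResources was not found in the assembly folder and no valid custom location was supplied.");

        return Path.GetFullPath(customFolder);
    }
}
```

The assembly folder argument could be obtained from the Location property of the loaded DotImage assembly, or from AppDomain.CurrentDomain.BaseDirectory when the assembly folder and the application folder are the same.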
The GlyphReaderEngine requires the following resource files:
They are located by default in SDK folder\OcrResources\GlyphReader\v3.0\
Due to the architecture of the GlyphReader engine, to specify a location other than a default search path such as System32, you'll need to create an instance of the OcrResourceLoader or GlyphReaderLoader in a static constructor before any OCR code is loaded. This is the case even if the resources are in the assembly folder. There you can specify an alternate location of the resources if desired.
GlyphReaderLoader loader = new GlyphReaderLoader( "PathToFolderContainingGlyphReaderResources" );
The TesseractEngine locates its resource files through the environment variable "TESSDATA_PREFIX". Install the "tessdata" folder to a location on the deployment machine, and then in your project set the environment variable to the absolute path of the folder that contains "tessdata" (its parent folder, not "tessdata" itself). You can use this call to accomplish this:
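One way to set the variable from code is the standard .NET method Environment.SetEnvironmentVariable. The sketch below is an assumption about how this might be wrapped, not the call from the original article; TesseractEnvironment and SetTessdataPrefix are hypothetical names.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of DotImage.
public static class TesseractEnvironment
{
    // tessdataParent is the folder that CONTAINS "tessdata", not tessdata itself.
    public static void SetTessdataPrefix(string tessdataParent)
    {
        string fullPath = Path.GetFullPath(tessdataParent);

        // Catch the common mistake of pointing at the wrong folder.
        if (!Directory.Exists(Path.Combine(fullPath, "tessdata")))
            throw new DirectoryNotFoundException("No tessdata folder under " + fullPath);

        // Process-level variable: affects only the current process.
        Environment.SetEnvironmentVariable("TESSDATA_PREFIX", fullPath);
    }
}
```

The variable presumably has to be set before the TesseractEngine is constructed. Whether the engine expects a trailing directory separator on the value is not stated here, so it is worth checking against your deployment.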
The RecoStar engine requires RecoStar Resources which are not distributed with DotImage by default, but which may be downloaded from here: http://www.atalasoft.com/download/atalasoft.RecoStarResources.zip
Once downloaded, unzip the archive (it will create a folder called RecoStar with a 5.0 subdirectory and place several files and folders under it) and place the entire RecoStar folder into the default location ( C:\Program Files (x86)\Atalasoft\DotImage 10.0\bin\OCRResources\ ).
You'll need a RecoStar Loader as well:
RecoStarLoader loader = new RecoStarLoader( "PathToFolderContainingRecoStar" );