Welcome to the sixth of 31 applications we will post (in addition to the contest participants' submissions). Make sure you add this blog to your RSS / Atom feed and check the gallery for summaries of all the apps as they are released.
This app is a frivolous way to solve the general problem, "How do I detect and act on the content of a scanned document?"
If you've seen the movie Office Space, you're well aware that all TPS reports are required to have a specific cover sheet. I've written an application that does two things. First, given a folder, it finds every TIFF document in that folder and flags any TIFF whose first page is NOT a TPS cover sheet. Second, it inserts a TPS cover sheet into the files that are missing one.
The first task is simple if you have the right pieces at your fingertips. Using the Tesseract OCR engine, I run OCR on the first page of each document and try to match the recognized text against a regular expression. If there is a match, the document has a TPS report cover sheet.
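To make the detection step concrete, here is a minimal sketch in Python (not the language of the app itself, and not its actual source). It assumes the OCR step is supplied as a callable that returns the text of a file's first page, for example a thin wrapper around a Tesseract binding. The regular expression and the names `TPS_PATTERN`, `has_tps_cover_sheet`, `find_tiffs`, and `flag_missing_cover_sheets` are hypothetical; the real cover sheet wording and the pattern used in the app may differ.

```python
import re
from pathlib import Path
from typing import Callable, List

# Assumed pattern: matches "TPS REPORT", "T.P.S. Report", etc.
# The wording on the real cover sheet may call for a different expression.
TPS_PATTERN = re.compile(r"\bT\.?\s*P\.?\s*S\.?\s+REPORT\b", re.IGNORECASE)


def has_tps_cover_sheet(first_page_text: str) -> bool:
    """Return True if the OCR text of the first page looks like a TPS cover sheet."""
    return TPS_PATTERN.search(first_page_text) is not None


def find_tiffs(folder: str) -> List[Path]:
    """List the TIFF files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in {".tif", ".tiff"}
    )


def flag_missing_cover_sheets(
    folder: str,
    ocr_first_page: Callable[[Path], str],
) -> List[Path]:
    """Return the TIFFs whose first page does NOT match the cover sheet pattern.

    `ocr_first_page` is a hypothetical hook: it receives a file path and
    returns the recognized text of that file's first page.
    """
    return [
        path for path in find_tiffs(folder)
        if not has_tps_cover_sheet(ocr_first_page(path))
    ]
```

Passing the OCR step in as a function keeps the matching logic independent of any particular OCR engine, so the regular expression can be tuned against saved text without rescanning documents.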
The second task uses our new TiffDocument class to insert the cover sheet. Thanks to this class, the task is literally a one-liner.
While this particular application is frivolous, you could use the same structure to auto-sort documents that come in from a fax source or to do keyword searching.
You can download an installer for the app here.
You can download the source here.
About Atalasoft's 31 Apps in 31 Days