Customers often ask whether we have a redaction tool. Redaction means something very specific in a legal sense. We explain this in more depth later in this article, but the very short answer is:

No, we do not offer "redaction tools", but it is possible to use our tools as part of your own redaction solution.

Definitions and Legalese

Redaction has a very specific meaning: when you redact content in a document, you utterly destroy the information you are redacting, in a way that is not recoverable. It is used for "blacking out" text and/or images that need to be hidden from those viewing a document. Specifically, though, for information to be redacted it must be destroyed totally so that it is not recoverable.

Our annotation objects are NOT redactions. Yes, you can draw a black rectangle annotation over an area of a document and set it so that it is not resizable, not movable, and not rotatable (in essence, "lock it down"), but since the annotation "floats" in a layer above the document, it does NOT redact: the original image underneath is still there.

To properly redact an image, you need to take that annotation and BURN it. Burning permanently overwrites the area of the image with the annotation, and if that annotation is, say, a solid black, 100% opaque fill, the image may be considered redacted.
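The difference between a floating annotation and a burned one can be sketched with a toy model. This is only an illustration in plain Python: the Page class, its fields, and the pixel values are hypothetical and are not part of our API. It treats a page as a grid of grayscale pixels with a separate list of rectangle annotations.

```python
from dataclasses import dataclass, field

BLACK, WHITE = 0, 255


def _fill(pixels, x, y, w, h):
    """Overwrite the rectangle (x, y, w, h) in place with black, clipped to the grid."""
    for row in pixels[y:y + h]:
        row[x:x + w] = [BLACK] * len(row[x:x + w])


@dataclass
class Page:
    # Hypothetical model: rows of grayscale values plus a floating annotation layer.
    pixels: list
    annotations: list = field(default_factory=list)  # (x, y, w, h) rectangles

    def render(self):
        """Compose the annotations over a copy of the pixels, as a viewer would."""
        out = [row[:] for row in self.pixels]
        for x, y, w, h in self.annotations:
            _fill(out, x, y, w, h)
        return out

    def burn(self):
        """Overwrite the stored pixels under each rectangle, then drop the layer."""
        for x, y, w, h in self.annotations:
            _fill(self.pixels, x, y, w, h)
        self.annotations = []


page = Page(pixels=[[WHITE, 90, 120, WHITE] for _ in range(3)])
page.annotations.append((1, 0, 2, 3))

looks_redacted = page.render()           # the viewer shows black in columns 1-2
still_recoverable = page.pixels[0][1:3]  # but the stored values are untouched

page.burn()
after_burn = page.pixels[0][1:3]         # now the stored values are gone
```

Before the burn, anyone who removes or ignores the annotation layer sees the original pixels. After it, the stored image itself no longer contains them.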

However, there are some caveats. If your original document was a searchable PDF and you replace the image but the text layer is still present, you MAY NOT HAVE DESTROYED the text. There have been stories in the national news about secret/classified documents that an entity believed it had redacted but had not redacted properly, which resulted in the redacted data being uncovered.

So, this gets to the legalese part: Atalasoft does NOT provide "redaction tools". It is possible to create a redaction tool using our components, but the responsibility to fulfil the criteria of "properly redacted" remains entirely with you, the developer writing the application using our tools.

We will provide some guidance here, but this is key: you are creating a redaction tool using our components. It is entirely your responsibility to ensure you have fully destroyed the data you want redacted.

Technical Gotchas

In the simple case of a non-searchable image such as TIFF, JPEG, BMP, etc., generally all you need to do is draw a black rectangle annotation over the area, then use our burn features (the specifics vary between our different controls) to create a new image in which the pixels under the black rectangle are completely replaced with uniform black. This is good enough to destroy the image part of the data.

Metadata Tags in Image Files (TIFF/JPEG)

However, it is possible to embed metadata tags in TIFF and JPEG files. In TIFF images, these are called TIFF tags (a TIFF image is actually just a collection of TIFF tags; TIFF stands for Tagged Image File Format). Some processes, such as Microsoft's now obsolete MODI (Microsoft Office Document Imaging), may have included an OCRed copy of the document text in a special TIFF tag. If you have a TIFF file whose metadata contains a text copy of the page content, you would also need to clear the text intended for removal. You can use our TiffFile class to get at the individual TIFF tags. Support can assist with how one would go about this, but finding and removing the content is your responsibility.
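To show what "text hiding in a tag" looks like at the file level, here is a minimal sketch in plain Python that walks the tag directories of a classic TIFF and reports every text-typed tag. It does not use our TiffFile class. The function name ascii_tags and the sample bytes are made up for illustration, and the sample is only a bare header plus one tag, not a viewable image. The sketch looks only at tags of the ASCII type, does not follow sub-directories such as the EXIF one, and does not handle BigTIFF, so text stored in other tag types or places would need its own check.

```python
import struct

ASCII = 2  # TIFF field type for NUL-terminated text


def ascii_tags(data):
    """Yield (ifd_index, tag_id, text) for every ASCII-typed entry in a classic TIFF."""
    order = {b"II": "<", b"MM": ">"}[data[:2]]  # little- or big-endian file
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF")
    ifd_index = 0
    while ifd_offset:  # an offset of 0 marks the last directory
        (count,) = struct.unpack_from(order + "H", data, ifd_offset)
        for i in range(count):
            entry = ifd_offset + 2 + 12 * i  # each entry is 12 bytes
            tag, ftype, n = struct.unpack_from(order + "HHI", data, entry)
            if ftype != ASCII:
                continue
            if n <= 4:
                start = entry + 8  # short values are stored inline
            else:
                (start,) = struct.unpack_from(order + "I", data, entry + 8)
            raw = data[start:start + n]
            yield ifd_index, tag, raw.rstrip(b"\0").decode("latin-1")
        (ifd_offset,) = struct.unpack_from(order + "I", data, ifd_offset + 2 + 12 * count)
        ifd_index += 1


# Hand-built sample: one directory holding tag 270 (ImageDescription) with some text.
text = b"Account 12345678\0"
header = struct.pack("<2sHI", b"II", 42, 8)
entry = struct.pack("<HHII", 270, ASCII, len(text), 8 + 2 + 12 + 4)
sample = header + struct.pack("<H", 1) + entry + struct.pack("<I", 0) + text

found = list(ascii_tags(sample))
hits = [item for item in found if "12345678" in item[2]]
```

A check like this only tells you where leftover text is. Clearing or rewriting the tag is a separate step.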

Likewise, there are other kinds of metadata, such as IPTC and EXIF tags, which can be embedded in JPEG and TIFF images. If there is content needing redaction in those tags, you would need to use classes in our Atalasoft.Imaging.Metadata namespace, such as ExifParser and IptcParser.

Searchable PDF

Searchable PDF files may either contain the text as the page data itself, or may have an image representing the page with underlaid text that provides a text representation of the content OCRed from the image. When you decode a PDF as images and apply a burn to redact an area, you are making a new image. If you then swap that image into the original PDF while leaving the text layer in place (not officially supported, but possible using PdfGeneratedDocument), the searchable text in the document may still contain the text you removed from the image. As the developer creating a redaction tool, it is your responsibility to ensure you do not inadvertently leave the redacted content recoverable from the searchable text.
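One way to guard against this is a final check on the output file: extract the text of each page with whatever text extractor you trust, then confirm that none of the redacted terms still appear. The sketch below, in plain Python, covers only the comparison step. The function names and sample strings are hypothetical, the text extraction is assumed to have been done elsewhere, and the normalization keeps only ASCII letters and digits, so it would need adjusting for other scripts.

```python
import re


def normalize(text):
    """Lower-case and keep only ASCII letters and digits, so spacing and punctuation cannot hide a match."""
    return re.sub(r"[^0-9a-z]+", "", text.lower())


def leftover_terms(page_texts, redacted_terms):
    """Return {page_number: [terms still present]} for pages that still contain a redacted term."""
    leaks = {}
    for number, text in enumerate(page_texts, start=1):
        flat = normalize(text)
        found = [t for t in redacted_terms if normalize(t) and normalize(t) in flat]
        if found:
            leaks[number] = found
    return leaks


# Page text as it might come back from a text extractor run on the redacted output.
pages = ["Invoice for Jane Q. Public", "SSN: 123-45-6789\nTotal due: $40"]
leaks = leftover_terms(pages, ["123 45 6789", "John Doe"])
```

An empty result from a check like this is necessary but not sufficient: it says nothing about metadata, attachments, or earlier revisions that may still be stored in the file.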


