How to use a PDF redaction tool with a redacted document policy

It may seem rudimentary, but sensitive data commonly leaks out of corporate networks in plain sight in the form of un-redacted documents. Such files -- those still containing hidden data or Microsoft "Track Changes" data -- can potentially lead to serious breaches of personal or sensitive data. In this tip, Michael Cobb explains how to fully and securely redact electronic documents.

I used to receive lots of questions from readers concerned about security issues with the "Allow Fast Saves" option in various Microsoft Office products. The idea behind the "Fast Save" feature was to speed up the process of saving a document or presentation by saving only the changes that were made and appending them to the original document. This meant that the saved document could contain metadata, such as comments and deleted text. Anyone could use a text editor or Word's "Recover Text" function to view the text that had been "deleted." Also, when these documents were converted to another file format, such as HTML, the "deleted" text would often be included in the new document.
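To see how little effort such a recovery takes, here is a minimal Python sketch of the "open it in a text editor" approach: it pulls readable runs of text out of a file's raw bytes and checks them for terms that were supposed to be gone. It is a rough illustration rather than a forensic tool. It assumes leftover text is stored as plain ASCII or UTF-16LE, and the function names and sample bytes are made up for the example.

```python
import re

# Byte patterns for runs of printable characters; %d is the minimum run length.
ASCII_RUN = rb"[\x20-\x7e]{%d,}"
UTF16_RUN = rb"(?:[\x20-\x7e]\x00){%d,}"


def printable_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return readable runs of text found in raw file bytes."""
    runs = [m.group().decode("ascii")
            for m in re.finditer(ASCII_RUN % min_len, data)]
    runs += [m.group().decode("utf-16-le")
             for m in re.finditer(UTF16_RUN % min_len, data)]
    return runs


def find_leftovers(data: bytes, terms: list[str]) -> list[str]:
    """Return the terms that still appear somewhere in the raw bytes."""
    lowered = [run.lower() for run in printable_runs(data)]
    return [t for t in terms if any(t.lower() in run for run in lowered)]


if __name__ == "__main__":
    # Hypothetical file content: visible text plus a "deleted" fragment.
    sample = b"\x00\x01Visible text\x00\xffsecret merger price\x02"
    print(find_leftovers(sample, ["merger", "salary"]))
```

In practice you would read the bytes from the saved file with `open(path, "rb").read()` and pass in a list of terms you believe were deleted. Any term that comes back is still in the file.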

This feature and Word's "Track Changes" functionality have caused some embarrassing security breaches over the years, in which sensitive and secret information was unwittingly disclosed to people who either were not intended to see it or were not cleared to see it. One document published using Word, for example, revealed UK government doubts over controversial plans to hold terror suspects, while another revealed private annotations on a list of political donors.

Similar problems of inadequate redaction -- the editing and preparation of text for publication -- are now becoming commonplace with PDF documents. Late last year, HSBC Bank USA N.A. exposed details of electronically filed bankruptcy proceedings (.pdf) involving U.S. customers by failing to properly redact the documents it published online. Earlier in the year, the transcript of the closed hearing on the Facebook/ConnectU settlement had all references to the settlement's financial terms redacted. However, the redaction wasn't performed correctly: the sensitive information was simply covered up with white boxes. A simple copy and paste revealed that the founders of ConnectU received a $65 million settlement.

As you can see, releasing electronic documents without properly preparing them for publication can cause serious breaches in data security. It's an obvious yet common way for sensitive data to leak out of an enterprise network. Not only should enterprise information security processes include a redacted document policy, but staff members also need to know how to accomplish redaction correctly. Proper electronic redaction is the complete removal of content from an electronic document, making it irretrievable and unavailable for view, print, search or copy. Redaction requires the right tools and training so the redactions are permanent.
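The copy-and-paste failure described above can be checked for before a PDF is published. The following Python sketch is a crude pre-publication test, not a substitute for a proper redaction tool: it looks inside a PDF's content streams for literal text strings and reports any sensitive terms that are still there, even if a box has been drawn over them. It assumes streams are either uncompressed or Flate-compressed and that text is stored as simple literal strings; it will miss hex strings, custom font encodings and encrypted files. The function names and sample data are hypothetical.

```python
import re
import zlib

# Body of each "stream ... endstream" section in the file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)
# PDF literal strings such as (Some text), allowing backslash escapes.
LITERAL_RE = re.compile(rb"\(((?:\\.|[^\\()])*)\)", re.DOTALL)


def text_layer_strings(pdf_bytes: bytes) -> list[str]:
    """Collect literal strings found in the PDF's streams."""
    found = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            content = zlib.decompressobj().decompress(raw)
        except zlib.error:
            content = raw  # assume the stream was not compressed
        for literal in LITERAL_RE.finditer(content):
            found.append(literal.group(1).decode("latin-1"))
    return found


def leaked_terms(pdf_bytes: bytes, terms: list[str]) -> list[str]:
    """Return the terms that can still be read from the text layer."""
    def squash(s: str) -> str:
        # Ignore case and whitespace so split-up strings still match.
        return "".join(s.lower().split())

    text = squash("".join(text_layer_strings(pdf_bytes)))
    return [t for t in terms if squash(t) in text]


if __name__ == "__main__":
    # Hypothetical page: text drawn first, then a white box painted over it.
    page = (b"stream\nBT /F1 12 Tf (Settlement: $65 million) Tj ET\n"
            b"1 1 1 rg 0 0 200 20 re f\nendstream")
    print(leaked_terms(page, ["$65 million"]))
```

An empty result does not prove a document is safe, given the limits listed above, but a non-empty one is a clear sign that the content was covered rather than removed.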

Ironically, one good step-by-step guide on redaction has been made available by the federal court for the Northern District of California, the very court which issued the Facebook transcript I mentioned earlier.

There is a PDF redaction tool built into Adobe Acrobat Pro 8 and 9, and if your organization regularly publishes documents in PDF format, staff should be instructed on how to use it. Adobe recommends opening the Redaction toolbar by selecting "Redaction" from the "Toolbars" submenu of the "View" menu. To set the appearance of the redaction marks, click "Redaction Properties." Next, select the "Mark For Redaction" tool and mark the items you want to remove, either by double-clicking to select a word or image or by pressing Ctrl as you drag to select a line, a block of text, an object or an area. To redact the marked items, click "Apply Redactions" in the Redaction toolbar. Always use the "Examine Document" function to search for and remove any information that may be hidden. The items aren't permanently removed from the document until the document is saved. When saving the document, it's wise to give it a new filename and add its new classification level to the document's properties.

There are third-party redaction products available as an alternative to upgrading to a Pro version of Adobe Acrobat. Appligent Document Solutions was the first company to provide redaction tools for PDF documents with the release of the Redax plug-in for Adobe Acrobat. Informative Graphics Corp. offers the well-known Redact-It and has a series of excellent white papers on how to redact documents available free on its website. There is also a Redact-It Enterprise edition for Microsoft SharePoint, which automates the removal of sensitive content and privacy information published via SharePoint while leaving the source document untouched.

If you don't have the budget to buy additional redaction software, Adobe has a useful document called Redaction of Confidential Information in Electronic Documents (.pdf), which explains how to redact PDF documents without the tools available in the Pro versions.

If you are still using an earlier version of Microsoft Office that features the "Allow Fast Saves" option, deselect it on the "Save" tab, reached via the "Tools," then "Options" menus. You should also always perform a full save when you finish working on a document, and again before you share it with other people, transfer the document text to another program or convert the document to a different file format.

To avoid sending a Word 2003 or earlier document that still contains tracked changes or comments, select the "Warn before printing, saving or sending a file that contains tracked changes or comments" option on the "Security" tab, reached via the "Tools," then "Options" menus. You may also want to consider Microsoft's Remove Hidden Data tool, which removes hidden and collaboration data, such as change tracking and comments, from Word, Excel and PowerPoint 2003/XP files. However, also read the Knowledge Base article titled "Known issues with the Remove Hidden Data tool" to avoid any costly errors. For those of you using later versions of Office, check out the open source Word 2007 Redaction Tool project on CodePlex.
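For documents saved in the newer .docx format, a simple automated check can back up the warning option described above. The Python sketch below is an illustration under stated assumptions, not part of any Microsoft tool: it assumes the file is a Word 2007 or later .docx (a zip archive of XML parts) rather than the binary .doc format of Word 2003, and it only counts tracked insertions, tracked deletions and comments. The function name is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by .docx document parts.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def review_residue(docx_file) -> dict[str, int]:
    """Count tracked changes and comments left in a .docx file.

    docx_file may be a path or a binary file-like object.
    """
    counts = {"insertions": 0, "deletions": 0, "comments": 0}
    with zipfile.ZipFile(docx_file) as archive:
        body = ET.fromstring(archive.read("word/document.xml"))
        # <w:ins> and <w:del> wrap tracked insertions and deletions.
        counts["insertions"] = len(list(body.iter(f"{{{W}}}ins")))
        counts["deletions"] = len(list(body.iter(f"{{{W}}}del")))
        # Comments live in a separate part that exists only if needed.
        if "word/comments.xml" in archive.namelist():
            notes = ET.fromstring(archive.read("word/comments.xml"))
            counts["comments"] = len(list(notes.iter(f"{{{W}}}comment")))
    return counts
```

A document is ready for release, as far as this check goes, only when all three counts are zero. It does not look at document properties, headers, footers or other hidden data, so it complements rather than replaces a full inspection.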

Whatever tool your organization chooses, you must provide appropriate training for staff. Including examples of some poorly redacted documents in order to highlight the dangers would be a good idea: There's nothing more powerful than real-life examples to make a point memorable. Also, the training should include detailed steps on how to redact different types of documents. These instructions can be taken from the manual accompanying whichever tool you use or from the advice provided by Adobe and Microsoft themselves.

A properly redacted document must clearly show where the original text was located, so as to boldly indicate its removal. The purpose of the blackout is to leave clear evidence of deletion and not give readers the impression that the removed text could have been anywhere and of any length. Obviously, this is not necessary when removing metadata and "Track Changes" information.
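For plain text, the principle of removing the content while leaving a visible mark of the same extent can be sketched in a few lines of Python. This is only an illustration of the idea, assuming the sensitive terms are known in advance and the text is a simple string; the function name and the choice of marker character are mine, and it does nothing for formatted documents such as PDF or Word files.

```python
import re

MARK = "\u2588"  # full block character used as the visible blackout


def redact_text(text: str, terms: list[str]) -> str:
    """Replace each sensitive term with a blackout of the same length."""
    terms = [t for t in terms if t]
    if not terms:
        return text
    # Try longer terms first so overlapping terms are fully covered.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered),
                         re.IGNORECASE)
    return pattern.sub(lambda m: MARK * len(m.group()), text)


if __name__ == "__main__":
    print(redact_text("ConnectU received $65 million.", ["$65 million"]))
```

Because the original characters are replaced rather than covered, there is nothing left underneath the marks to copy and paste.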

Redaction helps to protect intellectual property, as well as information considered to be sensitive or private, and is an important security process for any organisation that handles classified, sensitive or private information. Redacting poorly can be costly.

About the author:
Michael Cobb, CISSP-ISSAP is the founder and managing director of Cobweb Applications Ltd., a consultancy that offers IT training and support in data security and analysis. He co-authored the book IIS Security and has written numerous technical articles for leading IT publications. Mike is the guest instructor for several Security Schools and, as a site expert, answers user questions on application security and platform security.


This was last published in July 2010
