Data Sanitizing
Redaction problems arise for several reasons:
- Trying to hide confidential information by obscuring or covering it. Editors often attempt to hide the information by drawing a colored rectangle over it or simply by highlighting it in black.
Although these methods work for hard-copy documents, they are not adequate for electronic documents, because there are various ways to recover the information from the resulting PDF document. The information may have been left hidden in the file deliberately or inadvertently.
- Being unaware of document meta-data, or not knowing how to remove it. PDF files as well as Word documents can carry meta-data such as subject, author, title, and keywords. The author might not know about the meta-data these applications create, and it might not be apparent unless the user knows where to look for it.
- Redacting images by hiding parts of them with graphics such as black rectangles, or by reducing the size of the images. This procedure is also used to redact hard-copy printed material. However, it is not very effective for documents distributed in soft-copy format.
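The meta-data point above can be illustrated with a short sketch. It assumes the newer .docx format, which is a ZIP package that stores core properties such as title, author, and keywords in a part named docProps/core.xml; older binary .doc files and PDF files store meta-data differently and are not covered here. The function names and the sample values are hypothetical and exist only for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used inside docProps/core.xml of a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Human-readable label -> element that holds the value.
FIELDS = {
    "title": "dc:title",
    "subject": "dc:subject",
    "author": "dc:creator",
    "keywords": "cp:keywords",
}


def read_core_properties(docx_file):
    """Return the non-empty core meta-data fields stored in a .docx package."""
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[label] = element.text
    return found


def build_sample_docx():
    """Build a tiny in-memory stand-in for a .docx file (made-up values)."""
    core_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<cp:coreProperties "
        'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/">'
        "<dc:title>Quarterly report</dc:title>"
        "<dc:creator>J. Doe</dc:creator>"
        "<cp:keywords>internal, draft</cp:keywords>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core_xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Prints whatever meta-data the sample package carries.
    print(read_core_properties(build_sample_docx()))
```

Running the same function on a real .docx path before publication shows which of these fields would travel with the file, even though none of them appear on the printed page.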
PDF files and Microsoft Word documents are sophisticated, complex data formats. Both can contain graphics, text, images, tables, meta-data, and more, all combined together.
Their complexity makes them likely carriers of unintentionally disclosed information, especially when confidential material is being redacted.
Microsoft Office Word is used for preparing notes, reports and other official and unofficial material.
Adobe Reader is used extensively by many government agencies and military services for distributing critical information. PDF offers excellent reliability and portability and allows easy distribution of information through computer networks and the internet. PDF is generally the format in which redacted documents are published and distributed.
Before you finalize a redacted document, check that every redacted item is fully removed from the document's text, not merely covered up.
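One way to carry out this check is to extract the text from the finished file with whatever tool you normally use and then search it for the items that were supposed to be removed. The sketch below covers only the search step and assumes the text has already been extracted into a string; the function name and the sample values are hypothetical.

```python
import re


def find_leftover_terms(extracted_text, redacted_terms):
    """Return the redacted terms that still occur in the extracted text."""
    # Collapse whitespace so a term split across a line break is still found,
    # and ignore case differences.
    normalized = re.sub(r"\s+", " ", extracted_text).casefold()
    leftovers = []
    for term in redacted_terms:
        needle = re.sub(r"\s+", " ", term).casefold()
        if needle in normalized:
            leftovers.append(term)
    return leftovers


if __name__ == "__main__":
    # Made-up text standing in for the output of a text-extraction tool.
    text = "Contact: [REDACTED]\nProject lead: Jane\nExample, budget approved."
    print(find_leftover_terms(text, ["Jane Example", "555-0100"]))
```

An empty result does not prove the redaction is complete, since information can also survive in images or meta-data, but a non-empty result is a clear sign the document is not ready for release.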
Finally, bear in mind that what you see on the screen does not necessarily match the content of the document as software reads it; the two can differ.
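The gap between the screen and the file can be seen in a simplified example. A PDF page is described by a stream of drawing operators; the hypothetical, uncompressed stream below shows a line of text (the Tj operator) and then fills a black rectangle (the re and f operators) over the same area. A viewer displays only the black box, yet the text is still present in the data. Real PDF files usually compress their streams and encode text in more varied ways, so this is a sketch of the idea rather than a working PDF parser.

```python
import re

# A simplified, uncompressed page content stream (hypothetical sample).
# BT/ET delimit a text object, Tj shows a string, rg sets the fill color,
# re defines a rectangle, and f fills it.
CONTENT_STREAM = """\
BT
/F1 12 Tf
72 720 Td
(Account number: 12345678) Tj
ET
0 0 0 rg
70 716 180 14 re
f
"""


def text_shown(content_stream):
    """Return the literal strings passed to the Tj (show text) operator."""
    return re.findall(r"\(([^()]*)\)\s*Tj", content_stream)


def filled_rectangles(content_stream):
    """Count rectangles (re) that are immediately filled (f)."""
    return len(re.findall(r"\bre\s+f\b", content_stream))


if __name__ == "__main__":
    print("Filled rectangles drawn:", filled_rectangles(CONTENT_STREAM))
    print("Text still in the stream:", text_shown(CONTENT_STREAM))
```

The rectangle changes what is painted on the screen but not what is stored, which is why covering content is not the same as removing it.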