Watermarking – invisible watermarks (secret) or visible (public)?
Invisible watermarks are great for protecting images from copyright infringement but don’t work so well with documents. This blog explains why visible watermarks are more effective as a document copy deterrent.
What are we trying to achieve with watermarking?
Watermarking is a complex subject that requires some research in order to understand what problems you are trying to solve and how effective different techniques prove to be when considering digital information, particularly PDF documents.
The things we might want to do include:
- Hiding information from the public view
- Proving that the information is authentic
- Proving the owner of the information
- Identifying the authorized user of the information
- Preventing undetected editing, photographic copying or transformation of the original
As far as the human eye or a camera is concerned, there are two basic types of watermark: invisible and visible. We shall look at each in turn.
Invisible watermarks are a form of steganography, the science of hiding or covering information. They are so called because their presence is not obvious: anyone who does not know that steganography has been used cannot detect it.
A secret meaning may be ‘hidden’ in a picture which, with one apple on a table has one meaning, and with two apples means something entirely different – if you know what you are looking at. In the same way, in WWII, messages were ‘hidden’ in radio broadcasts, passing instructions or information to resistance fighters but being meaningless to anyone who did not know what messages there were and when to listen. There is no direct attack against these forms of steganography – short of finding the code book(s) for interpreting the meanings.
An invisible watermark has to contain enough information to identify the Copyright Owner, the licensed user, or both. That means that it needs a background in which to hide. Commonly this means having a picture (for example a video still) into which, according to a set of rules (they vary between the schemes in existence), pixels with a particular coding or definition replace existing pixels. These form one or more patterns, usually encoded, which can be extracted by re-processing the picture following the rules for the scheme.
As you can imagine, with video or sound there is a huge amount of information flowing, and so plenty of ‘room’ to hide watermark information so that it is, to all intents and purposes, invisible because the human eye cannot detect it.
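As a rough illustration of 'pixels replacing pixels according to a set of rules', here is a minimal sketch of one of the simplest possible schemes: hiding each bit of a short message in the lowest bit of successive pixel values. The function names (`embed_lsb`, `extract_lsb`) and the sample message are our own assumptions for illustration, and real schemes are considerably more elaborate.

```python
def embed_lsb(pixels, message):
    """Hide the bytes of `message` in the lowest bit of successive pixel values."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for this message")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit  # overwrite only the lowest bit
    return marked


def extract_lsb(pixels, length):
    """Recover `length` bytes by re-reading the lowest bit of each pixel value."""
    out = bytearray()
    for i in range(length):
        byte = 0
        for value in pixels[i * 8:(i + 1) * 8]:
            byte = (byte << 1) | (value & 1)
        out.append(byte)
    return bytes(out)


# Stand-in for 64 grayscale pixel values (0-255); a real image would be far larger.
cover = [(i * 37) % 256 for i in range(64)]
marked = embed_lsb(cover, b"(c) ACME")   # hypothetical 8-byte ownership mark
recovered = extract_lsb(marked, 8)
max_change = max(abs(a - b) for a, b in zip(cover, marked))
```

No pixel value moves by more than 1 out of 255, which is why the eye does not notice, yet anyone who knows the rule can read the mark back out.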
All well and good.
As usual, the devil is in the detail. There has to be enough picture information to ‘hide’ the watermark information so that it is invisible and cannot be removed.
Invisible watermarks and text files
A file of ASCII text has no spare space in it to hide anything, so the technique is not going to work with txt files.
It might work if you considered the page of text to be a graphic (as PDF can), but there is still the problem of where to hide the information so that you can find it later and prove it is reliable and not some mere accident.
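To see why plain text has no spare room, consider what happens if the lowest bit of each character is altered, as a naive pixel-style scheme would do. This is a small sketch with a made-up sample sentence; the helper name `flip_low_bits` is ours, and it shows the worst case where every bit has to change.

```python
def flip_low_bits(text):
    """Flip the lowest bit of every byte, as a naive LSB scheme might."""
    return bytes(b ^ 1 for b in text.encode("ascii")).decode("ascii")


original = "Pay 100 to Bob"
altered = flip_low_bits(original)
changed = sum(1 for a, b in zip(original, altered) if a != b)
```

Every character comes out different, so the 'hidden' change is immediately visible. In an image the same one-bit change shifts a colour value by an amount nobody can see.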
Invisible watermarks and images
Invisible watermarks have been commonly used in images to prove the owner of the information and establish copyright.
If you are going to add an invisible watermark to an image, then the image file must have plenty of space in which to hide it, or the watermark will become visible by degrading the picture. Invisible watermarks generally use image files because these contain a lot of pixel data, so changing a pixel here and there does not noticeably degrade the picture. The catch is that the modified pixels have to be reproduced exactly, not a bit more or a bit less, or recovery doesn’t work. So to make it work you need one or more large pictures in which to hide the invisible watermarks you want to send, and you have to do quite a lot of processing when you receive them.
There are also problems with this approach: if the form or format of the picture is changed it may not be possible to maintain the state of your modified pixels. A change from png to jpg, for instance, or from colour to grayscale, may be enough to prevent recovery of the hidden information. This problem was well researched in the 1990s and there are lots of patents for methods of preserving the hidden information, and of recovering it if some gets lost.
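To put a number on 'plenty of space', here is a back-of-envelope calculation. It assumes a very simple scheme that hides one bit in each colour channel of each pixel; the function name and the default parameters are our own choices for illustration, and real schemes usually carry far less because they spend capacity on redundancy.

```python
def lsb_capacity_bytes(width, height, channels=3, bits_per_channel=1):
    """Upper bound on hidden payload, in bytes, for a simple one-bit-per-channel scheme."""
    return (width * height * channels * bits_per_channel) // 8


small_icon = lsb_capacity_bytes(32, 32)      # 32 * 32 * 3 / 8 = 384 bytes
photo = lsb_capacity_bytes(1920, 1080)       # 1920 * 1080 * 3 / 8 = 777600 bytes
```

A small icon has room for a few hundred bytes at most, while a full-screen photograph has room for hundreds of kilobytes, which is why large pictures are the natural carrier.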
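The fragility can be sketched in a few lines. Here the 'format change' is simulated by rounding each pixel value down to a multiple of 4, which is only a crude stand-in for lossy re-encoding and not how jpg actually works; all names are our own and the hidden bits are arbitrary.

```python
def embed_bits(pixels, bits):
    """Write each hidden bit into the lowest bit of the matching pixel value."""
    marked = [(p & ~1) | b for p, b in zip(pixels, bits)]
    return marked + list(pixels[len(bits):])


def read_bits(pixels, count):
    """Read back the lowest bit of the first `count` pixel values."""
    return [p & 1 for p in pixels[:count]]


def lossy_resave(pixels, step=4):
    """Crude stand-in for a lossy format change: round down to a multiple of `step`."""
    return [(p // step) * step for p in pixels]


cover = [(i * 53) % 256 for i in range(16)]
secret = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_bits(cover, secret)

survives_exact_copy = read_bits(list(marked), 8) == secret
survives_resave = read_bits(lossy_resave(marked), 8) == secret
```

An exact copy preserves the mark, but after the simulated re-save every low bit reads as zero and the hidden information is gone. The schemes researched in the 1990s exist precisely to survive this kind of treatment.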
Invisible watermarks and PDF files
To create a PDF file with an invisible watermark, either an application would have to add the watermark to the original PDF document before it is distributed, or the viewing software would have to work out where on each page an invisible watermark could be placed and add it dynamically when the page is viewed and/or printed. For the second option, the PDF file format would have to support dynamic watermarks and the PDF Reader would have to be able to add them dynamically. At the moment this is not possible.
So if you add an invisible watermark to a PDF document, it can only contain static rather than dynamic information, as the facility to dynamically add invisible watermarks is not available. If you wanted to identify the user of the PDF (for example, to see who uploaded a PDF file to a file sharing site) you would have to process the same PDF file separately for each user in order to apply unique identifying information. This clearly involves a lot of work and document management, all of which could be made futile if the invisible watermarks can be easily removed.
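The bookkeeping this implies can be sketched as follows. We assume, purely for illustration, that the per-user mark is a short keyed tag derived with HMAC from a document ID and a user ID; the key, IDs and function names are hypothetical, and the step that would actually embed the tag in each copy is left out.

```python
import hashlib
import hmac


def user_mark(secret_key, document_id, user_id, length=8):
    """Derive a short per-user tag that could be embedded as a static mark."""
    message = f"{document_id}:{user_id}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()[:length]


def copies_to_prepare(secret_key, document_id, user_ids):
    """One separately processed copy per recipient: map each user to their tag."""
    return {user: user_mark(secret_key, document_id, user) for user in user_ids}


key = b"publisher-secret"              # hypothetical publisher key
users = ["alice", "bob", "carol"]      # hypothetical recipients
copies = copies_to_prepare(key, "report-2024", users)
```

Three recipients mean three differently marked files to generate, store and track, and the count grows with every new user and every new document.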
Removing invisible watermarks
When using invisible watermarks you need to consider how to make sure that the watermark survives screen grabbing, printing, or a change of format. After all, PDF documents that are not protected in any other way can be saved into a different format, or printed as physical documents and then run through a scanner. The invisible watermark is then lost.
There have been proposals to create invisible watermarks where pictures can survive transformation from (say) png to jpg, but they are less clear about what happens if pictures have their size and shape changed, or if they are printed and then scanned. Also, images that are screen grabbed are in a different format from the original and may no longer carry the invisible watermark. These approaches to defeating invisible watermarking mean that a recipient who can obtain a document that is not fully controlled can effectively bypass the controls and continue to do what they like, although they have to invest more effort than they might like.
So invisible watermarks for documents can only be made effective if combined with a document control system such as DRM that prevents conversion to other file formats; otherwise they could be easily removed.
Visible watermarks
Visible watermarks turn out to be very useful indeed when used with secured documents. The fact that these watermarks are visible makes them much more secure to implement than invisible ones because everyone can ‘see’ the watermarks and so can test if they look right. No amount of changing the view on the screen or the printed page can completely lose these watermarks. See PDF watermarking.
Visible watermarks also have history. They are already well established for protecting the authenticity of bank notes, cheques, International Agreements and so on. They provide a way of preventing forgery, although trying to forge encrypted documents is probably more difficult unless you can completely break the encryption/decryption scheme in use.
Watermarks on cheques (checks if you prefer) are passive. Once set up when printed they can tell you about the organization that created them, but nothing about the authorized user of the document. It is true to say that cash has no owner but a credit card transaction does. And so does the user of a protected PDF document. And that is because a secured PDF document is licensed for use while a banknote is not – it is authorized for whoever is holding it. So secured PDF documents allow you to identify the recipient as well as the owner and/or supplier, which is a great deal more powerful than previous watermarking methods.
Protected PDF watermarks can be powerful in other ways. Most importantly, you can use dynamic watermarks (user info automatically inserted at view/print time), so you only have to protect a document once for all users rather than individually for each user in order to identify a user. Watermarks can be placed on top of or underneath the content to be watermarked, but work independently from it. You can consider the watermarks as separate layers of content and have them placed to minimize harm to the screen view of the document while being much more ‘in your face’ if a printed copy is made.
Like any security problem, if there is enough money, expertise and time available, high resolution cameras can photograph every screen and watermarks can be removed pixel by pixel. But very few indeed would find it worth the tremendous effort when it would be cheaper to just buy a copy. And PDF secure watermarks have the power to reveal who the licensed user of the document is. So if they allow their access and authority to be misused they may well be held to account. And that is using a watermark as a deterrent, something the printed document world had not implemented.
And secure PDF watermarks continue to provide the traditional capabilities of identifying the Copyright owner and making it difficult to make a false document appear to be genuine (admittedly, because PDF documents are encrypted, it is difficult to obtain a copy that can be fed into an application and re-processed, even without the watermarks).
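The idea of protecting once and filling in user details at view time can be sketched like this. The placeholder names, the template wording and the `render_watermark` function are hypothetical and chosen only for illustration; they are not the placeholders used by any particular product.

```python
from datetime import datetime
from string import Template

# Stored once with the protected document; the same template serves every user.
WATERMARK_TEMPLATE = Template("Licensed to $name <$email> - viewed $when")


def render_watermark(template, user, when):
    """Fill in the viewer's details at the moment the page is shown or printed."""
    return template.substitute(
        name=user["name"],
        email=user["email"],
        when=when.strftime("%Y-%m-%d %H:%M"),
    )


viewer = {"name": "Alice Example", "email": "alice@example.com"}
text = render_watermark(WATERMARK_TEMPLATE, viewer, datetime(2024, 1, 15, 9, 30))
```

Contrast this with a static mark: here there is a single protected file, and the identifying text differs only because the viewer differs.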
Conclusions on invisible vs visible watermarks as a copyright deterrent
Locklizard have come to the conclusion that the point of making a secure document is to prevent unauthorized use of a ‘secret’, so what point would be served by hiding a secret inside a secret? The point of steganography is to hide secrets in public information, but securing documents is about keeping the information from being available to the public.
So although we have identified some interesting security capabilities that can be achieved by using steganography, they are not as useful when the DRM infrastructure is being used to prevent the information from being made public. Invisible watermarks may be useful for establishing copyright for images that are made publicly available, but they are not much use for identifying users.
Locklizard take the approach that document watermarks that are clearly visible are more effective. We do not allow recipients to get control of protected documents, so they cannot process the content in order to remove our watermarks, and screen grabbing captures the watermark just the same as the content, leaving the user with the problem of how to remove a watermark that is obvious. We allow publishers to forbid printing, avoiding the problems of scanning printouts, with a back-stop that the printout can have obvious watermarks that have to be removed if the document is going to be re-distributed.
The range of visible watermarking options supported by Locklizard is significant. You can have both text and graphics watermarks running both on top of and underneath the content layer, and these may be as subtle as the publisher wishes to go since there are no limitations on the selection of graphics or fonts. All of the watermarking information is itself protected to prevent it from being tampered with or overridden. Using visible watermarks avoids the problems inherent in the use of steganography, is more effective, simpler to implement and more reliable.
We understand the attractions of the invisible watermark concept: the idea that a watermark can be applied that does not detract from the quality of the image shown on screen. However, we have not been convinced of the effectiveness of the invisible watermark when compared with the use of a fully controlled secure PDF DRM solution using visible watermarks that can even resist someone using a cell phone to make a copy of a document.