Steganography Concepts By David Burris, Ph.D., CCP, CSP Center for Digital Forensics © Copyright 2005, 2006
Steganography is the practice of disguising a message (text, picture, video, auditory, olfactory, etc.) within the context of another message, picture, or communication so that it will not be noticed by those for whom it is not intended. Cryptography assumes a message will probably be intercepted and prevents unauthorized access by jumbling the message so that, ideally, only the intended recipient can extract its information content. Steganography assumes the message can or will be intercepted but that the hidden message will be overlooked. For additional security, a message can be encrypted before being hidden.
Steganography is a "camouflage" technique that makes the existence of the actual message "invisible" to all but the intended recipient. Encrypted messages tend to draw notice: their very appearance encourages attempts to crack the code, while an invisible message passes without detection.
Steganography was widely used in World War II. Consider the following example of a null cipher (an unencrypted hidden message) used by a German spy in World War II [David Kahn, The Codebreakers, The Macmillan Company]:
Apparently neutral's protest is thoroughly
discounted and ignored. Isman hard
hit. Blockade issue affects pretext for
embargo on by products, ejecting suets and vegetable oils.
The following message may be obtained by taking the second letter from each word, with a little manipulation:
Apparently neutral's protest is thoroughly discounted and ignored. Isman hard hit. Blockade issue affects pretext for embargo on by products, ejecting suets and vegetable oils.
Pershing
sails from NY June 1.
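The second-letter rule can be checked mechanically. The following is a minimal sketch, not a general null cipher decoder. It assumes "by-products" is hyphenated so that it counts as a single word (part of the "little manipulation" mentioned above), and the function name second_letters is purely illustrative.

```python
import string

def second_letters(text):
    """Return the second letter of every word, lowercased and concatenated."""
    letters = []
    for word in text.split():
        core = word.strip(string.punctuation)  # drop leading/trailing punctuation
        if len(core) >= 2:
            letters.append(core[1].lower())
    return "".join(letters)

# Assumption: "by-products" is written with a hyphen so it is one word.
COVER = ("Apparently neutral's protest is thoroughly discounted and ignored. "
         "Isman hard hit. Blockade issue affects pretext for embargo on "
         "by-products, ejecting suets and vegetable oils.")

print(second_letters(COVER))
```

The result is the run of letters "pershingsailsfromnyjunei". Inserting word breaks and reading the final "i" as the digit 1 gives the message.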
Additional examples of steganography may be found in sources such as Neil F. Johnson, Zoran Duric, and Sushil Jajodia, "Information Hiding: Steganography and Watermarking - Attacks and Countermeasures." Another frequently quoted source is Warren Zevon, Lawyers, Guns, and Money, a music track released on the albums Excitable Boy, 1978; Stand in the Fire, 1981; A Quiet Normal Life, 1986; and Learning to Flinch, 1993.
Word processors are an abundant source of new methods for applying steganography. Word processors normally allow multiple fonts, including fixed-space (monospaced) and proportional fonts. In a fixed-space font, all characters occupy the same width. In a proportional font, the space occupied by each character is reduced to just what is needed. Proportional fonts tend to be more pleasing to the eye. As an example, compare the word "illegal" represented as follows:
| illegal | In fixed-space font Courier |
| illegal | In proportional font Times New Roman |
The difference in spacing between fonts suggests interesting methods for hiding text. Consider the following paragraph, shown first in 14 point Courier followed by 14 point Times New Roman.
"A study of religion must
include the use of the shrines important to the religious practice. One should also consider how money is collected to support the religion. Every drop of knowledge must be scrutinized." (Courier 14 point)
"A study of religion must include the use of the shrines important to the religious practice. One should
also consider how money is collected to support the religion.
Every drop of knowledge must be scrutinized." (Arial 14 point)
Assuming your browser supports the indicated fonts, the first copy of the above message, in Courier, should occupy considerably more space because Courier is a monospaced font. The second copy of the message appears in Times New Roman, which is a proportionally spaced font. A close examination of the above message will show that an extra space has been placed before and after some phrases, as highlighted below.
"A study of religion must
include the use of the shrines important to the religious practice. One should also consider how money is collected to support the religion.
Every drop of knowledge must be scrutinized." (Arial 16 point, a proportional font)
The hidden message is "use the shrines money drop." In a larger selection of text the extra spaces are hard to detect even if fixed spacing is used. If an appropriate proportionally spaced font is selected, the message is especially hard to detect because the width of each "space" is squeezed to improve readability. If the message is in electronic format, copying it into a word processor, switching to a fixed-space font, and turning on the option that displays spaces and special characters greatly enhances the ability to detect and read the message. This technique would be termed a variation of word or phrase shifting.
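When the text is available electronically, the extra spaces can also be found by a program. The sketch below assumes every hidden phrase has exactly one run of two or more spaces before it and after it, and that no other double spaces occur. The positions of the extra spaces in MARKED are an assumption made for illustration, since the original spacing is not recoverable from the page text.

```python
import re

def phrases_between_extra_spaces(text):
    """Split on runs of 2+ spaces; the odd-numbered pieces are the hidden phrases."""
    segments = re.split(r" {2,}", text)
    return segments[1::2]

# Assumed placement of the extra spaces (two spaces around each hidden phrase).
MARKED = ("A study of religion must include the  use  of  the shrines  important "
          "to the religious practice. One should also consider how  money  is "
          "collected to support the religion. Every  drop  of knowledge must be "
          "scrutinized.")

print(" ".join(phrases_between_extra_spaces(MARKED)))
```

With the spacing assumed here the program recovers "use the shrines money drop". A single stray double space in the cover text would shift every later phrase, so a practical scheme would need a more robust marker.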
Another idea for using a word processor to hide text messages follows. Using a routine transmission, scan left to right, selecting the letters of the hidden message one at a time. When a letter (digit, etc.) is selected, change its font to a closely related font that cannot be distinguished by eye (there will usually be several choices). To recover the message, write a program that filters word processing documents, assembling the message from the characters not in the base font. Alternatively, move the cursor across the text in a word processor and watch for changes in the font name on the menu bar.
"The study of religion and love is complex. You must never underestimate the power of your religious beliefs."
(container in Times New Roman 16 point)
The message hidden above in the 16 point Times New Roman text is "love your spouse" as follows:
Key:
religion - "l" in Arial
religion - "o" in Arial
love - "v" in Arial 14 point
You - "You" in Arial 14 point
never - "r" in Arial 14 point
underestimate - "s" in Arial 14 point
power - "po" in Arial 14 point
your - "u" in CG Arial 14 point
religious - "s" in Arial 14 point
beliefs - "e" (first) in Arial 14 point
"The study of religion and love
is complex. You
must never
underestimate the power of your
religious beliefs."
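The filtering program mentioned above might look like the following sketch. It does not read any real word processor format. It assumes the document has already been broken into (text, font name) runs, a hypothetical data model chosen for illustration, and the short example hides only the two letters "hi".

```python
BASE_FONT = "Times New Roman"  # assumed base font of the container text

def hidden_from_runs(runs, base_font=BASE_FONT):
    """Concatenate the text of every run whose font differs from the base font."""
    return "".join(text for text, font in runs if font != base_font)

# Hypothetical runs: the container reads "The study of religion".
runs = [
    ("T", "Times New Roman"),
    ("h", "Arial"),                      # hidden letter
    ("e study of rel", "Times New Roman"),
    ("i", "Arial"),                      # hidden letter
    ("gion", "Times New Roman"),
]

print("".join(text for text, _ in runs))  # what the casual reader sees
print(hidden_from_runs(runs))             # what the filter recovers
```

A real implementation would need a parser for the specific document format to obtain the runs. Only the filtering step is shown here.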
Null ciphers depend on the message being sufficiently camouflaged that it is not noticed if intercepted. Indeed, the normal assumption is that the communication will be intercepted by censors but the hidden message will not be noticed. It may be desirable to encrypt the message first. Depending on the encryption scheme, the message may become harder or easier to hide.
Available Techniques:
One appealing aspect of steganography is that the number of available techniques is limited only by the imagination of the users. Cryptography has rules. Steganography depends only on the cleverness of the users. Examples include:
1) Disappearing Ink: In World War II, messages were included in normal correspondence by writing the hidden message between printed lines of text using milk, vinegar, fruit juices, and urine. These inks have the advantage of being readily available in the field and colorless when dry, yet they reappear when subjected to heat. More sophisticated inks require more complex chemistry akin to developing photographs.
2) Music: As an example, a "yes," "no," or signal to start might be communicated by the way a popular instrumental or song is initiated or terminated. An innocent phrase at the start or end of any communication, such as a radio broadcast, might be used to communicate information. Music is also an excellent vehicle for hiding complex drawings using the techniques described below (8) for using picture files as containers for other files.
3) Microdots: Microdots were developed by the Germans in World War II. The Allies first discovered their use masquerading as a period on a typed envelope carried by a German spy in 1941. The original microdots were photographs about the size of a typed period and carried about a page of information. In addition to textual information, they could accommodate technical drawings and photographs. J. Edgar Hoover, Director of the FBI, has been quoted as referring to microdots as "the enemy's masterpiece of espionage." Microdots are small enough that the information they contain can be transmitted without encryption or other means of concealment. The idea is that they are so small they will not be noticed when included as part of normal communications.
4) Tattoos: Tattoos have been employed for communications, either on body parts not normally seen or on parts that can be disguised, e.g., with makeup. History records instances of shaving a servant's or slave's head and applying a tattoo. After the hair had grown back, the servant or slave became a living message carrier. The hair was shaved at the destination to retrieve the message.
5) One of the
earliest recorded uses of steganography is Demeratus notifying
6) Use of more sophisticated techniques such as phase shift in radar, sound, light, or electrical waves. While physically different from the methods described below to hide graphics in graphics (8), the principles are the same.
7) Drawings: Information has been concealed in drawings. Information carriers may be original engineering drawings, original art, or modified copies of popular art or engineering drawings. Lines may be thickened, shortened, or have their colors modified slightly. Sometimes the actual letters of the message have been integrated into the design.
8) Computer media including graphics, sound, and apparently blank media. For graphics (pictures), we normally take advantage of the fact that pictures are stored as an array of pixels. Each pixel has an RGB value (red, green, blue) stored as three consecutive 8 bit numbers (frequently referred to as bytes). The three shades of primary color are seen by the eye as a single blended color. At 8 bits per color, there are 2**8 = 256 shades of each of the three primary colors. The number 0 represents the absence of that shade (no color) and 255 is the maximum intensity for the shade. Hence the most intense red pixel would be indicated by the RGB value (255, 0, 0), green by (0, 255, 0), and blue by (0, 0, 255). Pink (a shade of pink) is represented by an RGB value of (255, 175, 175). For a more detailed discussion of image formats, saving space by reducing color depth, and examples of manipulating individual primary color (RGB) values in individual pixels, see DigitalImages.html.
Most eyes probably cannot differentiate between 256
shades (no color to maximum color) of the same color. The maximum value for red would be the bit
pattern "11111111" base 2 or "255" base 10. If we change the low order bit to zero, the result is "11111110" base 2
or "254" base 10. Your eye will not be
able to detect this difference. The same
would be true for the absence of the color red "00000000" base 2 or "0" base
10. If we replace the low order bit with
a one, the resulting shade is "00000001" which will not be detected by the
eye. Remember, this byte is combined
with two more primary colors and the eye sees the mixture of red, green and
blue. It is even more difficult to
detect shading variations using the combination of RGB primary colors when a
pixel is surrounded by other pixels as in a graphic (picture).
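The bit manipulation just described can be written out directly. The sketch below is illustrative only: the function names are hypothetical, and the carrier is a plain list of 8 bit color values rather than a real image file. It replaces the low order bit of a color value and then hides one character in the low order bits of eight consecutive values.

```python
def set_low_bit(value, bit):
    """Return the 8 bit value with its low order bit replaced by bit (0 or 1)."""
    return (value & 0b11111110) | (bit & 1)

def get_low_bit(value):
    """Return the low order bit of an 8 bit value."""
    return value & 1

def hide_byte(carrier, byte):
    """Hide one byte in the low bits of the first 8 carrier values, high bit first."""
    out = list(carrier)
    for i in range(8):
        out[i] = set_low_bit(out[i], (byte >> (7 - i)) & 1)
    return out

def reveal_byte(carrier):
    """Rebuild the byte from the low bits of the first 8 carrier values."""
    value = 0
    for v in carrier[:8]:
        value = (value << 1) | get_low_bit(v)
    return value

print(format(set_low_bit(255, 0), "08b"))  # 11111110, i.e. 254
print(format(set_low_bit(0, 1), "08b"))    # 00000001, i.e. 1

# Three pixels (red, green, blue) give nine color values; eight are used.
pixels = [255, 0, 0, 0, 255, 0, 0, 0, 255]
stego = hide_byte(pixels, ord("A"))
print(chr(reveal_byte(stego)))             # A
```

No color value moves by more than 1 out of 255, which is the basis for the claim that the change is not visible.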
A typical picture today consists of 1024 by 768 pixels or higher. If the low order bit of one color is used on each pixel, then a total of 1024 x 768 x 1 = 786,432 bits of information could be hidden. If we allow 8 bits per character of information content, then the picture could hold 786,432 / 8 = 98,304 characters of information. If we use the low order bit of each color of each pixel, the total number of bits available becomes 1024 x 768 x 3 = 2,359,296 bits or 2,359,296 / 8 = 294,912 characters of hidden text, graphics, or sound. Yes, the color of the picture has been slightly modified, but at a level undetectable by the human eye.
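The capacity arithmetic is easy to reproduce for other picture sizes. This is a small sketch with hypothetical function names. It assumes one hidden bit per color value used and 8 bits per character.

```python
def capacity_bits(width, height, colors_used=1):
    """Hidden bits available when one low order bit is used per color value."""
    return width * height * colors_used

def capacity_chars(width, height, colors_used=1, bits_per_char=8):
    """Whole characters that fit in the available hidden bits."""
    return capacity_bits(width, height, colors_used) // bits_per_char

print(capacity_bits(1024, 768, 1), capacity_chars(1024, 768, 1))  # one color per pixel
print(capacity_bits(1024, 768, 3), capacity_chars(1024, 768, 3))  # all three colors
```

For a 1024 by 768 picture the three color case works out to 2,359,296 bits, or 294,912 characters.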
It is important to note that pictures in excess of 1024 x 768 have become common. These resolutions allow for storing smaller pictures (the message) within the carrier (container) picture. The same technique can be used to store information in sound and other computer storage formats that employ an encoding scheme allowing for variations (shades) in the data. When storing images, the image is frequently approximated using a gray scale (for drawings) or a 256 color palette (for paintings). A 256 color palette requires only 8 bits per pixel rather than 24 bits, reducing the pixel data to one third of its original size (a 3:1 compression ratio). The compressed 256 color versions are frequently sufficiently close to the original to satisfy the customer. As an example, most paintings by famous artists were originally distributed in electronic format using 256 color palettes via the web and in multimedia applications. Not only did this format save space and transmission time, but the consumer either could not tell the difference between the copy and the original or did not care about the slight differences. Pictures, video, and sound make excellent containers in which to hide other information including smaller pictures, video, and text.
Note that a graphic, video, or sound file might be compressed prior to hiding. The compression offers a type of encryption. For even better protection, you might first compress and then encrypt the object to be hidden. This saves space, potentially makes the message harder to detect, and helps to ensure privacy if it is detected.
9) Compression is frequently used to reduce the size of a file or message for transmission or long term storage. There are basically two kinds of compression: lossless and lossy. Lossless implies the original can be reproduced (reconstructed) exactly (GIF, BMP). Lossy compression relies on the fact that the user will not be able to tell the picture or sound is different if the changes are small. Lossy compression is detrimental to steganography because the lost bits are frequently the bits used to store the hidden message (low order bits). GIF and BMP image storage formats frequently allow for compression while assuring complete integrity. JPEG and many other image storage formats are normally associated with loss of integrity during compression.
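The effect of the two kinds of compression on hidden low order bits can be illustrated with a few bytes. In this sketch zlib from the Python standard library stands in for lossless compression. The function lossy_quantize is a deliberately crude, hypothetical stand-in for lossy compression that simply discards the lowest bits; it is not how JPEG actually works, but it shows why low order bit payloads do not survive.

```python
import zlib

def low_bits(values):
    """Return the low order bit of each value."""
    return [v & 1 for v in values]

def lossy_quantize(values, dropped_bits=2):
    """Crude stand-in for lossy compression: zero the lowest dropped_bits bits."""
    mask = 0xFF & ~((1 << dropped_bits) - 1)
    return [v & mask for v in values]

# Carrier bytes whose low order bits carry the hidden pattern 0,1,0,1,0,1,1,0.
carrier = bytes([200, 201, 54, 55, 128, 129, 17, 16])

lossless = zlib.decompress(zlib.compress(carrier))
lossy = lossy_quantize(carrier)

print(low_bits(carrier))    # the hidden bits
print(low_bits(lossless))   # identical after a lossless round trip
print(low_bits(lossy))      # wiped out by the lossy step
```

After the lossless round trip every byte, and therefore every hidden bit, is unchanged. After the quantizing step all low order bits are zero and the message is gone.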
A good comparison of steganography software may be
found at http://www.jjtc.com/Security/stegtools.htm. Another excellent source of information may
be found at http://www.petitcolas.net/fabien/steganography/. A particularly easy tool to install and use
on Windows systems for GIF and BMP graphics is "S-Tools 4.0" at ftp://ftp.funet.fi/pub/crypt/mirrors/idea.sec.dsi.unimi.it/code/
YodaSmall.BMP (960 x 600) image006.GIF (213 x 200)
In the following example, S-Tools was started first, followed by Windows Explorer. The file YodaSmall was then selected in Explorer and dropped into an empty spot on the S-Tools window.

You hide a graphic by first selecting it in Explorer, then dragging and dropping it on top of a graphic already in the S-Tools window. In this case I dropped image006.GIF on top of YodaSmall.BMP. When the graphic is dropped, S-Tools responds by prompting for the pass phrase and the selection of an encryption algorithm.

The graphic on the right contains the image006. Note there is no perceptible difference even
when comparing the images side by side.

To retrieve a hidden graphic first right click the
image suspected of containing the hidden image. S-Tools then prompts for the pass phrase and encryption algorithm.

If a hidden graphic is found, S-Tools then pops up a
window showing the original file name and size. Right click the file name and select the "Save as" option to complete
the retrieval process and save the results in a file.

Computer monitors represent graphics (pictures) as a series of dots referred to as pixels. An SVGA screen is typically at least 1024 by 768 pixels. Each pixel is a mixture of three primary colors: red, green, and blue (RGB value).
Each color is represented as an 8 bit binary number. Hence there are 2**8 = 256 intensities for each of the three primary component colors. Each RGB component's intensity is represented by an integer in the range 0 through 255 (none to maximum intensity). RGB values (red, green, blue) are typically stored serially as three consecutive 8 bit binary numbers (frequently referred to as bytes) in graphic files. An 8 bit binary opacity value may be stored with each pixel for a total of 32 bits. The three shades of primary color are seen by the eye as a single blended color. The total number of available colors is 256 * 256 * 256 = 16,777,216 per pixel. The human eye cannot really differentiate between that many shades. The general format of a graphics file is:
| File Name | pixel | | | pixel | | | ... | pixel | | | |
| Steg1 | R | G | B | R | G | B | ... | R | G | B | |
| Header | byte | byte | byte | byte | byte | byte | ... | byte | byte | byte | eof |
For convenience let us assume a file where each pixel is represented by an RGB value (red, green, and blue). The size of a typical graphics file is the size of the picture in pixels times at least 24 bits per pixel. Hence a 1024 by 768 pixel graphic would occupy a file of at least (1024 * 768) pixels * 3 bytes/pixel = 2,359,296 bytes or 2,359,296 bytes * 8 bits/byte = 18,874,368 bits. Each graphic is stored on the disk with a header giving general information about the file. Typical header information includes the file type (GIF, BMP, PNG, JPG), creation date, size, and related information. The header is followed by the pixel codes.
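The layout above (a header followed by serial R, G, B bytes) can be expressed as two small calculations. This is a sketch of that simplified layout only. Real formats differ in row order, padding, and compression, and the 54 byte header length used in the example is a hypothetical value chosen for illustration.

```python
def pixel_data_size(width, height, bytes_per_pixel=3):
    """Return (bytes, bits) needed for uncompressed pixel data."""
    n_bytes = width * height * bytes_per_pixel
    return n_bytes, n_bytes * 8

def pixel_offset(x, y, width, header_len, bytes_per_pixel=3):
    """Byte offset of the red value of pixel (x, y), rows stored top to bottom."""
    return header_len + (y * width + x) * bytes_per_pixel

print(pixel_data_size(1024, 768))        # 24 bit RGB
print(pixel_data_size(1024, 768, 4))     # RGB plus an 8 bit opacity value
print(pixel_offset(0, 1, 1024, 54))      # first pixel of the second row
```

For a 1024 by 768 picture at 3 bytes per pixel the pixel data alone occupies 2,359,296 bytes, or 18,874,368 bits.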
The pixels are highly susceptible to manipulation without noticeable change in a picture's content or quality. Modify the low order bits of each RGB value in the following graphic by increasing or decreasing the current setting by a value in the range 1 to 6. There is a good chance your eye will not be able to discern the change. Now imagine modifying a single pixel where not all pixels are the same color. It is very difficult to tell the picture has been modified. The bits you have modified can be used to hide a second graphic within the first graphic. The more bits you modify per pixel, the greater the amount of information that can be hidden within the carrier graphic. The more information you hide per fixed space, the greater the chance of detection.
compare to
To demonstrate the difficulty of detection, please note a 4 pixel square has been printed to the left of each graphic in an opposing color. A one pixel graphic is located on a horizontal line and approximately equal distance within the solid color.
Hiding Strategies:
1) Cipher selection: null or encrypted
2) For RGB images:
a) Use only the low order bit of one RGB value.
b) Use two or all three low order bits of each RGB value.
c) Encode the second graphic in the first but do not use every pixel.
d) Insert the graphic by selecting pixels according to a mathematical formula or pseudo random number generator.
3) Selection Rule: There are no rules. Take advantage of the target.
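Strategy 2(d), selecting pixels with a pseudo random number generator, might be sketched as follows. The sender and receiver are assumed to share a numeric key and to agree on the message length in advance. Python's random module is used only for illustration; it is not cryptographically secure, and all names here are hypothetical.

```python
import random

def pick_positions(n_values, n_bits, key):
    """Choose n_bits distinct carrier positions from a PRNG seeded with key."""
    rng = random.Random(key)
    return rng.sample(range(n_values), n_bits)

def embed(values, message, key):
    """Write the message bits into the low bits of key-selected color values."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(values):
        raise ValueError("carrier is too small for the message")
    out = list(values)
    for pos, bit in zip(pick_positions(len(values), len(bits), key), bits):
        out[pos] = (out[pos] & 0xFE) | bit
    return out

def extract(values, n_bytes, key):
    """Read n_bytes back from the same key-selected positions."""
    positions = pick_positions(len(values), n_bytes * 8, key)
    bits = [values[pos] & 1 for pos in positions]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )

carrier = [128] * 200                 # stand-in for 200 color values
stego = embed(carrier, b"hi", key=2005)
print(extract(stego, 2, key=2005))    # b'hi'
```

Because the positions are scattered, someone who examines the low order bits in file order sees no contiguous message, and without the key the positions cannot easily be reproduced.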
To get a better idea of your eye's ability to differentiate colors consider the following two images of the same graphic.

Both graphics were created from the same image scanned at 16 million colors. The image on the left was saved using a palette of 256 colors. The image on the right was saved using a palette of 32,766 colors. If you are having trouble seeing the difference, focus on the petal in the lower right hand corner of each graphic.
A general purpose tool to place a Steganography watermark in "jpeg" graphics as they are served over the web is available at http://www.proxymark.org. The software was written by David Collins at the "Center for Excellence in Digital Forensics" located at Sam Houston State University Huntsville, Texas. The software runs in conjunction with your proxy server. It intercepts all jpeg (jpg) graphics as they are acquired by the web server for transmission and inserts a digital watermark. OutGuess by Niels Provos at http://www.outguess.org may be used to steg individual "jpeg" graphics.
Digital sound or light files can also be used to hide information. Typically sound is analog, a continuous wave form. Information is stored in the wave form using frequency or amplitude modulation. To digitize the sound, it is sampled as shown and a number is assigned to represent the magnitude (amplitude) of the wave. Each sampling point represents one digital datum. The more frequently the wave form is sampled, the more accurate the digital representation. The quality of the digital representation may also be improved by increasing the range of the number used to measure the amplitude. The trade off is increased storage against the inability of the observer to distinguish the difference in quality between the current sampling rate and a lower sampling rate. The higher the sampling rate, the greater the amount of steganography material that can be stored in a fixed period of time. The following wave form could be sound or light. If light, the wave form may be in the visible spectrum or invisible spectrum, e.g., infrared. Sound may also be within or outside the range detected by the human ear.
The strategy for hiding material in digital representations of light or sound is the same as that for pixels in the computer graphics representation. Low order bits representing the wave form may be manipulated without discernible impact to the person observing the medium.
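The same low order bit idea can be tried on digitized sound. The sketch below synthesizes a short sine wave as signed integer samples (the 440 Hz tone, 8000 samples per second rate, and amplitude are arbitrary assumptions), hides a few bits in the low order bit of each sample, and reads them back. The function names are hypothetical.

```python
import math

def sample_sine(freq_hz, rate_hz, n_samples, amplitude=30000):
    """Sample a sine wave and round each point to a signed integer."""
    return [round(amplitude * math.sin(2 * math.pi * freq_hz * n / rate_hz))
            for n in range(n_samples)]

def hide_bits(samples, bits):
    """Replace the low order bit of the first len(bits) samples."""
    out = list(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out

def read_bits(samples, n_bits):
    """Read the low order bit of the first n_bits samples."""
    return [s & 1 for s in samples[:n_bits]]

samples = sample_sine(440, 8000, 16)
secret = [1, 0, 1, 1, 0, 0, 1, 0]
stego = hide_bits(samples, secret)

print(read_bits(stego, len(secret)) == secret)
print(max(abs(a - b) for a, b in zip(samples, stego)))  # never more than 1
```

At one hidden bit per sample, capacity scales directly with the sampling rate: 8000 samples per second would carry 8000 bits, or 1000 bytes, per second of sound.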
The examples all assume a digital format. Analog carrier waves such as electricity, sound, and light may be used to hide materials using analog techniques as well. Remember the primary rule is "There are no rules!"
After World War II, steganography techniques received little attention compared to cryptography. Two circumstances have arisen to increase interest in steganography.
1) First, governments have attempted to restrict the availability of encryption for secure communications. Generally governments desire for individuals to be able to protect themselves from other individuals but not from the government itself.
2) Secondly, interest has renewed due to the desire to protect copyright in audio, video, books, software, and other works available in digital format. Illicit copies of originals in digital format can frequently be made with ease. Copyright information would first be inserted (hidden) in the digital format prior to publication in a public forum. Illegal copies of the digital media could then be identified, as they would contain the hidden copyright, which could be extracted by authorities. A down side is that once the method for inserting the message becomes known, the message could be removed from other illicit copies, making it impossible to prove they were obtained illegally. Techniques to mark digital information to protect copyright are frequently referred to as digital copyright or digital watermarks.
One of the first major academic conferences on
Steganography topics was the International Workshop on
Information Hiding, held in Cambridge, UK, in May/June 1996 (http://www.springeronline.com/sgw/cda/frontpage/0,11855,5-0-22-1498671-0,00.html?referer=www.springer.de%2Fcgi-bin%2Fsearch_book.pl%3Fisbn%3D3-540-61996-8)
organized within the research program in computer
security, cryptology and coding theory organized by the volume editor at the
Isaac Newton Institute in Cambridge. The
Fifth International Workshop on Information Hiding held in Noordwijkerhout,
October 2002 (http://research.microsoft.com/ih2002/)
contains material of interest to both research and industry. Additional conferences and related
information may be located using Google with the phrase "International Workshop
on Information Hiding" or "steganography."
Motivations:
1)
Steganography will continue to be a topic of interest
both to provide
digital watermarks and to communicate information in hostile environments.
2)
Steganography may be especially useful to individuals
attempting to hide material from their employers and governments as
it does not attract attention. Even when
the employer or government is suspicious, they may not be able to detect what
is happening right under their noses.
Creating Steganography Opportunities:
As described in DigitalImages.html, graphic files may be represented by a palette customized for each picture followed by a vector with one entry per pixel indexing the palette. Assume a palette with 32,766 colors. The index into the palette is stored in 16 bits per pixel (2**16 = 65,536 possible values), of which only the values 0 thru 32,765 are used. Now reduce the number of colors utilized to represent the picture from 32,766 to 256. Take another look at the images in DigitalImages.html. This can frequently be done with little or no degradation visible to the human eye. The palette for 256 colors, however, only requires 8 bits per pixel to index the palette, i.e., 2**8 = 256 with values from 0 thru 255. Rather than saving space by utilizing 8 bits per pixel, continue to store the graphic as 16 bits per pixel. Both the extra palette entries and the extra bits per pixel are now available to store information. Essentially more than half the space in the palette and half the space in each pixel is now available to store steganography information. Another way to look at it is that half the space utilized to store a 1024 by 768 graphic is now actually available for hidden information.
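The spare half of each 16 bit entry can be pictured as follows. This sketch keeps the 8 bit color index in the low byte and places one hidden byte in the high byte. It assumes, and this is a significant assumption, that the program displaying the image looks only at the low byte; a real viewer may treat the whole 16 bit value as the index and show the wrong color. The function names are hypothetical.

```python
def pack(index, hidden_byte):
    """Combine an 8 bit palette index (low byte) with a hidden byte (high byte)."""
    if not 0 <= index <= 255 or not 0 <= hidden_byte <= 255:
        raise ValueError("both values must fit in 8 bits")
    return (hidden_byte << 8) | index

def unpack(word):
    """Split a 16 bit entry back into (palette index, hidden byte)."""
    return word & 0xFF, word >> 8

# Three pixels with palette indexes 200, 17, and 3 carry the text "Hi!".
indexes = [200, 17, 3]
words = [pack(i, b) for i, b in zip(indexes, b"Hi!")]

print([unpack(w)[0] for w in words])          # the visible indexes are unchanged
print(bytes(unpack(w)[1] for w in words))     # b'Hi!'
```

One hidden byte per pixel means a 1024 by 768 graphic stored this way could carry 786,432 bytes, the same size as the visible index data.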
This same approach may be used with sound or almost any other medium that has been digitized. Specifically there are two approaches.
1) Using a graphic for illustration, the first approach is to increase not the resolution but only the number of bits used to represent the medium. There is no loss of quality and the extra bits are all available to store additional information. The typical example would be to go from 16 bit color to 24 bit color. No matter how many bits you add to the resolution representation, you cannot increase the actual resolution of the source.
Example: You are viewing a picture with a graphics editor which states the picture contains 32,766 or fewer colors using 24 bits per pixel. It only requires 16 bits per pixel to represent the picture, since 32,766 is less than 2**16 = 65,536. You have got to wonder why there are an extra 8 bits per pixel!
2) The second approach is to reduce the number of bits to represent the medium, e.g., from 32,766 colors to 256 colors (16 bit color to 8 bit color). There is a loss of quality but the user may not be able to perceive the loss. Again, continue to store the medium at the higher resolution. The additional bits (50% in the example) can be used to store hidden information. Changing the sampling rate for music can greatly increase the available storage to hide information.
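The suspicion raised in the example above can be turned into a quick check: count the distinct colors actually present and compare the bits needed to index them with the bits stored per pixel. This is an illustrative sketch with hypothetical names. It works on a plain list of pixel values rather than a real image file, and a large surplus is only a hint, not proof, of hidden content.

```python
def bits_needed(n_colors):
    """Smallest number of bits that can index n_colors distinct colors."""
    return max(1, (n_colors - 1).bit_length())

def spare_bits(pixels, stored_bits_per_pixel):
    """Bits per pixel beyond what the distinct colors present require."""
    return stored_bits_per_pixel - bits_needed(len(set(pixels)))

print(bits_needed(256))      # 8
print(bits_needed(32766))    # 15, usually rounded up to 16 for byte alignment

# Four 24 bit pixels that use only three distinct colors.
pixels = [(255, 0, 0), (0, 255, 0), (255, 0, 0), (0, 0, 255)]
print(spare_bits(pixels, 24))
```

Strictly, 32,766 colors fit in 15 bits; the 16 bit figure used in the text reflects storage in whole bytes.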
In any case, the number of recording standards for pictures, video, and sound is increasing. Even within a specific recording standard such as "gif" and "jpg" multiple formats exist. Each recording standard and variation in formats supported by the standard provide additional opportunities to employ steganography.
References: