pierce@radius.com (Pierce T. Wetter III) (11/15/90)
Where to I get JPEG info? To get JPEG spec send $8.00 to: Attn: JPEG Draft Technical Specification X3 Secretartiat: Computer and Business Equipment Manufactureers Association 311 First Street NW, Suite 500 Washington, DC 20001-2178 Say somewhere in your letter that you want the document for review and comment. What does JPEG mean? Joint Photographic Experts Group What is the JPEG compression standard? The JPEG compression standard is a method of compressing images at very high compression ratios. Because you can't get something for nothing, the JPEG compression method lose information. How does it work? Background: According to information theory, the entropy of a message is a function of the symbols that make up that message, and the relative probablity of each symbol in that message occuring. Imagine a message made up of 52 different symbols corresponding to the alphabet in both upper and lower case. Suppose we want to send a message to someone. The first thing we do is find the probability of each letter occuring. As "Wheel of Fortune" fans know, not all letters are equally likely. The vowels, and certain consonants are much more likely. Capital letters are less likely still, but an upper case 'S' is more common than a lower case 'z'. If we count all the letters in the message, and put them in an array named 'counts' we can use the following C code to compute the entropy: double BigH=0; long N= Number of symbols in message; for(i='a';i<='z';i++) H -= counts[i]/N * log2(counts[i]/N); That is, the entropy equals the sum of the negative logorithm (base 2) of the probability of a symbol times the probability of that symbol. H is the entropy and equals the number of minimum # of bits/symbol needed that message. The efficiency of a coding method is equal to the entropy over the average number of bits used to transmit that message. Lets get back to our example message. The brute force method of transmission would require 6 bits/symbol. 
Each symbol would have its very own bit pattern, with 12 patterns left over. This would be wasteful, even though we could transmit 64 different symbols. Since lower case is more common than upper case, it would be better to use 5 bits. 26 of the 32 entries determine which lower case letter to use; the remaining 6 are divided equally among the upper case letters. Immediately following one of these 6 patterns is a 3 bit code indicating which upper case letter is being used. The transmission of a lower case letter would require 5 bits; the transmission of an upper case letter, 8 bits. Since lower case letters are many times more common than upper case letters, the single bit that we save EACH TIME we send a lower case letter more than makes up for the fact that we have to send 8 bits for an upper case letter.

As a further optimization, we could decide to send only one bit for the letter 'e' and 6 bits for all the other letters. (One bit to indicate 'NOT e' and 5 bits to indicate which of the other symbols is correct.) Compared to the plain 5-bit code, we save 4 bits every time we send an 'e' and waste a bit every time we send any other letter. If 'e' makes up more than a fifth of the letters in our message, we will save bits.

The end result of this is Huffman coding. Each character is counted, and a variable number of bits is assigned to its representation. Huffman, Lempel-Ziv, etc. all work more or less like this.

Gee, someone says, what if I want to send a message like 'aeaeaeaeaeaeae'? That would compress to one bit a letter, '01010101010101', and is 100% efficient. But if I take the message two letters at a time, I could send it as '1111111' (1='ae'), so I can get 200% efficiency. Nope. If you look at the message two letters at a time, you are changing your symbols. You now have (52x52=2704) different symbols. In the message above you only have one possible symbol, 'ae', so the entropy of the message is zero, and since you still have to send at least one bit per symbol, you can never be 100% efficient.
If you look at your message in different sizes, the entropies will be different and you may get better compression. For example, one character at a time, the message 'abaeabaeab' will compress to '010011010011010' (0='a', 10='b', 11='e'). The entropy of that message is (-0.5*log2(0.5) - 0.3*log2(0.3) - 0.2*log2(0.2)) = 1.485 bits/symbol. The efficiency of the encoded message is E = 1.485/1.5 = 99%. On the other hand, at two characters per symbol the entropy drops to about 1 bit per symbol, and the entire message can be sent in only 5 bits. That is the difference between 8-bit, 16-bit, and 32-bit encoding: the choice of symbols. Note that the above efficiency assumes that the translation table between the encoded and decoded forms is sent separately or built into the decoding method; otherwise it will have to be sent along with the encoded message, which will reduce the actual efficiency.

Lossy compression methods:

The entropy of a message tells us the absolute best we can do for a message. To do any better than that, we will have to throw away information. For images we can rely on certain psycho-physical aspects of the way our eyes see things. The JPEG process is two-fold. The first pass is to break up the image into planes. Generally, it is a good idea to separate the brightness information from the color information by converting the image to YUV or Lab, but this is not required. The next step is to break each plane into 8x8 or 16x16 chunks. These chunks are converted from position and value information into frequency information. For example, an alternating black and white pattern would be recorded as a 50% duty cycle rather than as 10101010101... . The overall color of the chunk is recorded, then the relative values of the frequency components. Then, instead of the frequency components having continuous values, they are mapped to the closest of a few available values. For example, 5.4 would map to 5, and 5.6 would map to 6.
Information is lost because 5.4-5 is .4 and 5.6-6 is -.4. The image can no longer have a pixel 'wave' (we're in frequency now) with the value 5.4 or 5.6, so information is lost. When the picture is regenerated, noise approximately .4 in magnitude will be present in the picture. Because the eye is relatively forgiving of high frequency noise, the encoded image may not look any different to the eye. Additionally, other components in the reproduction cycle (printing presses, NTSC broadcast, video tape dropouts, etc.) may have much larger errors.

Won't JPEG compression make TV images look terrible?

The broadcast standard for NTSC only has 1/2 the bandwidth for the color I & Q channels that it does for the luminance Y channel. Therefore there is only enough color information for every other pixel. When JPEG loses information in frequencies above the 50% duty cycle, the resulting image loss will not be detectable by the end receiver. Additionally, the I & Q channels are only allowed to be within a certain range of values when broadcast. Any loss of signal beyond that range due to JPEG is no different from that caused by the transmission stage anyway.

Won't JPEG compression make my photographs print wrong?

Printing a color picture on a four color press requires the arrangement of a multitude of different size dots. The arrangement of these dots will, in general, introduce more noise than compression by JPEG will produce. Also, there are generally a limited number of configurations of those dots, so any rounding errors caused by JPEG may be masked by even bigger errors in the printing process.

What's the most important thing to remember?

Generally, only you know what the image looked like originally, so only you will know if it has changed. If you can't tell it has been compressed with both images next to each other, someone who has never seen either of them before won't be able to tell either.
Pierce -- My postings are my opinions, and my opinions are my own, not those of my employer. You can get me at radius!pierce@apple.com. (Wha'ja want? Some cute signature file? Hah! I have real work to do.)