[alt.sources.d] Compressing pictures

sean@ms.uky.edu (Sean Casey) (08/30/90)

Doesn't GIF allow a "plug-in" compressor? Surely it doesn't _require_ one
to use LZW... does it?

If not, then perhaps GIF and the various archivers need a ``standard''
(haha) interface to compression schemes. Some of the MSDOS terminal
programs seem to provide a standard interface to file transfer protocols.

Sean
-- 
***  Sean Casey          sean@ms.uky.edu, sean@ukma.bitnet, ukma!sean
***  rec.pyrotechnics: "Blow up or shut up."

wiml@milton.u.washington.edu (William Lewis) (08/30/90)

In article <sean.651965645@s.ms.uky.edu> sean@ms.uky.edu (Sean Casey) writes:
>Doesn't GIF allow a "plug-in" compressor? Surely it doesn't _require_ one
>to use LZW... does it?

   Yes, it *does* require you to use LZW, with no provision that I know
of for other compressions, at least in the GIF87a spec. (GIF89a, or
whatever the just-released revision is called, may be different, but I
doubt it; and besides, if we don't support the old GIF decoders we may
as well go with a completely new format.) This is one of my gripes with
the GIF format, and one reason I liked the .PKX format better (it
supported pixel aspect ratios too =8) ), except that it died an early
and local death...

   Anyway ... I'd like to add my own thoughts to this discussion. I've
been thinking about something like this for some time, though I haven't
done any actual experimentation yet. A couple of points however ... (Keep
in mind that everything in this post should be prefaced with "I think", 
"IMHO", or "What is everyone else's opinion on"; but as a rudimentary
form of article compression I have collapsed all of those notices into
this one.)

   * Taking differences is probably much, much better than XORing,
since the point is to take advantage of the vertical continuity of
the image. The problem is that it doubles the number of values the
compressor has to deal with -- the differences range from (min-max)
to (max-min), which is 2*(max-min)+1 values. The obvious solution is
to wrap around, i.e., use modulo-(max-min+1) arithmetic, except that
this overlays some of the least frequently used values (the extremes)
on the most frequently used ones (small values), making the
Huffman/arithmetic coder less efficient. I don't know whether this
outweighs the advantage gained from the smaller number of values,
though; I guess that would be determined by experiment. Or, since it
is a fairly trivial-to-code change, it could be decided individually
for each image, based on that image's characteristics. This would only
take extra effort in the compression stage, not decompression, which
is what we want.
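
   In code, the wraparound differencing is almost nothing (an untested
sketch, assuming 8-bit samples so that unsigned char arithmetic gives
the modulo-256 wrap for free; the names are made up):

    /* Difference one scan line against the line above it, with
     * wraparound.  C's unsigned char arithmetic wraps modulo 256,
     * so no information is lost by the wrap. */
    void diff_line(unsigned char *prev, unsigned char *cur,
                   unsigned char *out, int width)
    {
        int i;
        for (i = 0; i < width; i++)
            out[i] = (unsigned char)(cur[i] - prev[i]);
    }

   Decompression inverts it exactly -- cur[i] = prev[i] + out[i],
again modulo 256.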

   * Arithmetic coding (or Huffman) is probably better than Lempel-Ziv.
LZW compresses by finding repeated strings in the data, and repeated
strings are rare in images with much `noise' (such as ones with large
numbers of colors or grayscales). Huffman or arithmetic coding
compresses frequently-occurring single sample values, which should be
nicely grouped already because of the differencing in the previous step.
   For those less familiar with arithmetic coding: arithmetic coding and
Huffman coding are very similar. Both attempt to compress the data by
representing more common values with shorter bitstrings. However,
arithmetic coding has a number of advantages over Huffman; see below.
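
   A toy calculation (not a real coder) shows what the fractional bits
buy you; with a two-symbol alphabet Huffman can never spend less than
one whole bit per symbol, while an arithmetic coder approaches the
entropy:

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double p = 0.95;    /* probability of the common symbol */
        double h = -(p * log(p) + (1.0-p) * log(1.0-p)) / log(2.0);
        printf("entropy: %.3f bits/symbol\n", h);   /* about 0.286 */
        printf("Huffman: 1.000 bits/symbol at best\n");
        return 0;
    }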

   * Pixel data are not the entire image; obviously, any format
needs some sort of header, if only to include a magic number and image
dimensions. Palette, title, pixel aspect ratio, output device of
preference, etc., may also be useful. However, all of these
should have some value which indicates "I don't know" to the
decoder; pictures converted from .GIF wouldn't have a well-known
aspect ratio, for instance. The decoder could use the most convenient
ratio or attempt to guess from the image's dimensions and its
knowledge of popular scanners and displays. Or even (gasp!) ask
the user. =8)
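
   Just to make the header idea concrete, something like this is what
I'm picturing (the field names, sizes, and sentinel values are purely
illustrative; nothing is settled):

    #define ASPECT_UNKNOWN 0    /* "I don't know" for aspect ratio */

    struct pic_header {
        char           magic[4];      /* magic number               */
        unsigned short width, height; /* image dimensions in pixels */
        unsigned char  depth;         /* bits per pixel             */
        unsigned short aspect;        /* pixel aspect * 256, or 0   */
        /* palette, title, etc. could follow as optional fields */
    };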

   * Since this format is intended to be used on the Usenet, which
is not exactly a transparent medium (8-bit or otherwise...), it should
probably have some uuencode-like format. Ideally this would be done
similarly to PBMPLUS' "packed" vs. "normal" formats, except "packed"
would be the norm for storage and transmission on 8-bit-clean paths
(like FTP). A different "magic number" in the header could let the
decoder differentiate. It would be a good idea to take some hints from
xxdecode and ABE (if that's the right program; I haven't really
looked at either) as to error tolerance and detection. One of
the problems in a.s.p. seems to be that signatures and the like in
multi-part postings get interpreted by uudecode, inserting spurious
data into the decoded file, which throws the LZW decoder off; maybe
the encoder could directly support splitting into parts (this
gets more complex with every sentence ... =8) ). Arithmetic
coding would be able to handle this fairly easily by setting the
output character set to some carefully selected subset of ASCII.
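
   The packed-to-printable step itself is simple; here's an untested
uuencode-flavored sketch (the 64-character alphabet is arbitrary -- a
real one would be picked to survive EBCDIC gateways and the like):

    /* Encode 3 raw bytes as 4 characters from a safe subset. */
    static const char safe[64 + 1] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
        "0123456789+-";

    void encode3(const unsigned char in[3], char out[4])
    {
        out[0] = safe[in[0] >> 2];
        out[1] = safe[((in[0] & 0x03) << 4) | (in[1] >> 4)];
        out[2] = safe[((in[1] & 0x0f) << 2) | (in[2] >> 6)];
        out[3] = safe[in[2] & 0x3f];
    }

   Error detection (a per-line checksum, say) would ride on top of this.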

   * Arithmetic coding has been proved to be "at least as good as"
Huffman, and in many cases better, because output codes don't have to
be an integer number of bits per symbol (the result is in Witten, Neal
& Cleary's arithmetic coding article in the June '87 CACM, among other
places). Also, adaptive arithmetic coding seems to be a lot easier to
implement than adaptive Huffman, since you can just readjust the
intervals. However, since the input data has already been somewhat
processed, it might not be necessary to use adaptive compression
(although it would make the postings smaller, it would also make the
compressor bulkier, slower, and more bug-prone...). Again, something
for experiment to determine.
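
   The bookkeeping for the adaptive model really is about this simple
(a sketch with made-up names, assuming 256 possible difference values):

    static unsigned long count[256];   /* frequency of each value */
    static unsigned long total;

    void model_init(void)
    {
        int i;
        for (i = 0; i < 256; i++)
            count[i] = 1;   /* never give a value zero interval width */
        total = 256;
    }

    void model_update(int sym)   /* call after coding each symbol */
    {
        count[sym]++;
        total++;
        /* the interval for value i is now count[i]/total wide */
    }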

  ... looks like this has expanded into more than "a couple
of ideas" ... comments, anyone? opinions? flames? I think I'll
try experimenting with this; does anyone have a source of
good large-color-palette images (I'd use a.s.p, but the other 
people in the lab would look askance =8) )? 


-- 
wiml@blake.acs.washington.edu       Seattle, Washington  | No sig under
(William Lewis)  |  47 41' 15" N   122 42' 58" W  |||||||| construction

del@thrush.mlb.semi.harris.com (Don Lewis) (08/31/90)

In article <6873@milton.u.washington.edu> wiml@milton.u.washington.edu (William Lewis) writes:
>   * Taking differences is probably much, much better than XORing, 
>since the point is to take advantage of the vertical continuity of
>the image. The problem is that it doubles the number of values the
>compressor has to deal with -- the differences range from (min-max)
>to (max-min), which is 2*(max-min)+1 values. The obvious solution is
>to wrap around, i.e., use modulo-(max-min+1) arithmetic, except that
>this overlays some of the least frequently used values (the extremes)
>on the most frequently used ones (small values), making the
>Huffman/arithmetic coder less efficient.

I don't think this is a problem.  Think of the differences as signed
twos-complement values.  There should be a nice symmetric peak in
the histogram around zero.  The relatively few extreme difference
values that get aliased in with the common ones will be encoded just
as efficiently as the common values.  On the other hand, since we are
encoding the data anyway, we might look at encoding n+1 bit
differences if that happens to be more efficient (due to the greater
skew in the distribution).

Another idea that may help a bit is to take differences across
the rows after taking differences between the scan lines.  This will
better encode the top row as well as horizontal features.
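
Both passes together, as an untested sketch (8-bit samples, wraparound
arithmetic, image stored row-major):

    void diff_2d(unsigned char *img, int width, int height)
    {
        int x, y;
        for (y = height - 1; y > 0; y--)     /* between scan lines */
            for (x = 0; x < width; x++)
                img[y*width + x] -= img[(y-1)*width + x];
        for (y = 0; y < height; y++)         /* across the rows */
            for (x = width - 1; x > 0; x--)
                img[y*width + x] -= img[y*width + x - 1];
    }

The decoder undoes the horizontal pass first (left to right) and then
the vertical pass (top to bottom), adding instead of subtracting.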

There is one thing that bothers me, though.  The pixel data that
we are trying to compress consists of indices into a color map, correct?
It would seem to me then that the quality of the results would depend
on the arrangement of the entries in the color map.  If adjacent color
map entries map into widely separated colors and colors that are
closely related map into widely separated color map entries, we will
get poor results.  We might need to look at optimally arranging
the color map or taking the differences and compressing in RGB
(or HLS, etc.) space.
--
Don "Truck" Lewis                      Harris Semiconductor
Internet:  del@mlb.semi.harris.com     PO Box 883   MS 62A-028
Phone:     (407) 729-5205              Melbourne, FL  32901

clarke@acheron.uucp (Ed Clarke/10240000) (08/31/90)

From article <sean.651965645@s.ms.uky.edu>, by sean@ms.uky.edu (Sean Casey):
> If not, then perhaps GIF and other archivers need a ``standard'' (haha)
> interface to compression schemes. Some of the MSDOS terminal programs seem
> to provide a standard interface to file transfer protocols.

Yes, the GIF standard requires LZW.  An alternative format that's
becoming somewhat popular is TIFF, the Tag Image File Format.  The
TIFF library supports several different compression schemes and
permits you to add others.

The author is sam@ucbvax.berkeley.edu; the mailing list may be reached
at tiff@ucbvax.berkeley.edu and tiff-request@ucbvax.berkeley.edu.  The
complete code for the library routines and xtiff is available on
ucbvax also ( for anonymous ftp ).

TIFF is supported by various desktop publishing programs (PageMaker,
for one), although some of the fancier compression schemes are not
supported.
-- 
               | "Pain, n.  An uncomfortable frame of mind that may have
Ed Clarke      |  a physical basis in something that is being done to the
acheron!clarke |  body, or may be purely mental, caused by the good fortune
               |  of another." - Ambrose Bierce

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (08/31/90)

I don't at all see the advantage of differences over XOR.  The color
values exist in three-space, and a one-bit change in the component that
happens to map to the most significant bits of your pixel value will,
under differencing, land far from the origin in difference space.  XOR
has the "advantage" that it treats all bits as of equal significance,
probably a "better" thing to do for minimizing the impact of noisy
image data on the compression efficiency.

Is there something (quite likely) that I'm not seeing here?  The
previously mentioned color table ordering problem (again, trying to
linearize three-space data) only further complicates the problem of
making differencing work well.

(It may not do a lot for the XOR solution either, on first guess.)

(A solution, perhaps not the best, to the color table problem is to make
 keys from the eight-bit-per-gun true-color data that is the color table's
 output value (say, for a high quality system), then interleave the color
 bits like:

         r0g0b0...r7g7b7

 with r0 the red m.s.b., etc., and then sort these keys and use that as
 the new order for the color table.  This way the subtle changes are down
 in the r7g7b7 bits, and sort close together, while the blatant changes
 are in the r0g0b0 bits, and sort far apart.  This will help a lot, though
 the choice of red for most significant bit is still arbitrary.)
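
Building the key in code might look like this (untested, assuming
eight-bit guns; the result fits in 24 bits):

    /* Interleave the bits of r, g, b so the key's high-order bits
     * come from the guns' high-order bits: r0 g0 b0 r1 g1 b1 ... */
    unsigned long rgb_key(unsigned r, unsigned g, unsigned b)
    {
        unsigned long key = 0;
        int i;
        for (i = 7; i >= 0; i--)    /* from each gun's m.s.b. down */
            key = (key << 3) | (((r >> i) & 1) << 2)
                             | (((g >> i) & 1) << 1)
                             |  ((b >> i) & 1);
        return key;
    }

Sort the color table entries on this key, then renumber the pixels to
match the new entry order.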

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>

meissner@osf.org (Michael Meissner) (08/31/90)

In article <1990Aug31.034936.18954@mlb.semi.harris.com>
del@thrush.mlb.semi.harris.com (Don Lewis) writes:

| In article <6873@milton.u.washington.edu> wiml@milton.u.washington.edu (William Lewis) writes:
| >   * Taking differences is probably much, much better than XORing, 
| >since the point is to take advantage of the vertical continuity of
| >the image. The problem is that it doubles the number of values the
| >compressor has to deal with -- the differences range from (min-max)
| >to (max-min), which is 2*(max-min)+1 values. The obvious solution is
| >to wrap around, i.e., use modulo-(max-min+1) arithmetic, except that
| >this overlays some of the least frequently used values (the extremes)
| >on the most frequently used ones (small values), making the
| >Huffman/arithmetic coder less efficient.
| 
| I don't think this is a problem.  Think of the differences as signed
| twos-complement values.  There should be a nice symmetric peak in
| the histogram around zero.  The relatively few extreme difference
| values that get aliased in with the common ones will be encoded just
| as efficiently as the common values.  On the other hand, since we are
| encoding the data anyway, we might look at encoding n+1 bit
| differences if that happens to be more efficient (due to the greater
| skew in the distribution).

If you are dealing with color, I think you might get better results by
separating the RGB (red, green, blue) components into separate planes.
Thus instead of 24 bits of pixel data, you would have an 8-bit red
plane, an 8-bit green plane, and an 8-bit blue plane.  You would
probably get smaller differences by subtracting two adjacent red
pixels than by subtracting pixels stored as single 24-bit integers,
particularly when the color changes.
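
An untested sketch of the plane separation, assuming pixels packed as
0x00RRGGBB in a long:

    void split_planes(unsigned long *pix, int n, unsigned char *r,
                      unsigned char *g, unsigned char *b)
    {
        int i;
        for (i = 0; i < n; i++) {
            r[i] = (unsigned char)((pix[i] >> 16) & 0xff);
            g[i] = (unsigned char)((pix[i] >>  8) & 0xff);
            b[i] = (unsigned char)( pix[i]        & 0xff);
        }
    }

Each plane can then be differenced and coded on its own.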

Of course, if this is already done, just ignore this message.  I've
not dealt with any graphics formats at the bit level.....
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?

wiml@milton.u.washington.edu (William Lewis) (09/03/90)

In various articles with different message-IDs, several people comment
that differencing only really makes sense over XORing if you're storing
actual RGB/HSB/whoknowswhat values rather than indices into a color map.
I had completely overlooked this in my earlier post =8)  Since it seems
(to me) futile to try to sort triples into any sort of
ascending/descending order, maybe it would be better to have a separate
colormap for each "color coordinate" (I don't know what they're
*really* called ... the R or the G or the B) and sort those indices.
This could as much as triple the size of the *uncompressed* data, but
might produce a smaller *compressed* image. Or, it might not. Or, it
might be a bad idea for some other reason.
I've become intrigued with this discussion anyway, and have already
started digging out arithmetic compression stuff, so if anyone sees
any flaws in my basic reasoning I'd appreciate your telling me before
I write too much unusable code =8)
I've become intrigued with this discussion anyway, and have already 
started digging out arithmetic compression stuff, so if anyone sees
any flaws in my basic reasoning I'd appreciate your telling me before
I write too much unusable code =8)
  And if all goes well I suppose I'll post a trial version to alt.sources
(see, this message does have something to do with this newsgroup) in a
few thousand years. Less if anyone wants to lend a hand.
-- 
wiml@milton.acs.washington.edu       Seattle, Washington  | No sig under
(William Lewis)   |  47 41' 15" N   122 42' 58" W  |||||||| construction