[comp.graphics] Data Compression

greid@ondine (Glenn Reid) (11/17/87)

As I understand it, 3-to-1 compression rates are close to optimal unless you
can take advantage of the "context" (i.e. knowing that there may be many
sequential frames that are essentially the same).

The way compression schemes work, in general, is to take N bits of data and
represent the pattern in M bits (hopefully N > M).  For example, a run-length
encoding can represent a run of white bits by just encoding the number of
white ones in a row, instead of representing each of them individually.
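
(To make that concrete, here is a minimal run-length encoder sketch in C.  The
(count, value) byte-pair format, the 255-byte run cap, and the buffer names are
just assumptions for illustration, not any particular standard.)

#include <stddef.h>

/* Minimal run-length encoder sketch: emits (count, value) byte pairs.
 * Runs longer than 255 bytes are split.  Assumes `out` has room for the
 * worst case (2 * n bytes).  Returns the number of output bytes. */
size_t rle_encode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t o = 0;
    for (size_t i = 0; i < n; ) {
        unsigned char v = in[i];
        size_t run = 1;
        while (i + run < n && in[i + run] == v && run < 255)
            run++;
        out[o++] = (unsigned char)run;  /* run length, 1..255 */
        out[o++] = v;                   /* the repeated value */
        i += run;
    }
    return o;
}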

An interesting variation on this is to represent bitmaps as hexadecimal.  This
is, in fact, a 2-to-1 loss (not really a compression scheme), but it has
the property that there is an external "table lookup" mechanism implied by
the compression scheme.  In particular, the ASCII character code system is
used to represent the hex digits, which are then interpreted as base 16
numbers instead of characters.  The "table lookup" is implicitly the
ASCII character set.
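
(A sketch of that expansion in C, assuming the raw bitmap is simply an array of
bytes; each byte becomes two ASCII hex characters, hence the 2-to-1 loss.)

#include <stdio.h>

/* Sketch: expand raw bitmap bytes into ASCII hex, two characters per byte.
 * The implicit "table" is just the hex digit alphabet. */
void bytes_to_hex(const unsigned char *in, size_t n, FILE *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;
    for (i = 0; i < n; i++) {
        fputc(digits[in[i] >> 4], out);    /* high nibble */
        fputc(digits[in[i] & 0x0f], out);  /* low nibble  */
    }
}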

I started thinking about using, say, the Macintosh Screen Fonts as a kind of
table for bitmaps.  A bit pattern could be represented by a single byte if
it happened to match a screen font character in a system font (modulo the
problem that someone could edit the fonts).  Depending on the font, this
could be quite a substantial compression scheme.  But it suffers from being
"context-dependent" in the same sense that the previous examples were, since
it would rely on a Macintosh-specific resource.

But what if there were an ISO standard "table" of bitmaps, encoded much like
the ASCII character set?  Compressed bit images could be represented as
indices into the standard table, rather than being self-sufficient.  I believe
that the only way to dramatically improve compression is to rely on an
external context or external data of some kind, rather than trying to use
an algorithm to compress the data.
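
(A hedged sketch, in C, of how the encoding side might use such a table.  The
cell size, table size, and escape convention for cells that miss the table are
all invented for illustration; no such standard table exists.)

#include <string.h>

#define CELL_BYTES 8    /* an 8-by-8 one-bit cell packs into 8 bytes */
#define TABLE_SIZE 256  /* so an index fits in one byte */

/* A hypothetical table shared by sender and receiver. */
extern const unsigned char std_table[TABLE_SIZE][CELL_BYTES];

/* Return the table index for a cell, or -1 if the cell is not in the
 * table (a real format would then emit an escape code followed by the
 * cell itself, so the scheme never loses data). */
int encode_cell(const unsigned char cell[CELL_BYTES])
{
    int i;
    for (i = 0; i < TABLE_SIZE; i++)
        if (memcmp(cell, std_table[i], CELL_BYTES) == 0)
            return i;
    return -1;
}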

Is this feasible?  My information theory is a little weak; perhaps someone
could either run with this or shoot it full of holes so I can go back to
thinking about something else.

Glenn Reid
Adobe Systems

jpm@beta.UUCP (James P Mcgee) (11/17/87)

   There were a couple of messages about compression this past week that
were related, but the relation was not obvious.  Spencer Thomas mentions
the vector quantization methods of Tom Stockham.  Later, Glenn Reid
of Adobe describes an idea for a compression mechanism.  The idea that
Glenn came up with is very similar to Stockham's vector quantization.  The
difference is that, while Glenn suggests using standard tables of bitmaps,
Stockham uses tables of bitmaps generated from the picture.  Including
the bitmaps doesn't increase the size of the compressed image appreciably,
and the results are better.  My compliments to Glenn.  Even though it is
a little too little and a little too late to be first, it's definitely a
creative idea.
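
(For contrast, a rough C sketch of the block-matching step in a
vector-quantization scheme.  This is not Stockham's algorithm, just the general
shape: a codebook of blocks is trained from the picture itself, shipped with
the image, and each block is replaced by the index of its nearest entry.)

#include <limits.h>

#define BLOCK    16   /* e.g. a 4-by-4 block of 8-bit pixels */
#define CODEBOOK 256  /* codebook entries, so one byte per index */

/* Return the index of the codebook entry closest (in squared error) to
 * one image block.  Training the codebook from the picture itself -- the
 * part that makes the results better -- is omitted here. */
int nearest_code(const unsigned char blk[BLOCK],
                 const unsigned char book[CODEBOOK][BLOCK])
{
    long best = LONG_MAX;
    int  best_i = 0;
    int  i, j;
    for (i = 0; i < CODEBOOK; i++) {
        long err = 0;
        for (j = 0; j < BLOCK; j++) {
            long d = (long)blk[j] - (long)book[i][j];
            err += d * d;
        }
        if (err < best) { best = err; best_i = i; }
    }
    return best_i;
}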

   As an aside, I met Tom Stockham when he gave a talk for Hank Christiansen's
MOVIE.BYU training program.  In his talk, Tom derived the theoretical 
minimum number of bits necessary to encode any image, which is 70 bits.
This is 70 bits for the entire image, regardless of size, resolution, or
content.  The derivation is interesting, humorous, and gives absolutely no
guidance on how to implement such a scheme.

Pat McGee, jpm@lanl.gov

paul@hpldola.HP.COM (Paul Bame) (11/19/87)

> 
> But what if there were an ISO standard "table" of bitmaps, encoded much like
> the ASCII character set?  
> 
> Glenn Reid
> Adobe Systems
> ----------
This is beginning to sound like the various "picture" protocols/languages
like NAPLPS.

		--Paul Bame

llh@egg-id.UUCP (11/20/87)

> An interesting variation on this is to represent bitmaps as hexadecimal.  This
> is, in fact, a 2-to-1 loss (not really a compression scheme), but it has
> the property that there is an external "table lookup" mechanism implied by
> the compression scheme.  In particular, the ASCII character code system is
> used to represent the hex digits, which are then interpreted as base 16
> numbers instead of characters.  The "table lookup" is implicitly the
> ASCII character set.
> 
> I started thinking about using, say, the Macintosh Screen Fonts as a kind of
> table for bitmaps.  A bit pattern could be represented by a single byte if
> it happened to match a screen font character in a system font (modulo the
> problem that someone could edit the fonts).  Depending on the font, this
> could be quite a substantial compression scheme.  But it suffers from being
> "context-dependent" in the same sense that the previous examples were, since
> it would rely on a Macintosh-specific resource.
> 
> Glenn Reid
> Adobe Systems

  This is exactly what I am currently trying.  I have 512 by 512 by 8 bits
per pixel images that I am hardcopying (monochrome).  I had been using
an Imagen printer with its network interface.  The technique I used was to
map each source pixel to one of seventeen 4 by 4 bitmaps, each populated with
from 0 to 16 `on' bits.  This allows quantizing the image into 17 gray shades
but still yields 75 hardcopy pixels per inch.  (The Imagen supports 300 dots
per inch.)
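
(A sketch of that quantization step in C; the rounding formula and the
pattern-table layout are assumptions, not Linn's actual code.)

/* Sketch: quantize an 8-bit pixel to one of 17 levels (0..16 bits on in a
 * 4-by-4 cell).  patterns[k] would hold a cell with k bits set, one row per
 * byte's low nibble; its contents are assumed, not shown. */
extern const unsigned char patterns[17][4];

int gray_level(unsigned char pixel)
{
    return (pixel * 16 + 127) / 255;   /* 0..255 -> 0..16, rounded */
}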

  This worked great with the high-speed network interface.  But the Imagen
has been replaced by an Apple LaserWriter connected to a Sun-3 via a 9600
baud serial line.  Converting each 8-bit source pixel to a 16-bit (4 by 4)
bitmap doubles the data.  Now the Apple speaks PostScript, which needs the
bitmaps encoded in ASCII, another doubling.  One megabyte at 9600 baud is
too slow.

  Now for the kicker.  Maybe someone can help me fix my bug.  I don't
know a thing about PostScript.  I have the manuals on order from Adobe,
but procurement here is slow.  I'm working blind.

  With the Sun software came a little demo PostScript program that created
2 `characters' in a font and printed them out various ways.  I modified the
program to create the 17 bitmaps I had and called them [0-9,a-g].  Now I 
should be able to map the image into the character set and `print' it with
7 bits (one character) per pixel with a little overhead for the `carriage
return/linefeed'.
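
(Presumably something like the following C sketch on the host side; the
character names [0-9,a-g] follow the post, everything else is assumed.)

#include <stdio.h>

/* Sketch: emit one ASCII character per pixel, picked from the 17
 * "characters" [0-9,a-g] defined in the downloaded font, one text line per
 * scanline.  The PostScript side would then just show each line. */
void emit_row(const unsigned char *row, int width, FILE *out)
{
    static const char name[] = "0123456789abcdefg";
    int x;
    for (x = 0; x < width; x++) {
        int level = (row[x] * 16 + 127) / 255;   /* 0..16, as above */
        fputc(name[level], out);
    }
    fputc('\n', out);
}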

  Well, this works until about two-thirds of the way through a 512 by 512 image.
The printer then reports back a `VMerror'.  Is this a virtual memory error?
Is there some resource that I need to release (per line?)?

  Any pointers or comments would be appreciated.  Sorry, I believe the source
PostScript is proprietary, so I can't post the code.
-- 
Linn Hower	lh1@INEL.GOV		Phone: 208-526-9353
		lh1%INEL.GOV@uiucuxc.ARPA

jru@etn-rad.UUCP (John Unekis) (11/21/87)

In article <3387@adobe.COM> greid@adobe.UUCP (Glenn Reid) writes:
>The way compression schemes work, in general, is to take N bits of data and
>represent the pattern in M bits (hopefully N > M)....
>But what if there were an ISO standard "table" of bitmaps, ...
As my wife always says, "Great minds run in the same gutters."  This scheme is
very much like the one the CCITT adopted for Group III and Group IV compression.
They have a table of bit strings, each of which is unique and none of which is
a prefix of another, so the stream can be decoded unambiguously.  The
shortest bit strings are used to "encode" the statistically most common
run-lengths of bits.  This scheme is called Huffman encoding.  It achieves
terrific compression on one-bit-deep images, but it may produce negative
compression if the image is noisy, and it is not useful for 8-bit-deep
gray-scale images.  It has been implemented in hardware in the AMD 7970.
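
(A toy illustration of the table idea in C.  The codes below are invented and
far smaller than the real CCITT Group III tables, but they are prefix-free, so
a decoder can tell where each code ends.)

#include <stdio.h>

/* Toy Huffman-style run-length codes.  These codes are invented; the real
 * CCITT Group III tables cover run lengths 0..63 plus "makeup" codes for
 * longer runs.  The point is that no code is the start of another. */
struct hcode { unsigned code; int nbits; };

static const struct hcode toy_white[4] = {
    { 0x1, 1 },   /* run of 0 -> "1"   (shortest code, most common run) */
    { 0x1, 2 },   /* run of 1 -> "01"  */
    { 0x1, 3 },   /* run of 2 -> "001" */
    { 0x0, 3 },   /* run of 3 -> "000" */
};

/* Append one code, most significant bit first (written as '0'/'1' text
 * here just to make the bit stream visible). */
void put_code(struct hcode c, FILE *out)
{
    int i;
    for (i = c.nbits - 1; i >= 0; i--)
        fputc(((c.code >> i) & 1) ? '1' : '0', out);
}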

jbn@glacier.STANFORD.EDU (John B. Nagle) (11/22/87)

     The basis of video compression is the notion that each frame is usually
a lot like the last one.  There are various schemes for sending the changes.
I've seen images transmitted using commercial gear which compresses a video
signal to fit into a 56 kbit/s line.  The system really works only for mostly
static images; whenever anything moves, the area of change is shown as
large squares of uniform density and color.  As soon as things begin to
stabilize, the squares are subdivided into smaller and smaller squares
until each uniform area occupies only a single pixel.  This process
takes about a second after a major change in the image, less for
smaller changes.  A person talking without moving their body too much
looks almost normal.  At the other extreme, panning the camera produces
complete breakup.
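
(A hedged C sketch of the first step of such a scheme: compare each tile of the
new frame against the previous one and retransmit only the tiles that changed.
The tile size and threshold are arbitrary assumptions.)

#include <stdlib.h>

#define TILE 16   /* tile size; arbitrary for this sketch */

/* Sketch: decide whether a TILE-by-TILE block changed enough since the
 * previous frame to be worth retransmitting.  `stride` is the width of a
 * scanline in bytes; a real codec would also subdivide changed tiles into
 * smaller ones as the scene settles, as described above. */
int tile_changed(const unsigned char *cur, const unsigned char *prev,
                 int stride, int threshold)
{
    long diff = 0;
    int x, y;
    for (y = 0; y < TILE; y++)
        for (x = 0; x < TILE; x++)
            diff += labs((long)cur[y * stride + x] - (long)prev[y * stride + x]);
    return diff > (long)threshold * TILE * TILE;   /* mean diff > threshold */
}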

     Considerable work is underway on finding better ways to compress
video.  One motivation behind this is the desire for a better approach
to high-definition television.  The proposed scheme from Japan requires
three times as much bandwidth as present television signals, and this is
unacceptable for broadcast.  As yet, RAM still isn't cheap enough to
allow a frame buffer in every TV set (although some high-end units have
one now), but it is recognized that that day is not far away, and
work on compression goes on at the MIT Media Lab and at consumer electronics
companies.

					John Nagle

lipton@m2c.ORG (Gary Lipton) (12/09/87)

I'm looking for some general references on data compression. 
I'd like to know about any useful survey articles or texts.
Thanks.

Gary Lipton - Massachusetts Microelectronics Center    ##   ## 2 #####
lipton@m2c.org (CSNET and Internet with MX support)    ### ###  ##    
lipton%m2c.org@relay.cs.net	         (Internet)    ## # ##  ##
{harvard,bu-cs,frog,ulowell}!m2c!lipton      (UUCP)    ##   ##   #####

truett@cup.portal.com (12/12/87)

lipton@m2c.org asks for good references on data compression.

Try the text "Data Compression: Techniques and Applications" by Thomas J.
Lynch.  Lifetime Learning Publications, Belmont, CA.
ISBN 0-534-03418-7 or LC# QA76.9.D33L96 1984.

A very good book.

Truett Smith, Sunnyvale, CA
uucp: truett@cup.portal.com