[comp.graphics] SigGraph Fractal Compression

marcos@netcom.UUCP (Marcos H. Woehrmann) (08/07/89)

Did anyone who was fortunate enough to attend SigGraph see the Fractal
Data Compression software/hardware?  The only information I have is that
it's supposed to be able to generate amazing compression ratios and the
company is based in Atlanta.  Please mail any information (I'll summarize
if there is enough interest).

-- 
Marcos H. Woehrmann      apple!netcom!marcos  |  marcos@netcom
              "Where does he get those wonderful toys?"
The opinions expressed are not necessarily those of Apple Computer.

adam@mit-amt.MEDIA.MIT.EDU (Adam Glass) (08/07/89)

In article <2037@netcom.UUCP>, marcos@netcom.UUCP (Marcos H. Woehrmann) writes:
> Did anyone who was fortunate enough to attend SigGraph see the Fractal
> Data Compression software/hardware?  The only information I have is that
> it's supposed to be able to generate amazing compression ratios and the
> company is based in Atlanta.  Please mail any information (I'll summarize
> if there is enough interest).

I saw it. People here were a bit skeptical about the thing, and the guy who
made it was not interested in talking to people who knew about fractal
compression, only those who would buy it. What a "suit."

I would be awfully hesitant to believe this guy's claims. He seemed very
"plastic" and false as a person, which would lead me to believe his research
would follow suit.

Adam

===============================================================================
|| Adam Glass, NeXT hacker for the slave-drivers at the MIT Media Laboratory ||
||                     <insert standard disclaimer here>                     ||
||  %%              "But Wesley, what about the R.O.U.S.es?"             %%  ||
||  %%       "Rodents Of Unusual Size? I don't think they exist..."      %%  ||
||       EMail to: adam@media-lab.media.mit.edu OR adamg@athena.mit.edu      ||
===============================================================================

andrew@berlioz (Lord Snooty @ The Giant Poisoned Electric Head ) (08/08/89)

In article <444@mit-amt.MEDIA.MIT.EDU>, adam@mit-amt.MEDIA.MIT.EDU (Adam Glass) writes:
> I saw it. People here were a bit skeptical about the thing, and the guy who
> made it was not interested in talking to people who knew about fractal
> compression, only those who would buy it. What a "suit."

Michael Barnsley, author of "Fractals Everywhere", has formed a company.
If you read this book, you'll have no doubt about HIS integrity.
The company is Iterated Systems, Inc., Norcross/Atlanta, Ga.
Their most recent product (quoted in Electronic Design, Feb 23 1989):

"A fractal-based compression system... determines the iterated function
system (IFS) codes... storing these takes a few hundred bytes..until now,
finding the correct codes took a trained operator several hours on a Sun
workstation. With this new technique, data is compressed automatically,
taking about an hour. Decompression takes minutes."
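
For a sense of scale: an "IFS code" is essentially a short list of 2-D affine
maps, w(x, y) = (a*x + b*y + e, c*x + d*y + f), each with an associated
probability.  The little Python sketch below is only an illustration (it is
certainly not Iterated Systems' actual format), but it shows that even a
four-map code packs into about a hundred bytes, consistent with the "few
hundred bytes" figure quoted above.

import struct

# A 2-D affine map w(x, y) = (a*x + b*y + e, c*x + d*y + f) plus a
# probability, stored as seven 32-bit floats.  The four maps below are the
# well-known published coefficients for Barnsley's fern.
FERN_IFS = [
    ( 0.00,  0.00,  0.00, 0.16, 0.0, 0.00, 0.01),
    ( 0.85,  0.04, -0.04, 0.85, 0.0, 1.60, 0.85),
    ( 0.20, -0.26,  0.23, 0.22, 0.0, 1.60, 0.07),
    (-0.15,  0.28,  0.26, 0.24, 0.0, 0.44, 0.07),
]

packed = b"".join(struct.pack("<7f", *m) for m in FERN_IFS)
print(len(packed), "bytes")        # 4 maps * 7 floats * 4 bytes = 112 bytes
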
-- 
...........................................................................
Andrew Palfreyman	There's a good time coming, be it ever so far away,
andrew@berlioz.nsc.com	That's what I says to myself, says I, 
time sucks					   jolly good luck, hooray!

wrf@mab.ecse.rpi.edu (Wm Randolph Franklin) (08/09/89)

In article <2037@netcom.UUCP> marcos@netcom.UUCP (Marcos H. Woehrmann) writes:
>Did anyone who was fortunate enough to attend SigGraph see the Fractal
>Data Compression software/hardware?  The only information I have is that
>it's supposed to be able to generate amazing compression ratios and the
>company is based in Atlanta.

Barnsley gave a talk at the Computers and Math conference at MIT in June.
He showed some pretty pictures and recited some nice stats.  He refused to
give any concrete details of how he did it.  One of his opponents (I forget
his name) was also there and strongly attacked him in a separate talk.  The
opponent claimed that traditional image processing compression methods give
similar packing ratios.  Erich Kaltofen@cs.rpi.edu was one of the organizers
and so might know the other speaker's name.

The most favorable interpretation of Barnsley is that he is trying to keep
it secret in order to patent it, apply for grants, or exploit it with his
company.  A middling interpretation is that the process is not automatic
but requires a lot of hand tuning on each picture.  An unfavorable
interpretation is that it only works on a few selected pictures, and
Barnsley is trying to stonewall while he generalizes the method.  Pick one.

Personally, I have a low opinion of people in graphics who try
simultaneously to get intellectual priority for first discovery of an idea
while keeping critical details secret so that no one else can actually use
it.  Editors should reject teaser papers like these.  Grantors should
require that all details resulting from a grant be public.  Barnsley said
that some of his work was developed on a DARPA grant, I think.

Please email responses as well as posting, since I will be away for a while.


						   Wm. Randolph Franklin
Internet: wrf@ecse.rpi.edu (or @cs.rpi.edu)    Bitnet: Wrfrankl@Rpitsmts
Telephone: (518) 276-6077;  Telex: 6716050 RPI TROU; Fax: (518) 276-6261
Paper: ECSE Dept., 6026 JEC, Rensselaer Polytechnic Inst, Troy NY, 12180

stuart@rennet.cs.wisc.edu (Stuart Friedberg) (08/10/89)

adam@mit-amt.MEDIA.MIT.EDU (Adam Glass) writes:
> I saw it. People here were a bit skeptical about the thing, and the guy who
> made it was not interested in talking to people who knew about fractal
> compression, only those who would buy it. What a "suit."

andrew@berlioz (Lord Snooty @ The Giant Poisoned Electric Head ) writes:
>Michael Barnsley, author of "Fractals Everywhere" has formed a company.
>If you read this book, you'll have no doubt about HIS integrity.

A few days ago, I browsed TWO of Barnsley and co.'s books, and I am still very
dubious about all this.  There are lots of neat black and white graphics
illustrating maps and the "Collage Theorem"; however:
(1) The synthetic images are in most cases quite poor.
(2) The compressed/reconstructed images look like paintings, not like
    the original images.
(3) There is a noticeable dearth of hard mathematics, despite Theorems
    and Exercises out the wazoo.

I wouldn't go so far as to say IFS is touting a fraud, but neither
would I hesitate to say their enthusiasm exceeds the evidence made
public so far.  Note my hedge: it's *possible* they are protecting the
details of the underlying theory as a trade secret.  After two books
full of mediocre to reasonable images, and a foreword which says "Watch
for lots of papers we're going to publish", I have some doubts.

Stu Friedberg

rich@inmet (08/13/89)

I suppose people are referring to Michael Barnsley's demo at the AT&T Pixel
Machine booth.  Well, his work is well documented in his book "Fractals
Everywhere" and in "The Science of Fractal Images".  Even Byte had an article
about it in early '88 or '89, I forget which.  The basic idea is to "compress"
an image by finding the probabilistic coefficients of "attractor" equations
that, when "decompressed", give you back a representation of the image.

There are several important points.  One is that this is not compression in
the traditional sense; it is more like modeling.  Another is that the
decompressed image can be as good as the output device allows; it just takes
longer to decompress if the output is larger.  Third, the compression ratio
can be very high.  An image with recurring parts (like a picture of a fern)
can be described by 3 equations, each with 4 real-number coefficients, so you
only need to store 12 single-precision 32-bit floating-point numbers.  The
images he was displaying were "real life" pictures like a face, a TV station
logo, etc.  He crammed 40 or so of the compressed images onto a high-density
(1.2 meg) floppy, so each image is about 20 - 40K bytes.

Remember, decompressing really means applying fractal equations over and
over again (like in painting a Mandelbrot).  The amazing thing is that he
was "decompressing" the pictures at video rate: 22 pics per second.

His commercial products may not work, sell, or find a market, but to us that
was certainly one of the high points of the exhibits.

sarrel@sioux.cis.ohio-state.edu (Marc Sarrel) (08/14/89)

In article <20400001@inmet> rich@inmet writes:

   The basic idea is to "compress" an image by finding the probabilistic
   coefficients of "attractor" equations that, when "decompressed", give
   you back a representation of the image.

   There are several important points.  One is that this is not compression
   in the traditional sense; it is more like modeling.  Another is that the
   decompressed image can be as good as the output device allows; it just
   takes longer to decompress if the output is larger.  Third, the
   compression ratio can be very high.  An image with recurring parts (like
   a picture of a fern) can be described by 3 equations, each with 4
   real-number coefficients, so you only need to store 12 single-precision
   32-bit floating-point numbers.

Well, this is the best explanation that I've heard so far, but I'm
still not convinced of the method's overall usefulness.  First, how
does one go about finding those coefficients?  Is it an automatic
process, or does it have to be done "by hand?"  How does this method
compare to "standard" compression schemes in terms of speed and
compression ratio?  What are the method's average and worst case
behaviours?  (You've already given us its best case.)  What sorts of
images does it work least well on?  I could go on, but I have a life
to lead...

As far as getting more resolution out than you put in goes, surely you
realize that this is nonsense.  You are certainly able to do the
calculations, but the added information is just arbitrary and won't
necessarily have anything to do with the original information (I don't
think).  In order to reproduce the image at the original resolution,
you only need a certain amount of "accuracy" in your coefficients
(i.e., anything after a certain decimal place is noise).  However, it
is just this extra information that you're using to "add" information
to the image.  There ain't no free lunch.

(setf soap-box-mode t)

I'm sorry, this whole thing smells too much like the "Confusion in a
Jar" (tm) debacle that wasted our time a few months ago.  I don't
trust any scientist or researcher who makes wild claims, but then
won't or can't disclose all his reasoning and evidence.

(setf soap-box-mode nil)

--marc
-=-
"Master, why is the letter 'i' the symbol for current?"  "Because there is
no letter 'i' in the word 'current'."  "Master, why do we use the letter
'j' for sqrt(-1)?"  "Because we use the letter 'i' for current."  Whereupon
the Master struck the Disciple, and the Disciple became enlightened.

spencer@eecs.umich.edu (Spencer W. Thomas) (08/15/89)

The technique is valid, but it is not lossless.  In other words, what
you get out ain't what you put in.  For a noisy image (i.e., a
digitized image), this may be ok.  For other images (i.e., a computer
generated image with an alpha channel), this is not good at all.
Finally, compression is SLOW.  I recall hearing a figure of 1 hour per
image to compress (I could be wrong).  Thus, the technique is only
really useful if you are going to decompress much more frequently than
you will compress.

In these respects, it is very similar to another compression technique
that promises extremely high compression ratios (100:1 and better):
vector quantization.  VQ is slow to compress, fast to decompress, and
lossy.

In other words, TANSTAAFL.

--
=Spencer (spencer@eecs.umich.edu)

mherman@alias.UUCP (Michael Herman) (08/15/89)

According to one of the fellows at the booth, the SIGGRAPH demo
(decompression) was entirely software-based - however, the product they
intend to take to market will be hardware based (i.e. they don't
intend to market the software).

gors@well.UUCP (Gordon Stewart) (08/16/89)

It seems to me that this is an ENCODING technique -- one that does not
preserve 100% of the information input.

Am I wrong?

-- 
				{apple, pacbell, hplabs, ucbvax}!well!gors
							gors@well.sf.ca.us
(Doolan) | (Meyer) | (Sierchio) | (Stewart)

rich@inmet (08/16/89)

> Well, this is the best explanation that I've heard so far, but I'm
> still not convinced of the method's overall usefulness.  First, how
> does one go about finding those coefficients?  Is it an automatic
> process, or does it have to be done "by hand?"  How does this method

	Well, this is the "secret" that Barnsley is protecting.  It usually is
done in a brute force manner.  However, it used to take perhaps 100+ hours
to find the encoding, but he claims his new method takes perhaps minutes to
several hours.

        This is not a standard compression scheme.  The difference is that
instead of compressing the bitmap, you are giving instructions for how to
draw the image using some standard shapes: in CAD applications it might be
circles, squares, etc.; for "natural" images, you use fractals.  So the
number of fractal equations you need determines your compression ratio.
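
Nobody outside the company knows what Barnsley's automatic search actually
does, but a toy brute-force version of the idea can make "finding the
instructions" concrete.  The sketch below is entirely my own construction
(the target shape, the one-parameter family of candidate maps, and the error
measure are all made up for illustration): a candidate set of contractive
affine maps is scored by its collage error -- how far the union of the
mapped copies of the target lies from the target itself -- and a crude grid
search keeps the best-scoring contraction ratio.

import numpy as np

rng = np.random.default_rng(0)

# Stand-in "image to encode": a point cloud on the Sierpinski gasket.  (We
# cheat and synthesize it from a known IFS so the toy search has a right
# answer to find.)
VERTS = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])

def chaos_game(scale, n=1000):
    pts = np.empty((n, 2))
    p = np.zeros(2)
    for i in range(n):
        p = scale * p + (1.0 - scale) * VERTS[rng.integers(3)]
        pts[i] = p
    return pts

def collage_error(target, scale):
    """Mean symmetric nearest-neighbour distance between the target and the
    union of its images under the three candidate maps (the collage)."""
    collage = np.vstack([scale * target + (1.0 - scale) * v for v in VERTS])
    d = np.linalg.norm(target[:, None, :] - collage[None, :, :], axis=2)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

target = chaos_game(0.5)

# Brute-force "encoder": try a few contraction ratios, keep the best one.
best = min(np.arange(0.30, 0.71, 0.05), key=lambda s: collage_error(target, s))
print("best contraction ratio found:", round(float(best), 2))   # expect 0.5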

> As far as getting more resolution out than you put in goes, surely you
> realize that this is nonsense.  You are certainly able to do the
> calculations, but the added information is just arbitrary and won't
> necessarily have anything to do with the original information (I don't
> think).  In order to reproduce the image at the original resolution,
> you only need a certain amount of "accuracy" in your coefficients.

        What I meant is that you can reproduce your image at any resolution.
The fractal equations are resolution independent, so this scheme works as an
image enlarger/reducer.

        I have my doubts about whether we can really encode images in real
time with this technique.  However, I think this technique is very useful
for archiving images and in situations where transmission of images is
expensive (such as, say, from a space probe).  BTW, the system is more error
"resistant": a 10% error in the data yields less than a 10% error in the
image.

Kenneth.Maier@branch.FIDONET.ORG (Kenneth Maier) (08/17/89)

Spencer,
  I seem to have jumped in during the middle of a conversation, but are
you referring to IFS (Iterated Function Systems)?  
  
  -Kenneth


--  
Kenneth Maier via FidoNet node 1:369/11
UUCP: {attctc,<internet>!mthvax}!ankh,novavax!branch!Kenneth.Maier

mfinegan@uceng.UC.EDU (michael k finegan) (08/17/89)

rich@inmet writes:

>I suppose people are referring to the Michael Barnsley's demo at the AT&T Pixel
>Machine booth.
~
~
~
>Remember decompressing really means applying fractal equations over and over 
>again (like in painting a Mandelbrot).  The amazing thing is that he was 
>"decompressing" the pictures at video rate: 22 pics per second.  

    I have seen the Pixel Machine ads: "up to 820 MFLOPS ...".  Is the fact
that he could decompress quickly on a Pixel Machine meaningful?  Wouldn't
you have to know how many DSP32's (~10 MFLOPS each) were being used (i.e.
how many MFLOPS) and then talk about how many Mflop per (row x col) picture
it takes to decompress?  At 820 MFLOPS the demo you saw could have been
~37 Mflop per picture to decompress (size of images?), which might be a
higher degree of complexity than for the discrete cosine transform, etc.

    For comparison, the AT&T DSP Review (Winter '89) shows an example of the
use of a DSP16A (integer-based, ~30 MIPS) to perform the DCT of full-color
(24 bits/pixel) images, with compression of 32:1.  That would transform a
3/4 Meg input file (512 x 512 x 24bpp) into a ~24K output file.  The
transformations (compress or decompress) each take 1/2 second (at ~30 MIPS).

    While MIPS and MFLOPS are different animals, the DSP16's could also be
used in parallel, and at 820 MIPS the DCT methodology would yield ~54
pics per second (compress or uncompress).  This makes the fractal
decompression look pretty good (1/2 the performance of current methods);
what about fractal compression?  I was under the impression that encoding
the image using 'fractal analysis' took many orders of magnitude longer
than the decoding ...
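
For what it's worth, the back-of-the-envelope figures above are easy to
reproduce; the numbers are the ones quoted in this thread, and only the
units bookkeeping is mine.

# Rough bookkeeping of the figures quoted above.
PIXEL_MACHINE_MFLOPS = 820.0
FRACTAL_FPS = 22.0
print("fractal decode budget: ~%.0f Mflop per picture"
      % (PIXEL_MACHINE_MFLOPS / FRACTAL_FPS))

DSP16A_MIPS = 30.0
DCT_SECONDS_PER_PIC = 0.5                  # per 512 x 512 x 24bpp transform
raw_bytes = 512 * 512 * 3                  # the 3/4 Meg input file
print("DCT output at 32:1 compression: ~%dK bytes" % (raw_bytes / 32 / 1024))

chips_for_820_mips = 820.0 / DSP16A_MIPS   # DSP16As running in parallel
print("parallel DCT rate: ~%d pics per second"
      % (chips_for_820_mips / DCT_SECONDS_PER_PIC))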

				      -	Mike Finegan

					finegan@aicv01.ece.UC.EDU
					mfinegan@uceng.UC.EDU

schear@ttidca.TTI.COM (Steve Schear) (08/18/89)

In article <13147@well.UUCP> gors@well.UUCP (Gordon Stewart) writes:
>It seems to me that this is an ENCODING technique -- one that does not
>preserve 100% of the information input.
>
>Am I wrong?
>
No, you are correct.  Fractal compression attempts to vectorize a continuous
image by identifying objects, or segments of objects, which can be coded
with the coefficients of an affine transform.  It works well on most
"natural" objects, since they are generally constructed by processes which
are recursive (e.g., cell division, crystal growth) or operate locally
(e.g., weathering).

Natural objects have been found to display a great deal of fractional
dimensional similarity over several orders of magnitude.  So, once you can
identify a natural object in an image for which you have already determined
its affine transform and other IFS (Iterated Function System) information,
you can (with relatively good accuracy) render that object at various
resolutions, even ones which greatly exceed the resolution of the original
image capture system.

One may argue, and quite correctly, that these reconstructed images are not
identical to the original images (especially those which have enhanced
resolution).  I think that they are more akin to an artist's (albeit a very
detail-oriented artist's) rendition of the original scene.  This may in many
applications be more than sufficient.

mdeale@cosmos.acs.calpoly.edu (Myron Deale) (08/18/89)

In article <1934@uceng.UC.EDU> mfinegan@uceng.UC.EDU (michael k finegan) writes:
>rich@inmet writes:
>>Remember decompressing really means applying fractal equations over and over 
>>again (like in painting a Mandelbrot).  The amazing thing is that he was 
>>"decompressing" the pictures at video rate: 22 pics per second.  
>
>    While MIPS and MFLOPS are different animals, the DSP16's could also be
>used in parallel, and at 820 MIPs the DCT methodology would yield ~54
>pics per second (compress or uncompress). This makes the fractal decompression
>look pretty good (1/2 performance of current methods); what about fractal
>compression? I was under the impression that encoding the image using 'fractal
>analysis' took many orders of magnitude longer than the decoding ...
>				      -	Mike Finegan
>					mfinegan@uceng.UC.EDU

   Hey, not to mention the Inmos A300 (or some such) DCT chip.  It is
supposed to be rated at 320 MIPS, because it has something like 8
multiplier/accumulators on-chip.  I can't remember if I read it right or
not, but they may have mentioned performing the transform fast enough to
handle real-time video rates, e.g. 20 MHz (the chip clock rate).  Cost was
around $80 (in 1000s), expected to go down to $25.

   Not to put down the "fractal compression" technique, but it seems to need
more development, and to be more open to public and/or scientific scrutiny.
On the flip side, DCT is a known quantity.


-Myron
// mdeale@cosmos.acs.calpoly.edu

chucko@saturn.ucsc.edu (Chuck Stein) (08/18/89)

	I have heard no talk of the fidelity of the reconstructed fractally
compressed images.  Has Barnsley or anyone else compared reconstructed
images with those produced using other techniques such as the DCT?  I
haven't seen the latest pictures, but the old ones look quite
impressionistic.  MSE-based signal-to-noise ratio fidelity metrics are poor
when comparing across different compression techniques, but subjective
rating-scale tests or psychophysical tests can be performed to compare
between methods.  What kind of fidelity measures do they talk about (if
they do)?
	
	Chuck Stein	{chucko@saturn.ucsc.edu}

peter@mit-amt.MEDIA.MIT.EDU (Peter Schroeder) (08/20/89)

In article <8770@saturn.ucsc.edu> chucko@saturn.ucsc.edu (Chuck Stein) writes:
>
>	I have heard no talk of fidelity of the reconstructed fractally
>compressed images.  Has Barnsley or anyone else compared reconstructed
>images with those produced using other techniques such as the DCT?  I haven't
>seen the latest pictures, but the old ones look quite impressionistic.
>MSE-based signal-to-noise ratio fidelity metrics are poor when comparing
>across different compression techniques, but subjective rating-scale
>tests or psychophysical tests can be performed to compare between methods.
>What kind of fidelity measures do they talk about (if they do)?
>	
>	Chuck Stein	{chucko@saturn.ucsc.edu}


I saw some of the latest images at the Computers and Mathematics conference
and must say I am very impressed!  I also thought the earlier ones looked
just like painters' impressions of the original photograph.  This time he
showed, among other things, a photo of himself and the fractally
reconstructed version of it.  The only difference that I could detect was
an ever so slight blurring, which they might have added after the
reconstruction itself.  Supposedly it was a compression ratio of 1:128,
which is as good as the best compression techniques for movies (!) that I
know about.  Since fractal compression is still in its infancy I suspect
they might do even better, while the other techniques have to push very
hard now to reach anything above 1:100...

The main question that remains for me at this point is whether it will
work with sequences of images, that is, over a 2D x time domain.  It turns
out that it is nontrivial to make a technique which works for a single
picture work for a sequence of pictures without artifacts that only
become visible in the sequence.

Peter

peter@media-lab.media.mit.edu

bpendlet@bambam.UUCP (Bob Pendleton) (08/22/89)

From article <1934@uceng.UC.EDU>, by mfinegan@uceng.UC.EDU (michael k finegan):
> rich@inmet writes:

>     For comparison, the AT&T DSP Review (Winter '89) shows an example of the
> use of a DSP16A (integer based ~ 30 MIPs) to perform DCT of full-color (24
> bits/pixel) images, with compression of 32:1. 

DCT? Could some kind person post a definition of DCT? A brief
description of the basic principles would be nice as would a pointer
to a survey article on the subject. 

		Thanks

			Bob P.
-- 
              Bob Pendleton, speaking only for myself.
UUCP Address:  decwrl!esunix!bpendlet or utah-cs!esunix!bpendlet

           Reality is stranger than most people can imagine

schear@ttidca.TTI.COM (Steve Schear) (08/22/89)

In article <20400002@inmet> rich@inmet writes:
>
>
>	What I meant is you can reproduce your image at any resolution.  The 
>fractal equations are resolution independent.  So this scheme works as an image
>enlarger/reducer.
>
Well, while it is true that the fractal algorithms can generate images over
any arbitrarily large range of scale, the resulting images may bear little
resemblance to the objects being modeled.  Work by Alex Pentland has shown
that most natural images exhibit fractal similarity generally over a range
of between 4:1 and 8:1.  That is, knowing the fractal dimension of an
object, one can, with some degree of confidence, predict the fractal
dimension of that same object at (for instance) one-fourth the known scale.
Zooming to greater magnifications will typically yield only "imaginative"
results.

mitchell@cbmvax.UUCP (Fred Mitchell - QA) (08/24/89)

I am very interested in learning about these fractal image compression
algorithms.  Can you point me in the right direction as to which books,
articles, etc. I should delve into?  Thanks.

-- 

                                   |*******************************************|
	-Compliments of	       /// |* All thoughts and comments are soley     *|
	 Fred Mitchell	   \\\///  |* thoses of The Author and has nothing to *|
			    \XX/   |* do with Commodore-Amiga.		      *|
   Software QA - Commodore-Amiga   |*******************************************|

ashok@atrp.mit.edu (Ashok C. Popat) (08/24/89)

In article <127@bambam.UUCP> bpendlet@bambam.UUCP (Bob Pendleton) writes:
>
>DCT? Could some kind person post a definition of DCT? A brief
>description of the basic principles would be nice as would a pointer
>to a survey article on the subject. 

Transform coding is discussed at length in:

N.S. Jayant and P. Noll, _Digital Coding of Waveforms: Principles and
Applications to Speech and Video_, Prentice-Hall, Englewood Cliffs, N.J.,
1984, Chapter 12.

The seminal paper on the DCT is:

N. Ahmed, T. Natarajan and K.R. Rao, "Discrete Cosine Transform,"
IEEE Trans. on Computers, pp. 90-93, Jan. 1974.

The following two papers are on related topics:

H.S. Malvar and D.H. Staelin, "Reduction of Blocking Effects in
Image Coding with a Lapped Orthogonal Transform," Proceedings from
International Conf. on Acoustics, Speech, and Signal Processing,
New York, April 11-14, 1988.  Also see related article by the same
authors in the Feb '89 IEEE Trans. on Commun.

J.W. Woods and S. O'Neil, "Subband Coding of Images," IEEE Trans. Acoust.,
Speech, Signal Processing, Oct. 1986, pp. 1278-1288.

Now here's my attempt to explain the DCT without going into too much math:

DECORRELATION AND ENERGY-COMPACTION:

Suppose we have an image that consists of 512 by 512 pixels, where
each pixel takes on a continuous real value.  Our goal is to digitize
the image efficiently --- i.e., represent it using not more than X
bits with as great fidelity as possible.

Think of our image as being comprised of contiguous,
mutually-exclusive tiles or blocks, each block consisting of, say, 8
by 8 pixels.  Hence our 512 by 512 image contains 4096 such blocks.
Each block can be regarded as a vector in 64-dimensional space.  Now
if we were to do a statistical analysis of the components of the
blocks, we would find that there is some correlation between
components, and that all components have about the same variance.  In
order to digitize the components efficiently, we'd have to use a
relatively complicated scheme that takes advantage of the correlation
between components in each block, and we'd probably want to use the
same quantization scheme for all of the components in a block.
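
If it helps, here is the blocking step in a few lines of Python -- a sketch
on a synthetic image, since I have no real image data handy; the printed
numbers are only illustrative.

import numpy as np

rng = np.random.default_rng(1)

# Synthetic 512 x 512 "image": slowly varying cosines plus a little noise,
# so that neighbouring pixels are correlated.
N = 512
u = np.arange(N)
img = (np.cos(2 * np.pi * u / 128)[:, None] * np.cos(2 * np.pi * u / 96)[None, :]
       + 0.1 * rng.standard_normal((N, N)))

# Cut it into 8 x 8 blocks and treat each block as a vector in 64-dim space.
blocks = img.reshape(N // 8, 8, N // 8, 8).transpose(0, 2, 1, 3).reshape(-1, 64)
print(blocks.shape)                     # (4096, 64): 4096 blocks, 64 components

# The components have comparable variances, and adjacent ones are correlated.
cov = np.cov(blocks, rowvar=False)      # 64 x 64 autocovariance matrix
print("adjacent-component correlation: %.2f"
      % (cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])))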

An alternative is to apply an energy-preserving transformation that
gets rid of the correlation between components prior to digitization.
A necessary consequence of the energy-preserving decorrelation of
components is the "compaction" of energy --- the components no longer
have the same variance, although the *sum* of the variances is the
same as before.  In fancy language we are saying that the
transformation is unitary and that it diagonalizes the autocovariance
matrix.

The fact that the components no longer have the same variance allows a
greater number of quantization bits to be allocated to the components
that "need" them most, or alternatively, it allows the "unimportant"
components to be simply discarded. And since the components are now
uncorrelated, little would be gained by using anything other than
scalar (Lloyd-Max) quantization to accomplish the digitizing.  This is
the basic idea behind transform coding.

THE TRANSFORMATION:

If we assume that the image is well-modeled as a wide-sense stationary
random process (often assumed for simplicity in analysis), then the
transformation that achieves perfect decorrelation is the
Karhunen-Loeve transform (KLT).  The columns of the forward KLT
matrix are simply the eigenvectors of the autocovariance matrix for
the blocks.

The KLT is, in general, difficult to compute.  This is where the DCT
comes in.  The rows of the DCT matrix are simply sampled cosines, and
hence it is easy to compute.  The DCT has been found to give very good
energy-compaction in image coding.  Also, when a first-order
Gauss-Markov model is used to describe images (where only
adjacent-pixel correlation is accounted for), the performance of the
DCT approaches that of the KLT as the correlation approaches unity.
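
For the curious, the 1-D orthonormal DCT matrix is easy to write down from
the "sampled cosines" description above, and the energy compaction shows up
directly on a first-order Gauss-Markov source.  This is a small Python
sketch of my own; the correlation value 0.95 is just an example.

import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix: row k is a sampled cosine of frequency k."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

C = dct_matrix(8)
print(np.allclose(C @ C.T, np.eye(8)))     # unitary: True

# First-order Gauss-Markov source with adjacent-sample correlation 0.95.
rng = np.random.default_rng(2)
rho, n_blocks = 0.95, 20000
x = np.empty((n_blocks, 8))
x[:, 0] = rng.standard_normal(n_blocks)
for i in range(1, 8):
    x[:, i] = rho * x[:, i - 1] + np.sqrt(1 - rho ** 2) * rng.standard_normal(n_blocks)

y = x @ C.T                                # transform each length-8 block
print(np.round(x.var(axis=0), 2))          # roughly equal variances
print(np.round(y.var(axis=0), 2))          # energy piled into the first few coefficients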

THE LOT AND QMFs:

For a given number of transform coefficients, greater
energy-compaction can be achieved by allowing the transformation
domain to extend beyond the block boundaries. Also, the problem of
"block artifacts" (the visibility of the block boundaries resulting
from abrupt changes in the way the coefficients are quantized from
block to block) is mitigated in this way.  This leads to the
Lapped-orthogonal transform (LOT) and quadrature-mirror filter (QMF)
banks.  When the latter is employed, the term "transform coding" is no
longer used; rather, "subband coding" is used.  QMFs go back to 1977
and LOTs to 1984, but their application to image coding is only
beginning to become widespread.

There are a number of variables in a DCT coder, including the size of
the blocks, whether the coder is adaptive, the manner in which the
coefficients are quantized, and whether the rate is fixed or variable.
Also, it should be mentioned that there are two varieties of DCT,
even-symmetrical and odd-symmetrical, but the former is nearly always
used.
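
To tie the pieces together, here is a toy fixed-rate zonal coder for a
single 8 by 8 block.  Again, this is a Python sketch of my own devising; the
zone shape and the quantizer step size are arbitrary choices for
illustration, not anyone's actual coder.

import numpy as np

def dct_matrix(n=8):
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

C = dct_matrix(8)
ZONE = np.add.outer(np.arange(8), np.arange(8)) < 4  # keep 10 low-freq coefficients
STEP = 0.05                                          # uniform quantizer step size

def code_block(block):
    """2-D DCT, drop the high-frequency zone, quantize uniformly."""
    coeff = C @ block @ C.T
    return np.where(ZONE, np.round(coeff / STEP), 0.0)  # integers to be transmitted

def decode_block(kept):
    return C.T @ (kept * STEP) @ C                       # dequantize, inverse 2-D DCT

# Try it on a smooth synthetic 8 x 8 block.
u = np.arange(8)
block = np.cos(2 * np.pi * u / 16)[:, None] * np.cos(2 * np.pi * u / 24)[None, :]
rec = decode_block(code_block(block))
print("rms error: %.4f" % np.sqrt(np.mean((block - rec) ** 2)))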

Ashok Chhabedia Popat   MIT Rm 36-665 (617) 253-7302

chris@well.UUCP (Chris Sears) (08/26/89)

In article <13785@bloom-beacon.MIT.EDU>, ashok@atrp.mit.edu (Ashok C. Popat) writes:
> In article <127@bambam.UUCP> bpendlet@bambam.UUCP (Bob Pendleton) writes:
> >
> >DCT? Could some kind person post a definition of DCT? A brief
> >description of the basic principles would be nice as would a pointer
> >to a survey article on the subject. 
> 

Indeed, ashok@atrp.mit.edu was kind.  Are there any other kind persons
who have some DCT code that I can ftp?

--Chris Sears

E-mail:     {apple,hplabs,pacbell,ucbvax}!well!chris
            chris%well.uucp@lll-crg.llnl.gov
Phone:      1.415.826.2781
Address:    1921 23rd Street
            San Francisco, CA 94107