[comp.compression] Graph of performance of data compression programs.

ross@spam.ua.oz.au (Ross Williams) (06/21/91)

DATA COMPRESSION STATE OF THE ART GRAPH 21-JUN-1991
===================================================
I used the information posted by Peter Gutmann (peter_gutmann@kcbbs.gen.nz)
to hand plot this performance graph of compression algorithms. The graph is
only rough but gives a good idea of what is going on out there. The capital
letter of each compression algorithm is its position on the graph. I didn't
plot algorithms with no listed speed (except for SAKDC which gives the best
compression).

     0         4         8        16        32        64       128       256
     +---------+---------+---------+---------+---------+---------+---------+
     |BAD                                                                  |
   5 +                                                                     + 5
     |    Comp-1                      Pack                                 |
     |                                                            Lzrw1    |
     |                                                                     |
     |                                                                     |
   4 + Comp-1a                                                             + 4
     |           Brent                Stuffit1     Zoo2 Compress DwcPkarc  |
Bits/|Comp-2(1)                       Disdoub Pak1.00**********************|
Byte |                                          *****......................|
     |          Stuffitcl               Lharc1.13c.........................|
   3 +                        rj1       Pak2 Pkzip1.10.....................+ 3
     |                     CoApactor   Lharc2.11...........................|
     |                      ************Arj2...............................|
     |Sakdc*****************...............................................|
     |.................................................................GOOD|
   2 +---------+---------+---------+---------+---------+---------+---------+
     0         4         8        16        32        64       128       256
              Compression Speed in Kilo (1024) Bytes Per Second

****** = State of the art. The dotted zone is where we want to go.

This graph is probably misleading because some of the programs were
run on a Mac-SE and some on a PC386/25.

I am involved in designing high-performance algorithms but am finding
it difficult to work out what the state of the art is. Some algorithms
are written in C, some in machine code. Algorithms are run on PCs,
Macs, Suns, and other machines. It would be nice to standardize
somehow.

I'm starting to think that the solution to my problem is to buy a PC.
This is an alien concept as I am used to thinking of PCs in terms of
problems not solutions :-).

Has anyone out there got any sort of objective speed comparison
information for the Macintosh and PC range? I have a Mac-SE20. Maybe
we can replot the graph using some abstract CPU power measure.

Ross Williams
ross@spam.ua.oz.au

PS: The graph seems to indicate that LZRW1 is a turkey, losing out to
PKARC. I have received other indications that LZW algorithms can beat
LZRW1 on speed and compression but I can never be sure because of the
difficulty in comparing speeds. Even if it is beaten, there may be a
niche for LZRW1 because of its simplicity, its lack of patent
problems, and its low memory requirement (16K).

mycroft@kropotki.gnu.ai.mit.edu (Charles Hannum) (06/22/91)

In article <867@spam.ua.oz> ross@spam.ua.oz.au (Ross Williams) writes:

   I am involved in designing high-performance algorithms but am finding
   it difficult to work out what the state of the art is. Some algorithms
   are written in C, some in machine code. Algorithms are run on PCs,
   Macs, Suns, and other machines. It would be nice to standardize
   somehow.

I hate to pick nits, but the paragraph above confused the hell out of
me.  An 'algorithm' is written in pseudecode or a natural language;
never in a programming language.  I think we need to differentiate
between an 'algorithm' and an 'implementation of an algorithm'.  When
we start thinking of an algorithm as inseparable from its
implementation, then we get such meaningless benchmark results as the
ones you referenced.