[comp.binaries.ibm.pc.d] Archiver performance

davidsen@steinmetz.ge.com (Wm. E. Davidsen Jr) (03/30/89)

Here is an article I wrote for the _CAMS News_ describing some
measurements I made of the various archivers around. The test conditions
are described and you are free to make any tests you want, show
different results, and post them if you like.

A more complete form of this is on a BBS, containing more tables, lists,
and other boring detail of interest only to those who might want to
repeat the tests.
  System	*IX BBS
  Phone		518-346-8033
  login		bbs
  board		mbs
  area		archivers
________________________________________________________________
                          The PC Archiver Wars

  Two years ago I reported that there was a "war" for the title of best
archive program between three programs called "pkarc," "zoo," and "dwc."
Since then there have been new contenders and new versions, and it's
time to cover it again.

What's an archiver?
===================

  An archiver is a program which allows you to combine a number of files
into a single file, and to compress the data to make it smaller. This is
not only convenient, it saves a lot of disk space: the total size is
smaller, and because MS-DOS wastes some space for each file, keeping
only one file wastes less.
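
  The per-file waste comes from DOS handing out disk space in whole
clusters, so the last cluster of nearly every file is partly empty. A
minimal Python sketch of that arithmetic follows; the 2048-byte cluster
size and the file sizes are assumed values for illustration, not
measurements from this article, and the sketch ignores both compression
and the archive's own header overhead.

```python
# Assumed cluster size; the real value depends on the disk and DOS version.
CLUSTER = 2048


def allocated(size, cluster=CLUSTER):
    """Bytes of disk actually used by a file of `size` bytes."""
    clusters = -(-size // cluster)  # integer division, rounded up
    return clusters * cluster


def slack(sizes, cluster=CLUSTER):
    """Total bytes wasted in partly filled final clusters."""
    return sum(allocated(s, cluster) - s for s in sizes)


# Hypothetical file sizes in bytes.
sizes = [1200, 5000, 300, 9100, 2048]

separate = slack(sizes)          # five files, five final clusters
combined = slack([sum(sizes)])   # one file holding the same bytes

print("slack as separate files:", separate)
print("slack as one file:      ", combined)
```

  With these assumed numbers, five separate files waste 4880 bytes in
cluster slack, while one combined file of the same total size wastes 784.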

A Brief History
===============

  Ten years ago, CP/M users had programs which made a file smaller,
such as SQ, and programs which put a number of files into a single
file, like LBR and FARK. About six years ago a program called
"compress" was introduced to the UNIX world, to reduce the cost of
shipping messages around the world. About five years ago Thom
Henderson, doing business as SEAware, introduced a program called ARC.
It knew about a number of compression algorithms, and tried several on
a file before using the best one to compress the data. This was
somewhat of a breakthrough, and it became the standard for distribution
of compressed data.

  About a year later, Phil Katz, calling himself PKware, released a new
archiver called PKARC, which created and read ARC format files far
faster than the original ARC program. Steve Manes' "Magpie" BBS was a
hotbed of activity, with Rahul Dhesi releasing new versions of ZOO, Dean
W. Cooper releasing new versions of DWC, and Phil Katz releasing new
versions of PKARC.

  In the last two years SEAware has had a lawsuit with PKware. The two
companies, about the same size, were contending for the same shareware
market, and SEAware felt that PKware had infringed some of its
copyrights and stolen some of the code (SEAware made the source code to
ARC available, so it could be used on non-DOS systems). Reports say that
at the start of the trial an expert witness testified after examining
the source code to both programs, and PKware settled out of court by
giving up essentially all of the rights and revenues of PKARC, and
promising not to ever make a program which created ARC format files.

What's Available Now
====================

  In addition to ARC and old copies of PKARC, Phil Katz is back with a
program called PKZIP, ZOO is at v2.01, the two year old version of DWC
is still competitive, a new program called GSARC from NoGate Consulting
is available, and a program called LHARC by Haruyasu Yoshizaki has
recently become available. Each of these programs offers a mixture of
speed, efficiency of compression, and portability.

Benchmarking the Programs
=========================

  I tested these programs on a 16MHz 386 running MS-DOS v3.20. I tried
compressing a bunch of executable files (programs for viewing GIF
images), text files (documentation for the exe's), and partially
compressed binary data (GIF images).

  The versions were: ARC v5.21 (has anyone seen 6.0 yet?), PKARC v3.5,
also tested with the "-oc" option to produce a portable output file,
PKZIP v0.90 using the "do it fast" and "do your best" options, DWC
vA4.9, GSARC v1.0, and LHARC v1.0e.

  None of the archivers got more than 1% compression on the collection
of GIF files, so I won't report the results in this table. The
executable files were ten files totalling 336341 bytes, the
documentation was 7 files totalling 161348 bytes.

             Results for Executable Files and Documentation

		-- executables --       - documentation -
Program		sec	bytes	%	sec	bytes	%
arc		59.48	266898	21	24.31	72809	56

pkarc -oc	19.06	258147	24	 4.25	73212	55
pkarc		22.70	258147	24	 3.66	70066	57

pkzip		14.49	256554	25	 3.67	70307	57
pkzip -ea4 -eb4	31.28	215511	37	11.04	67105	59

zoo		26.47	261511	22	 6.63	71459	56

dwc		19.35	259052	24	 5.57	69929	57

gsarc		34.87	238823	30	12.53	69075	58

lharc		43.82	200959	40.4	20.61	58541	63.9
================================================================
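
  For anyone repeating the tests, the figures in the table can be turned
into two derived numbers: the share of the input removed, and the input
bytes processed per second. The Python sketch below does this from the
sizes and times reported above; the function names are illustrative.
Percentages recomputed from the byte counts alone can come out a point
or so different from the % column, whose derivation is not stated here.

```python
EXE_BYTES = 336341   # ten executable files
DOC_BYTES = 161348   # seven documentation files

# program: (exe seconds, exe archive bytes, doc seconds, doc archive bytes)
RESULTS = {
    "arc":             (59.48, 266898, 24.31, 72809),
    "pkarc -oc":       (19.06, 258147,  4.25, 73212),
    "pkarc":           (22.70, 258147,  3.66, 70066),
    "pkzip":           (14.49, 256554,  3.67, 70307),
    "pkzip -ea4 -eb4": (31.28, 215511, 11.04, 67105),
    "zoo":             (26.47, 261511,  6.63, 71459),
    "dwc":             (19.35, 259052,  5.57, 69929),
    "gsarc":           (34.87, 238823, 12.53, 69075),
    "lharc":           (43.82, 200959, 20.61, 58541),
}


def saved_percent(original, archived):
    """Share of the original size removed by compression, in percent."""
    return 100.0 * (original - archived) / original


def throughput(original, seconds):
    """Input bytes processed per second of run time."""
    return original / seconds


for name, (exe_s, exe_b, doc_s, doc_b) in RESULTS.items():
    print(f"{name:16s}"
          f" exe {saved_percent(EXE_BYTES, exe_b):5.1f}%"
          f" {throughput(EXE_BYTES, exe_s):8.0f} B/s"
          f"   doc {saved_percent(DOC_BYTES, doc_b):5.1f}%"
          f" {throughput(DOC_BYTES, doc_s):8.0f} B/s")

fastest_exe = min(RESULTS, key=lambda n: RESULTS[n][0])
smallest_exe = min(RESULTS, key=lambda n: RESULTS[n][1])
print("fastest on executables: ", fastest_exe)
print("smallest on executables:", smallest_exe)
```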

                             Other Features

program		source	portability	shareware

arc		yes	V Good		request
pkarc		no	DOS only	request
pkzip		no	DOS only	request
zoo		yes	Excellent	no
dwc		yes	DOS only(1)	no
gsarc		no	DOS only	yes (2)
lharc		no	DOS only	no

1) Should port easily to UNIX, Mac, AmigaDOS. I had a UNIX version which
   pretty much worked.
2) No language demanding that you stop using it if you don't pay. Regular
   version is $15, full screen version (real nice) is $30, programmers
   library of data compression routines is $50.
================================================================

  PC users have a lot of choices; it's hard to find a bad program among
this bunch. The war is still on and I expect to see new programs coming
out. All of the archivers seem to be based on Lempel-Ziv-Welch
compression so far, but there is supposed to be one based on splay trees
available soon.

  These programs and the text of this article with tables of files
tested, the GIF compression results, and some other data are available
from *IX BBS, in the "archivers" section of the MBS board (login as
bbs, select mbs on the BBS menu at login).


-- 
	bill davidsen		(wedu@crd.GE.COM)
  {uunet | philabs}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

las) (03/31/89)

In article <13488@steinmetz.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
>Here is an article I wrote for the _CAMS News_ describing some
>measurements I made of the various archivers around...

>  None of the archivers got more than 1% compression on the collection
>of GIF files, so I won't report the results in this table...

Thanks for your updated report, Bill.  I have no flame, no complaint -
just a comment about GIF files: it's not surprising that they don't
compress well, I understand that they are (or rather, the raster data is)
already compressed (using Lempel-Ziv, I think).

regards, Larry
-- 
Signed: Larry A. Shurr (att!cbnews!cbema!las or osu-cis!apr!las)
Clever signature, Wonderful wit, Outdo the others, Be a big hit! - Burma Shave
(With apologies to the real thing.  The above represents my views only.)

davidsen@steinmetz.ge.com (Wm. E. Davidsen Jr) (04/01/89)

In article <5255@cbnews.ATT.COM> cbema!las@cbnews.ATT.COM (Larry A. Shurr) writes:
	[ ... ]
| just a comment about GIF files: it's not surprising that they don't
| compress well, I understand that they are (or rather, the raster data is)
| already compressed (using Lempel-Ziv, I think).

  Yes. I think GIF is 12 bit LZ; I thought I'd check to see if it could
be improved. If you get the full report you will note that the files got
much bigger. I actually got some compression on LZW output using an
adaptive Huffman coder I wrote some time ago. Not readily portable to
the PC world, but I did get about 20% (using a Cray2 for 64 bit
arithmetic).
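
  That already compressed data does not compress much further is easy to
reproduce with any general-purpose compressor. The Python sketch below
uses zlib (DEFLATE) purely as a stand-in; it is not the 12 bit LZ used
by GIF or the LZW in the archivers tested, and the sample text is
generated, not taken from the test files. The first pass shrinks the
text considerably; the second pass over the already compressed bytes
gains essentially nothing.

```python
import random
import zlib

# Generated sample text (an assumption for illustration only).
rng = random.Random(0)
words = ["archive", "compress", "file", "disk",
         "program", "data", "table", "byte"]
raw = " ".join(rng.choice(words) for _ in range(20000)).encode("ascii")

once = zlib.compress(raw, 9)    # first pass: plain text in
twice = zlib.compress(once, 9)  # second pass: compressed bytes in

print("original bytes:      ", len(raw))
print("compressed once:     ", len(once))
print("compressed a 2nd time:", len(twice))
```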

  ZOO was the clear winner for GIF files... using the 'f' option it was
faster than any other and produced the smallest file by quite a bit.
-- 
	bill davidsen		(wedu@crd.GE.COM)
  {uunet | philabs}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

deng@shire (Mingqi Deng) (04/01/89)

In article <13488@steinmetz.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
>Here is an article I wrote for the _CAMS News_ describing some
>measurements I made of the various archivers around...

Just want to make two additional comments from my testing with LHARC
and ZIP:

   LHARC's performance is consistent when used with large and small
files, which is unfortunately not the case with PKZIP. Some big "tough"
files on which ZIP performed worse than PKPAK (using the optimal option
-ex in PKZIP 0.92) are handled well by LHARC, too.
   
   LHARC usually achieves 10% more compression than ZIP does. This
means LHARC is about 25% better than PKPAK in compression ratio.

Mingqi