[comp.sys.ibm.pc] PK ARC, *Another* View

root@hobbes.UUCP (John Plocher) (07/18/87)

FLAME ON > /usr/phil_katz/bitbucket

      The general internal layout of an ARC file is:
    
    [ 26 ][ storage ][...header info...][...data...] [ 26 ][ storage ]...

where	26 (^Z)		is a flag marking this as the beginning of an ARC entry
	storage		is a number from 0..8 specifying how the data is compressed
	header info	contains the filename, date, real and compressed sizes
	data		is the contents of the file named in the header
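The layout above can be sketched as a small C reader. Everything beyond the diagram is an assumption of mine: the field order (13-byte filename, little-endian compressed size, date, time, CRC-16, then original size) follows the commonly documented ARC header, and the names `arc_header` and `parse_arc_header` are illustrative, not SEA's.

```c
#include <stdint.h>
#include <string.h>

/* little-endian field readers */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p) { return rd16(p) | ((uint32_t)rd16(p + 2) << 16); }

typedef struct {
    uint8_t  method;       /* storage type, 0..8 */
    char     name[14];     /* DOS filename, NUL-terminated */
    uint32_t comp_size;    /* size of the data that follows */
    uint16_t date, time;   /* DOS-format timestamp */
    uint16_t crc;          /* CRC-16 of the original file */
    uint32_t orig_size;    /* uncompressed size */
} arc_header;

/* Parse one entry header; returns bytes consumed, or -1 if the ^Z
 * marker is missing.  Type 0 means end of archive; type 1 ("old
 * style") headers carry no separate original-size field. */
int parse_arc_header(const uint8_t *buf, arc_header *h)
{
    if (buf[0] != 26) return -1;          /* ^Z flags an ARC entry */
    h->method = buf[1];
    if (h->method == 0) return 2;         /* end of archive */
    memcpy(h->name, buf + 2, 13);
    h->name[13] = '\0';
    h->comp_size = rd32(buf + 15);
    h->date      = rd16(buf + 19);
    h->time      = rd16(buf + 21);
    h->crc       = rd16(buf + 23);
    if (h->method == 1) {                 /* old style: sizes are equal */
        h->orig_size = h->comp_size;
        return 25;
    }
    h->orig_size = rd32(buf + 25);
    return 29;
}
```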

Storage types are defined (by SEA, the creator of ARC) as
	0 = End of archive
        1 = Old style, no compression (real size == compressed size)
        2 = New style, no compression (real size == compressed size)
        3 = Compression of repeated characters only
        4 = Compression of repeated characters plus Huffman SQueezing
        5 = Lempel-Ziv packing of repeated strings (old style)
        6 = Lempel-Ziv packing of repeated strings (new style)
        7 = Lempel-Ziv-Welch packing with improved hash function
        8 = Dynamic Lempel-Ziv-Welch packing with adaptive reset (12 bit)
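For reference, the list above mechanized as a C enum (the identifier names are mine, not SEA's):

```c
/* The storage-type byte, per the list above (names are illustrative) */
enum arc_method {
    ARC_END        = 0,  /* end of archive */
    ARC_STORED_OLD = 1,  /* old style, no compression */
    ARC_STORED     = 2,  /* new style, no compression */
    ARC_PACKED     = 3,  /* repeated-character (RLE) compression only */
    ARC_SQUEEZED   = 4,  /* RLE plus Huffman squeezing */
    ARC_CRUNCH_OLD = 5,  /* Lempel-Ziv, old style */
    ARC_CRUNCH_NEW = 6,  /* Lempel-Ziv, new style */
    ARC_CRUNCH_LZW = 7,  /* LZW with improved hash */
    ARC_CRUNCHED   = 8   /* dynamic LZW, adaptive reset, 12-bit table */
};
```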

    ARC implemented storage type 8 like this:

	    [ 26 ][ 8 ][...header...][ 12 ][...data...]
    
    i.e., the header was extended by one byte to specify the table size.

    They intended this to mean "Use Dynamic LZW compression (type 8) with
    a 12 bit table".

    The current versions of ARC quit if the table size is anything other
    than 12, but the program SOURCE CODE can be extended to handle different
    table sizes with little effort (change a few array bounds and create a
    variable for the table size).
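The "little effort" amounts to sizing the code tables from the header's table-size byte instead of hard-coding 4096 entries. A minimal sketch (the function name is my own):

```c
/* An n-bit LZW code table holds 2^n codes; ARC hard-codes n = 12,
 * i.e. 4096 entries.  Generalizing means computing the bound from
 * the table-size byte read out of the header instead. */
int table_entries(int bits)
{
    return 1 << bits;
}
```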

    Phil Katz implemented a version of ARC in asm.  It *is* fast.  He kept
    up with ARC.  He even added his own storage method.  Dynamic LZW
    compression with a *** 13 *** bit table.  He did NOT provide SOURCE CODE.
    
    How did he do it?  If you answered "By using type 8 ARC compression with
    the 12 changed to a 13",

	    [ 26 ][ 8 ][...header...][ 13 ][...data...]

    you are WRONG.  He used a NEW compression type!

	    [ 26 ][ 9 ][...header...][...data...]
		   ^^^
    If only he had implemented the 13 bit table in the way that ARC had
    intended it be added (by changing the 12 to a 13) ...
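The incompatibility can be made concrete. Under the scheme ARC intended, an extractor learns the table width from the header byte, so a 13-bit file would still parse; PK's type 9 carries no width byte at all, so older extractors can only reject the entry. A hedged sketch of that dispatch (illustrative code, not either program's actual source):

```c
/* Returns the LZW table width in bits for an entry, or -1 for a
 * storage type this reader does not know about. */
int table_bits(int method, int header_byte)
{
    switch (method) {
    case 8:  return header_byte;  /* explicit: 12 today, 13 tomorrow */
    case 9:  return 13;           /* PKarc Squash: implied, fixed */
    default: return -1;           /* unknown storage type */
    }
}
```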

    I can only think that PK deliberately CHOSE to be incompatible with the
    standard he was tracking.  He obviously understood how type 8 ARC
    storage worked so there is no excuse for his action.

    THIS is the reason I am biased against Phil Katz's PKarc program.  Not
    because of "religious" differences about which should be considered the
    leader, not because of the speed differences, not because of the
    "easier command line interface", but because of this intentional,
    designed-in lack of adherence to the standard.

    When IBM does this we all groan and say how foolish IBM is.  We moan
    about the shortsightedness of "Marketing".  Planned incompatibility
    (PC/XT/AT vs PS/2) is viewed as a futile means of cutting out the
    competition which just ends up hurting the user.  Why are things
    so different when Phil Katz does it?

echo EOT >> /usr/phil_katz/bitbucket

For what it is worth, Type 8 ARC compression is almost the same as what you
get when using "compress -b 12 file" under Un*x.  PKarc Squashing is almost
the same as "compress -b 13 file".  (I would go as far as saying that they
are the same, but I have not tried it... ARC D-LZW is based on the same
algorithm as compress.c :-)

John

-- 
John Plocher uwvax!geowhiz!uwspan!plocher  plocher%uwspan.UUCP@uwvax.CS.WISC.EDU