[net.micro.atari16] Printable Character Archiver

csc@watmath.UUCP (Computer Sci Club) (11/06/86)

I (with some others) am building a printable-character archiver called
"earthpig" (it's an aarchiver, :-)).  The figures below show how much its
printable-character encoding expands or shrinks a sample file.

The "test" file is /bin/vi.  Sizes are in bytes, and the figure in
parentheses is each file's size as a fraction of the original test file.
A file ending in ".pig" is an earthpig-encoded file, one ending in ".uue"
is a uuencoded file, and one ending in ".Z" is a "compress"ed file.

	test		131338
	test.Z		 70103	(0.533 of test)
	test.pig	143087	(1.090 of test)
	test.uue	180979	(1.378 of test)
	test.pig.Z	 73175
	test.uue.Z	 94691
	test.Z.pig	 96698	(0.736 of test)
	test.Z.uue	 96611	(0.735 of test)

For compressed data, it does about as well as uuencode (96698 bytes versus
96611).  For uncompressed data it's quite a lot better: test.pig is about
21% smaller than test.uue.  Small binaries (under 20K) shrink slightly.
C programs and text files shrink slightly or stay about the same.

Earthpig uses only these 71 characters:

	+-abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
	@.,;:=?*"'/!()_%&

This character set should make it through almost anything unchanged.  The
algorithm uses only table lookup, with no bit masking or shifting.
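
To give a feel for what encoding into this character set can look like,
here is a minimal sketch in Python.  It is NOT the earthpig algorithm, which
is not described in this posting.  It is a plain base-71 scheme that packs
each group of up to 3 bytes into one more character than it has bytes, and
its decoder skips any character outside the set.  The names encode, decode,
ENCODE_TABLE and DECODE_TABLE are made up for the example, and unlike
earthpig it uses multiplication and division as well as table lookup.

```python
# Illustrative base-71 codec; an assumption, not earthpig's real algorithm.
ALPHABET = (
    "+-abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "@.,;:=?*\"'/!()_%&"
)
BASE = len(ALPHABET)                      # 71 characters
ENCODE_TABLE = list(ALPHABET)             # digit value -> character
DECODE_TABLE = {ch: i for i, ch in enumerate(ALPHABET)}  # character -> digit


def encode(data: bytes) -> str:
    """Encode n-byte groups (n <= 3) as n + 1 characters each."""
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        value = 0
        for b in group:                   # build the group's integer value
            value = value * 256 + b
        digits = []
        for _ in range(len(group) + 1):   # peel off base-71 digits
            value, d = divmod(value, BASE)
            digits.append(ENCODE_TABLE[d])
        out.extend(reversed(digits))      # most significant digit first
    return "".join(out)


def decode(text: str) -> bytes:
    """Decode, ignoring spaces, newlines and any other foreign character."""
    digits = [DECODE_TABLE[ch] for ch in text if ch in DECODE_TABLE]
    out = bytearray()
    for i in range(0, len(digits), 4):
        group = digits[i:i + 4]
        if len(group) < 2:
            raise ValueError("truncated input")
        value = 0
        for d in group:
            value = value * BASE + d
        out.extend(value.to_bytes(len(group) - 1, "big"))
    return bytes(out)
```

Because 71^4 is larger than 256^3, four characters always have room for
three bytes, so this sketch expands data by 4/3 before any line overhead,
the same as uuencode.  Whatever earthpig does to get closer to 1:1 on
uncompressed data is not shown here.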

When we finish the archiver we will post various versions of it to the
net.

What it does:

	- can generate correction requests from errors
	- can generate patches from correction requests
	- CRC checking on two levels
	- supports OS-independent hierarchical file names
	- high immunity to format changes and noise characters (space/control)
	- close to 1:1 encoding on uncompressed data
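
One way the correction request, patch and two-level CRC items above could
fit together is sketched below in Python.  Everything in it is an
assumption made for illustration: the block size, the dictionary layout
and the function names (make_archive, correction_request, make_patch,
apply_patch) are hypothetical, and the posting does not say what earthpig's
two CRC levels actually cover.  Here one CRC guards each block and a second
guards the whole file.

```python
import zlib


def make_archive(data: bytes, block_size: int = 64) -> dict:
    """Sender: split data into blocks, each stored with its own CRC."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return {
        "blocks": [(zlib.crc32(b), b) for b in blocks],  # level 1: per block
        "total_crc": zlib.crc32(data),                   # level 2: whole file
    }


def correction_request(archive: dict) -> list:
    """Receiver: list the indices of blocks whose CRC does not match."""
    return [
        i for i, (crc, block) in enumerate(archive["blocks"])
        if zlib.crc32(block) != crc
    ]


def make_patch(original: bytes, request: list, block_size: int = 64) -> dict:
    """Sender: answer a correction request with just the requested blocks."""
    return {
        i: original[i * block_size:(i + 1) * block_size] for i in request
    }


def apply_patch(archive: dict, patch: dict) -> bytes:
    """Receiver: replace the bad blocks, then verify the whole-file CRC."""
    blocks = [patch.get(i, b) for i, (_, b) in enumerate(archive["blocks"])]
    data = b"".join(blocks)
    if zlib.crc32(data) != archive["total_crc"]:
        raise ValueError("whole-file CRC mismatch after patching")
    return data
```

In this sketch the receiver mails back only the list of bad block numbers,
and the sender replies with only those blocks, so a damaged transfer does
not have to be resent in full.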

The basic goal of earthpig is to provide a single tool that will allow
the transfer of arbitrary data and software around the network while
providing a very high level of confidence that the data arrived correctly.

Tracy Tims
mail to ihnp4!watmath!unit36!tracy