joe@petsd.UUCP (Joe Orost) (01/04/85)
EXTENDED ABSTRACT

Compresses the specified files or standard input. Each file is replaced by a file with the extension .Z, but only if the file got smaller. If no files are specified, the compression is applied to the standard input and is written to standard output regardless of the results. Compressed files can be restored to their original form by specifying the -d option, or by running uncompress (linked to compress), on the .Z files or the standard input.

When file names are given, the ownership (if run by root), modes, and access and modification times are maintained between the file and its .Z version. In this respect, compress can be used for archival purposes, yet can still be used with make(1) after uncompression.

Compress uses the modified Lempel-Ziv algorithm described in "A Technique for High Performance Data Compression", Terry A. Welch, IEEE Computer Vol 17, No 6 (June 1984), pp 8-19. Common substrings in the file are first replaced by 9-bit codes 257 and up. When code 512 is reached, the algorithm switches to 10-bit codes and continues to use more bits until the limit specified by the -b flag is reached (default 16). Bits must be between 9 and 16. The default can be changed in the source to allow compress to be run on a smaller machine.

After the bits limit is reached, compress periodically checks the compression ratio. If it is increasing, compress continues to use the codes that were previously found in the file. However, if the compression ratio decreases, compress discards the table of substrings and rebuilds it from scratch. This allows the algorithm to adapt to the next "block" of the file.

The amount of compression obtained depends on the size of the input file, the number of bits per code, and the distribution of character substrings. Typically, text files, such as C programs, are reduced by 50-60%.
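
To make the code-width scheme above concrete, here is a minimal Python sketch. It is not the compress source and does not produce .Z files: it only shows substrings being assigned codes from 257 up, and the code width growing from 9 bits toward the -b limit. The names lzw_encode, lzw_decode and FIRST_CODE are hypothetical, and the sketch assumes that bit packing, the file header, and the compression-ratio check with its table reset can all be left out.

```python
FIRST_CODE = 257  # codes 0-255 are single bytes; 256 is left unused in this sketch


def lzw_encode(data: bytes, max_bits: int = 16) -> list[tuple[int, int]]:
    """Return (code, width_in_bits) pairs for data. Illustrative only."""
    if not 9 <= max_bits <= 16:
        raise ValueError("bits must be between 9 and 16")
    table = {bytes([i]): i for i in range(256)}
    next_code = FIRST_CODE
    width = 9
    out = []
    prefix = b""
    for byte in data:
        candidate = prefix + bytes([byte])
        if candidate in table:
            prefix = candidate          # keep extending the known substring
            continue
        out.append((table[prefix], width))
        if next_code < (1 << max_bits):  # table not yet full
            table[candidate] = next_code
            next_code += 1
            # Once code 2**width has been assigned, later codes need one more bit.
            if next_code > (1 << width) and width < max_bits:
                width += 1
        prefix = bytes([byte])
    if prefix:
        out.append((table[prefix], width))
    return out


def lzw_decode(pairs: list[tuple[int, int]], max_bits: int = 16) -> bytes:
    """Rebuild the original bytes from the pairs produced by lzw_encode."""
    table = {i: bytes([i]) for i in range(256)}
    next_code = FIRST_CODE
    out = bytearray()
    prev = None
    for code, _width in pairs:
        if code in table:
            entry = table[code]
        elif code == next_code and prev is not None:
            entry = prev + prev[:1]     # code refers to the entry being built
        else:
            raise ValueError("invalid code in input")
        out += entry
        if prev is not None and next_code < (1 << max_bits):
            table[next_code] = prev + entry[:1]
            next_code += 1
        prev = entry
    return bytes(out)
```

For the input b"abababab" this sketch emits the codes 97, 98, 257, 259, 98, all 9 bits wide; the width only moves to 10 once code 512 has been assigned. Passing max_bits=9 freezes the table at 512 entries, which is the small-machine trade-off the -b flag exposes.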
Compression is generally much better than that achieved by Huffman coding (as used in pack), or adaptive Huffman coding (compact), and takes less time to compute.

Some practical uses for compress are:

	Saving disk space
	Lowering uucp phone expenses
	Saving space in archive tape storage

				regards,
				joe

--
Full-Name:  Joseph M. Orost
UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
US Mail:    MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
Phone:      (201) 870-5844
Location:   40 19'49" N / 74 04'37" W