bet@ecsvax.UUCP (Bennett E. Todd III) (07/07/86)
The responses were overwhelming; everybody wanted this. Herewith: the
documentation and makefile in this posting, and the C sources in a second,
for the port of compress(1) to MS-DOS under MSC 3.0. This version should
port to other micros more easily than the original, I would think.
Executables for this can be found in net.micro.pc. Read the C source for
the credits; I had nothing to do with the development of this excellent
piece of software, and credit for this work shouldn't be lost.

#!/bin/sh
# Cut above the preceding line, or cut here if you must.
# This is a shar archive.  Extract with sh, not csh.
# The rest of this file will extract:
# makefile compress.doc
sed 's/^X//' > makefile << '/*EOF'
Xcomsmlpc.exe: compress.c
X	msc compress,compress/FPi/Ze/Gs/Ot;
X	link compress,comsmlpc/NOI;
X
Xcombigpc.exe: compress.c
X	msc compress,compress/DBIG/FPi/Ze/Gs/Ot;
X	link compress,combigpc.big/NOI;
X	exepack combigpc.big combigpc.exe
X
Xcomsmlat.exe: compress.c
X	msc compress,compress/FPi/Ze/Gs/Ot/G2;
X	link compress,comsmlat/NOI;
X
X
Xcombigat.exe: compress.c
X	msc compress,compress/DBIG/FPi/Ze/Gs/Ot/G2;
X	link compress,combigat.big/NOI;
X	exepack combigat.big combigat.exe
/*EOF
ls -l makefile
sed 's/^X//' > compress.doc << '/*EOF'
X     COMPRESS(1)         MS-DOS Programmer's Manual         COMPRESS(1)
X
X     NAME
X          compress, uncompress, zcat - compress or expand data
X
X     SYNOPSIS
X          compress [-cdfivV] [-b bits] [name ...]
X          uncompress [-cfivV] [name ...]
X          zcat [-iV] [name ...]
X
X     DESCRIPTION
X          Compress reduces the size of the named files using
X          adaptive Lempel-Ziv coding.  Whenever possible, each
X          file is replaced by one with the extension .Z or XZ,
X          while keeping the same modification times.  If no
X          files are specified, the standard input is
X          compressed to the standard output.  Compressed files
X          can be restored to their original form using
X          uncompress or zcat.
X
X          The -c option makes compress/uncompress write to the
X          standard output; no files are changed.
X          The nondestructive behavior of zcat is identical to that
X          of uncompress -c.
X
X          The -d (decompress) option makes compress restore
X          its input files to their normal form.  Uncompress is
X          identical to compress with the -d option specified.
X
X          The -f option will force compression of "name".  This
X          is useful for compressing an entire directory, even
X          if some of the files do not actually shrink.  If -f
X          is not given, the user is prompted as to whether an
X          existing file should be overwritten.
X
X          The -i (image mode) option suppresses the
X          transformation of text lines from MS-DOS (CR-LF
X          delimited) form to UNIX (LF delimited) form during
X          compression, and suppresses the reverse
X          transformation during decompression.
X
X          The -v (verbose) option causes a message to be
X          printed, yielding the percentage of reduction for
X          each file compressed.
X
X                                   1
X     COMPRESS(1)         MS-DOS Programmer's Manual         COMPRESS(1)
X
X
X          The -V option causes the current version and compile
X          options to be printed on stderr.
X
X          Compress uses the modified Lempel-Ziv algorithm
X          popularized in "A Technique for High Performance
X          Data Compression", Terry A. Welch, IEEE Computer,
X          vol. 17, no. 6 (June 1984), pp. 8-19.  Common
X          substrings in the file are first replaced by 9-bit
X          codes 257 and up.  When code 512 is reached, the
X          algorithm switches to 10-bit codes and continues to
X          use more bits until the limit specified by the -b
X          flag is reached (default is the maximum for which
X          the program was built).
X
X          "Bits" must be between 9 and the lesser of 16 and
X          the limit imposed at compile-time.  The MS-DOS
X          version of compress comes in two sizes.  One has a
X          12-bit limit, and will run in a machine with 128K
X          bytes of available user memory.  The other has a
X          16-bit limit, and requires about 450K bytes to run.
X
X          After the "bits" limit is attained, compress
X          periodically checks the compression ratio.  If it is
X          increasing, compress continues to use the existing
X          code dictionary.
X          However, if the compression ratio
X          decreases, compress discards the table of substrings
X          and rebuilds it from scratch.  This allows the
X          algorithm to adapt to the next "block" of the file.
X
X          Note that the -b flag is omitted for uncompress,
X          since the "bits" parameter specified during
X          compression is encoded within the output, along with
X          a magic number to ensure that neither decompression
X          of random data nor recompression of compressed data
X          is attempted.
X
X          The amount of compression obtained depends on the
X          size of the input, the number of "bits" per code,
X          and the distribution of common substrings.
X          Typically, text such as source code or English is
X          reduced by 50-60%.  Compression is generally much
X
X                                   2
X     COMPRESS(1)         MS-DOS Programmer's Manual         COMPRESS(1)
X
X          better than that achieved by Huffman coding (as used
X          in SQ), and takes less time to compute.
X
X          Exit status is normally 0; if the last file is
X          larger after (attempted) compression, the status is
X          2; if an error occurs, exit status is 1.
X
X     SEE ALSO
X          SQ(1)
X
X     DIAGNOSTICS
X          Usage: compress [-cdfivV] [-b maxbits] [file ...]
X               Invalid options were specified on the command
X               line.
X          Missing maxbits
X               Maxbits must follow -b.
X          file: not in compressed format
X               The file specified to UNCOMPRESS has not been
X               compressed.
X          file: compressed with xx bits, can only handle yy bits
X               "File" was compressed by a program that could
X               deal with more "bits" than the compress code on
X               this machine.  Recompress the file with smaller
X               "bits".
X          file: already has xx suffix -- no change
X               The file is assumed to be already compressed
X               because the last two characters of its extension
X               are ".Z" or "XZ".  Rename the file and try
X               again.
X          fn: part of filename extension will be replaced by XZ
X               File name, fn, contains at least two characters
X               in the "extension" field.  The second and third
X               will be replaced by "XZ" in the compressed
X               file's name.
X          fn already exists; do you wish to overwrite fn?
X               Respond "y" if you want the output file, fn, to
X               be replaced; "n" if not.
X          Compression: xx.xx%
X               Percentage of the input saved by compression.
X               (Relevant only for -v.)
X          -- file unchanged
X               No savings is achieved by compression.  The
X
X                                   3
X     COMPRESS(1)         MS-DOS Programmer's Manual         COMPRESS(1)
X
X               input remains virgin.
X
X     BUGS
X          Although compressed files are compatible between
X          machines with large memory, -b12 should be used for
X          file transfer to architectures with a small process
X          data space (64KB or less, as exhibited by the DEC
X          PDP series, or the small MS-DOS version, for
X          example).
X
X          MS-DOS version 2 does not permit a program to
X          determine the name used to call it.  As a result,
X          the aliases, uncompress and zcat, cannot be used.
X          They can be used under MS-DOS version 3, though the
X          actual file name for uncompress will be
X          "uncompre.exe".
X
X          MS-DOS does not support UNIX-style file links.  As a
X          result, even though compress, uncompress and zcat
X          are all the same program, it (they) will have to be
X          stored three times, once under each of the three
X          names, in order to use them under MS-DOS version 3.
X          As explained in the previous paragraph, this is not
X          an option under MS-DOS version 2.
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X                                   4
X
/*EOF
ls -l compress.doc
exit
--
Bennett Todd -- Duke Computation Center, Durham, NC 27706-7756; (919) 684-3695
UUCP: ...{decvax,seismo,philabs,ihnp4,akgua}!mcnc!ecsvax!duccpc!bet
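
An illustrative aside on the code-width rule that compress.doc describes (9-bit codes starting at 257, a switch to 10 bits once code 512 is reached, and so on up to the -b limit). The sketch below is only a reading of that paragraph, not code lifted from compress.c; the function name code_width and the constant INIT_BITS are made up for the example, and the real program may track the width differently.

```c
/* Assumption: the first codes are 9 bits wide, as compress.doc states. */
#define INIT_BITS 9

/*
 * Hypothetical helper, not part of compress.c: return the number of
 * bits needed to write `code`, never exceeding the -b limit `maxbits`.
 */
int code_width(unsigned long code, int maxbits)
{
    int n_bits = INIT_BITS;

    /* Widen by one bit each time the code no longer fits, up to the limit. */
    while (n_bits < maxbits && code >= (1UL << n_bits))
        n_bits++;

    return n_bits;
}
```

Under these assumptions, code_width(511, 16) is 9 and code_width(512, 16) is 10, matching the switch point given in the manual page, while a 12-bit build (maxbits of 12) never reports more than 12 bits however large the code.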