@lsuc.uucp (02/04/88)
I've been trying to get some utilities written or ported lately
to complete our basic set of telecom tools. Lempel-Ziv-Welch
'compress' is the last of the major ones for now. I've been
working on this port for a few days now. It's essentially James
Jone's port reworked for 6809. In doing so, I had some help from
Peter Dibble getting it to compile. Now, I'm at the final debugging
stage and I'm stumpped. The current file will 'compress' my
test file into a 12 bit file that my 68K version of 'compress' will
expand properly, however, then x@|it crashes. I have increased
the data space by 40 pages (10K) and it still crashes (it's a
stack overflow error). I could expand the data space further,
but it's very unusual for this much extension of data space to
be necessary, so I expect there are other problems. Any advice
would be appreciated.
Cheers! -- Jim O.
# This is a shell archive.
# Remove everything above and including the cut line.
# Then run the rest of the file through sh.
#----cut here-----cut here-----cut here-----cut here----#
#!/bin/sh
# shar: Shell Archiver
# Run the following text with /bin/sh to create:
# docfile
# makefile
# compress.c
# make.gsgfd
# gs_gfd.a
# cpr.decomp.c
# compress.h
# This archive created:
# By: Jim Omura ()
cat << \SHAR_EOF > docfile
/*
* compress.c - File compression ala IEEE Computer, June 1984.
*
* Authors: Spencer W. Thomas (decvax!harpo!utah-cs!utah-gr!thomas)
* Jim McKie (decvax!mcvax!jim)
* Steve Davies (decvax!vax135!petsd!peora!srd)
* Ken Turkowski (decvax!decwrl!turtlevax!ken)
* James A. Woods (decvax!ihnp4!ames!jaw)
* Joe Orost (decvax!vax135!petsd!joe)
*
* $Header: compress.c,v 4.0 85/07/30 12:50:00 joe Release $
* $Log: compress.c,v $
* Revision 4.0 85/07/30 12:50:00 joe
* Removed ferror() calls in output routine on every output except first.
* Prepared for release to the world.
*
* Revision 3.6 85/07/04 01:22:21 joe
* Remove much wasted storage by overlaying hash table with the tables
* used by decompress: tab_suffix[1<<BITS], stack[8000]. Updated USERMEM
* computations. Fixed DumpTab() DEBUG routine.
*
* Revision 3.5 85/06/30 20:47:21 jaw
* Change hash function to use exclusive-or. Rip out hash cache. These
* speedups render the megamemory version defunct, for now. Make decoder
* stack global. Parts of the RCS trunks 2.7, 2.6, and 2.1 no longer apply.
*
* Revision 3.4 85/06/27 12:00:00 ken
* Get rid of all floating-point calculations by doing all compression ratio
* calculations in fixed point.
*
* Revision 3.3 85/06/24 21:53:24 joe
* Incorporate portability suggestion for M_XENIX. Got rid of text on #else
* and #endif lines. Cleaned up #ifdefs for vax and interdata.
*
* Revision 3.2 85/06/06 21:53:24 jaw
* Incorporate portability suggestions for Z8000, IBM PC/XT from mailing list.
* Default to "quiet" output (no compression statistics).
*
* Revision 3.1 85/05/12 18:56:13 jaw
* Integrate decompress() stack speedups (from early pointer mods by McKie).
* Repair multi-file USERMEM gaffe. Unify 'force' flags to mimic semantics
* of SVR2 'pack'. Streamline block-compress table clear logic. Increase
* output byte count by magic number size.
*
* Revision 3.0 84/11/27 11:50:00 petsd!joe
* Set HSIZE depending on BITS. Set BITS depending on USERMEM. Unrolled
* loops in clear routines. Added "-C" flag for 2.0 compatibility. Used
* unsigned compares on Perkin-Elmer. Fixed foreground check.
*
* Revision 2.7 84/11/16 19:35:39 ames!jaw
* Cache common hash codes based on input statistics; this improves
* performance for low-density raster images. Pass on #ifdef bundle
* from Turkowski.
*
* Revision 2.6 84/11/05 19:18:21 ames!jaw
* Vary size of hash tables to reduce time for small files.
* Tune PDP-11 hash function.
*
* Revision 2.5 84/10/30 20:15:14 ames!jaw
* Junk chaining; replace with the simpler (and, on the VAX, faster)
* double hashing, discussed within. Make block compression standard.
*
* Revision 2.4 84/10/16 11:11:11 ames!jaw
* Introduce adaptive reset for block compression, to boost the rate
* another several percent. (See mailing list notes.)
*
* Revision 2.3 84/09/22 22:00:00 petsd!joe
* Implemented "-B" block compress. Implemented REVERSE sorting of tab_next.
* Bug fix for last bits. Changed fwrite to putchar loop everywhere.
*
* Revision 2.2 84/09/18 14:12:21 ames!jaw
* Fold in news changes, small machine typedef from thomas,
* #ifdef interdata from joe.
*
* Revision 2.1 84/09/10 12:34:56 ames!jaw
* Configured fast table lookup for 32-bit machines.
* This cuts user time in half for b <= FBITS, and is useful for news batching
* from VAX to PDP sites. Also sped up decompress() [fwrite->putc] and
* added signal catcher [plus beef in WriteErr()] to delete effluvia.
*
* Revision 2.0 84/08/28 22:00:00 petsd!joe
* Add check for foreground before prompting user. Insert maxbits into
* compressed file. Force file being uncompressed to end with ".Z".
* Added "-c" flag and "zcat". Prepared for release.
*
* Revision 1.10 84/08/24 18:28:00 turtlevax!ken
* Will only compress regular files (no directories), added a magic number
* header (plus an undocumented -n flag to handle old files without headers),
* added -f flag to force overwriting of possibly existing destination file,
* otherwise the user is prompted for a response. Will tack on a .Z to a
* filename if it doesn't have one when decompressing. Will only replace
* file if it was compressed.
*
* Revision 1.9 84/08/16 17:28:00 turtlevax!ken
* Removed scanargs(), getopt(), added .Z extension and unlimited number of
* filenames to compress. Flags may be clustered (-Ddvb12) or separated
* (-D -d -v -b 12), or combination thereof. Modes and other status is
* copied with CopyStat(). -O bug for 4.2 seems to have disappeared with
* 1.8.
*
* Revision 1.8 84/08/09 23:15:00 joe
* Made it compatible with vax version, installed jim's fixes/enhancements
*
* Revision 1.6 84/08/01 22:08:00 joe
* Sped up algorithm significantly by sorting the compress chain.
*
* Revision 1.5 84/07/13 13:11:00 srd
* Added C version of vax asm routines. Changed structure to arrays to
* save much memory. Do unsigned compares where possible (faster on
* Perkin-Elmer)
*
* Revision 1.4 84/07/05 03:11:11 thomas
* Clean up the code a little and lint it. (Lint complains about all
* the regs used in the asm, but I'm not going to "fix" this.)
*
* Revision 1.3 84/07/05 02:06:54 thomas
* Minor fixes.
*
* Revision 1.2 84/07/05 00:27:27 thomas
* Add variable bit length output.
*/
SHAR_EOF
cat << \SHAR_EOF > makefile
* Makefile for compress.c
* 6809 version
CFLAGS = -dOS9 -a
HFILES = compress.h /dd/defs/stdio.h
LFILES = -l=/dd/lib/cgfx.l -l=/dd/lib/clib.l -l=/dd/lib/sys.l
RFILES = compress.r gs_gfd.r cpr.decomp.r
compress: $(RFILES)
rlink /dd/lib/cstart.r $(RFILES) $(LFILES) -M=40 -o=/dd/cmds/compress
compress.r: compress.a
rma compress.a -o=compress.r
compress.a: compress.c $(HFILES)
cc1 compress.c $(CFLAGS)
del c.com
cpr.decomp.r: cpr.decomp.c
cc1 cpr.decomp.c -r -dOS9
del c.com
gs_gfd.r: gs_gfd.a
make -f=make.gsgfd
SHAR_EOF
cat << \SHAR_EOF > compress.c
/*
* Compress - data compression program
*
* huge pile of seething #ifdefs removed by James Jones, seeking to
* (1) make this run at least under OS-9/68000 and (2) insist on 12
* bit maximum code size, since past that much memory is eaten to
* little effect.
*/
#include "compress.h"
int NBits; /* number of bits/code */
int MaxBits = BITS; /* user settable max # bits/code */
CodeInt CurMaxCode; /* maximum code, given NBits */
CodeInt CodeUpb = 1 << BITS; /* should NEVER generate this code */
CountInt htab[HSIZE];
unsigned short codetab[HSIZE];
CodeInt hsize = HSIZE; /* for dynamic table sizing */
CodeInt FreeEnt = 0; /* first unused entry */
int ExitStat = 0;
CodeInt GetCode();
#ifdef DEBUG
int Debug = FALSE;
int Verbose = FALSE;
Usage()
{
fprintf(stderr,"Usage: compress [-dDVfo] [-b MaxBits] [file ...]\n");
}
#else
Usage()
{
fprintf(stderr,"Usage: compress [-dfvoV] [-b MaxBits] [file ...]\n");
}
#endif /* DEBUG */
int UseHeader = TRUE; /* Use a 3-byte header, unless old file */
int ToStdout = FALSE; /* Write output on stdout, suppress messages */
int Quiet = TRUE; /* don't tell me about compression */
CharType HeaderTag[] = {'\037', '\235'}; /* 1F 9D */
/*
* block compression parameters -- after all codes are used up,
* and compression rate changes, start over.
*/
int BlockCompress = BLOCK_MASK;
int Clear = FALSE;
long int ratio = 0;
CountInt Checkpoint = CHECK_GAP;
int Force = FALSE;
int DoCompress = TRUE;
char ofname[100];
char **fileptr;
/*
* TAG( main )
*
* Algorithm from "A Technique for High Performance Data Compression",
* Terry A. Welch, IEEE Computer Vol 17, No 6 (June 1984), pp 8-19.
*
* Usage: compress [-dfvc] [-b bits] [file ...]
* Inputs:
* -d: If given, decompression is done instead.
*
* -c: Write output on stdout, don't remove original.
*
* -b: Parameter limits the max number of bits/code.
*
* -f: Forces output file to be generated, even if one already
* exists, and even if no space is saved by compressing.
* If -f is not used, the user will be prompted if stdin is
* a tty, otherwise, the output file will not be overwritten.
*
* -v: Write compression statistics
*
* file ...: Files to be compressed. If none specified, stdin
* is used.
* Outputs:
* file.Z: Compressed form of file with same mode, owner, and utimes
* or stdout (if stdin used as input)
*
* Assumptions:
* When filenames are given, replaces with the compressed version
* (.Z suffix) only if the file decreases in size.
* Algorithm:
* Modified Lempel-Ziv method (LZW). Basically finds common
* substrings and replaces them with a variable size code. This is
* deterministic, and can be done on the fly. Thus, the decompression
* procedure needs no input table, but tracks the way the table was built.
*/
main(argc, argv)
register int argc;
char **argv;
{
int overwrite; /* Do not overwrite unless given -f flag */
char tempname[100];
char **filelist;
char *cp, *rindex(), *malloc();
extern catcher();
overwrite = FALSE;
intercept(catcher);
filelist = fileptr = (char **) (malloc(argc * sizeof(*argv)));
*filelist = NULL;
/* Argument Processing
* All flags are optional.
* -D => debug
* -V => print Version; debug verbose
* -d => decompress
* -v => unquiet
* -f => force overwrite of output file
* -n => no header: useful to uncompress old files
* -b MaxBits => MaxBits. If -b is specified, then MaxBits MUST be
* given also.
* -c => cat all output to stdout
* -o => generate output compatible with old compress (2.0)
* if a string is left, must be an input filename.
*/
for (argc--, argv++; argc > 0; argc--, argv++) {
if (**argv != '-') {
/* Input file name */
*fileptr++ = *argv;
*fileptr = NULL;
} else {
while (*++(*argv)) { /* Process all flags in this arg */
switch (**argv) {
#ifdef DEBUG
case 'D':
debug = 1;
break;
case 'V':
verbose = 1;
version();
break;
#else
case 'V':
version();
break;
#endif /* DEBUG */
case 'v':
Quiet = FALSE;
break;
case 'd':
DoCompress = FALSE;
break;
case 'f':
case 'F':
overwrite = TRUE;
Force = TRUE;
break;
case 'n':
UseHeader = FALSE;
break;
case 'o':
BlockCompress = 0;
break;
case 'b':
if (!ARGVAL()) {
fprintf(stderr, "Missing MaxBits\n");
Usage();
exit(1);
}
MaxBits = atoi(*argv);
goto nextarg;
case 'c':
ToStdout = TRUE;
break;
case 'q':
Quiet = TRUE;
break;
default:
fprintf(stderr, "Unknown flag: '%c'; ", **argv);
Usage();
exit(1);
}
}
}
nextarg:
continue;
}
if (MaxBits < INIT_BITS)
MaxBits = INIT_BITS;
else if (MaxBits > BITS)
MaxBits = BITS;
CodeUpb = 1 << MaxBits;
hsize = HSIZE;
if (*filelist == NULL)
{
GoForIt(); /* stdin->stdout, no setup required */
if (!Quiet)
putc('\n', stderr);
}
else
{
for (fileptr = filelist; *fileptr; fileptr++)
{
ExitStat = 0;
if (DoCompress)
{
/* COMPRESSION */
if (strcmp(*fileptr + strlen(*fileptr) - 2, ".Z") == 0)
{
fprintf(stderr, "%s: already has .Z suffix -- no change\n",
*fileptr);
continue;
}
/* Generate output filename */
strcpy(ofname, *fileptr);
if ((cp = rindex(ofname, '/')) != NULL)
{
cp++;
}
else
{
cp = ofname;
}
if (strlen(cp) > MAXNAME - 2)
{
fprintf(stderr, "%s: filename too long to tack on .Z\n",
cp);
continue;
}
strcat(ofname, ".Z");
}
else
{
/* DECOMPRESSION */
/* Check for .Z suffix */
if (strcmp(*fileptr + strlen(*fileptr) - 2, ".Z") != 0)
{
/* No .Z: tack one on */
strcpy(tempname, *fileptr);
strcat(tempname, ".Z");
*fileptr = tempname;
}
/* Generate output filename */
strcpy(ofname, *fileptr);
ofname[strlen(*fileptr) - 2] = '\0'; /* Strip off .Z */
}
/* Open input file */
if ((freopen(*fileptr, "r", stdin)) == NULL)
{
perror("opening", *fileptr);
continue;
}
if (!ToStdout)
{
/* Check for overwrite of existing file */
if (!overwrite && access(ofname, 0) == 0)
{
fprintf(stderr, "%s already exists\n", ofname);
continue;
}
/* Open output file */
if (freopen(ofname, "w", stdout) == NULL)
{
perror("opening", ofname);
continue;
}
if (!Quiet)
{
fprintf(stderr, "%s: ", *fileptr);
}
}
/* Actually do the compression/decompression */
if (GoForIt() && !ToStdout)
{
CopyStat(*fileptr, ofname);
}
if (ExitStat == 1 || !Quiet)
{
putc('\n', stderr);
}
}
}
exit(ExitStat);
}
GoForIt()
{
register int result;
if (DoCompress)
{
result = compress();
}
#ifdef DEBUG
else if (debug) {
printcodes();
result = FALSE;
}
#endif
else
result = decompress();
#ifdef DEBUG
if (verbose)
DumpTab();
#endif
return result;
}
static int offset;
long int InCount = 1; /* length of input */
long int BytesOut; /* length of compressed output */
long int OutCount = 0; /* # of codes output (for debugging) */
/*
* compress stdin to stdout
*
* Algorithm: use open addressing double hashing (no chaining) on the
* prefix code / next character combination. We do a variant of Knuth's
* algorithm D (vol. 3, sec. 6.4) along with G. Knott's relatively-prime
* secondary probe. Here, the modular division first probe is gives way
* to a faster exclusive-or manipulation. Also do block compression with
* an adaptive reset, whereby the code table is cleared when the compression
* ratio decreases, but after the table fills. The variable-length output
* codes are re-sized at this point, and a special CLEAR code is generated
* for the decompressor. Late addition: construct the table according to
* file size for noticeable speed improvement on small files. Please direct
* questions about this implementation to ames!jaw.
*/
compress()
{
register long fcode;
register CodeInt i = 0;
register int c, disp, hshift;
register CodeInt ent;
if (UseHeader) {
putchar(HeaderTag[0]);
putchar(HeaderTag[1]);
putchar((char)(MaxBits | BlockCompress));
if (ferror(stdout))
WriteErr();
}
offset = 0;
BytesOut = 3; /* includes 3-byte header mojo */
OutCount = 0;
Clear = FALSE;
ratio = 0;
InCount = 1;
Checkpoint = CHECK_GAP;
CurMaxCode = MaxCode(NBits = INIT_BITS);
FreeEnt = ((BlockCompress) ? FIRST : 256);
ent = getchar();
hshift = 0;
for (fcode = (long) hsize; fcode < 65536L; fcode += fcode)
hshift++;
hshift = 8 - hshift; /* set hash code range bound */
ClearHashTab((CountInt) hsize); /* clear hash table */
while ((c = getchar()) != EOF) {
InCount++;
fcode = (long) (((long) c << MaxBits) + ent);
i = ((c << hshift) ^ ent); /* xor hashing */
if (i == 0)
disp = 1;
else
disp = hsize - i;
for (;;) {
if (HTab(i) == fcode) {
ent = CodeTab(i);
break;
}
if ((long) HTab(i) < 0) { /* empty slot */
output ((CodeInt) ent);
OutCount++;
ent = c;
if (FreeEnt < CodeUpb) {
CodeTab(i) = FreeEnt++; /* code -> hashtable */
HTab(i) = fcode;
} else if ((CountInt) InCount >= Checkpoint && BlockCompress)
ClearBlock();
break;
}
if ((i -= disp) < 0)
i += hsize;
}
}
/* Put out the final code. */
output((CodeInt) ent);
OutCount++;
output((CodeInt) - 1);
/* Print out stats on stderr */
if (!ToStdout && !Quiet) {
#ifdef DEBUG
fprintf(stderr,
"%ld chars in, %ld codes (%ld bytes) out, compression factor: ",
InCount, OutCount, BytesOut);
prratio(stderr, InCount, BytesOut);
fprintf(stderr, "\n");
fprintf(stderr, "\tCompression as in compact: ");
prratio(stderr, InCount - BytesOut, InCount);
fprintf(stderr, "\n");
fprintf(stderr, "\tLargest code (of last block) was %d (%d bits)\n",
FreeEnt - 1, NBits);
#else /* !DEBUG */
fprintf(stderr, "Compression: ");
prratio(stderr, InCount - BytesOut, InCount);
#endif /* DEBUG */
}
if (BytesOut > InCount) /* exit(2) if no savings */
ExitStat = 2;
}
/*
* TAG( output )
*
* Output the given code.
* Inputs:
* code: A NBits-bit integer. If == -1, then EOF. This assumes
* that NBits =< (long)wordsize - 1.
* Outputs:
* Outputs code to the file.
* Assumptions:
* Chars are 8 bits long.
* Algorithm:
* Maintain a BITS character long buffer (so that 8 codes will
* fit in it exactly). Use the VAX insv instruction to insert each
* code in turn. When the buffer fills up empty it and start over.
*/
static char buf[BITS];
CharType lmask[9] = {
0xff, 0xfe, 0xfc, 0xf8, 0xf0, 0xe0, 0xc0, 0x80, 0x00
};
CharType rmask[9] = {
0x00, 0x01, 0x03, 0x07, 0x0f, 0x1f, 0x3f, 0x7f, 0xff
};
output(code)
CodeInt code;
{
#ifdef DEBUG
static int col = 0;
#endif /* DEBUG */
/*
* On the VAX, it is important to have the register declarations
* in exactly the order given, or the asm will break.
*/
register int r_off = offset, bits = NBits;
register char *bp = buf;
#ifdef DEBUG
if (verbose)
fprintf(stderr, "%5d%c", code,
(col += 6) >= 74 ? (col = 0, '\n') : ' ');
#endif /* DEBUG */
if (code >= 0) {
/* byte/bit numbering on the VAX is simulated by the following code */
/* Get to the first byte. */
bp += (r_off >> 3);
r_off &= 7;
/*
* Since code is always >= 8 bits, only need to mask the first
* hunk on the left.
*/
*bp = (*bp & rmask[r_off]) | (code << r_off) & lmask[r_off];
bp++;
bits -= (8 - r_off);
code >>= 8 - r_off;
/* Get any 8 bit parts in the middle (<=1 for up to 16 bits). */
if (bits >= 8) {
*bp++ = code;
code >>= 8;
bits -= 8;
}
/* Last bits. */
if (bits)
*bp = code;
offset += NBits;
if (offset == (NBits << 3)) {
bp = buf;
bits = NBits;
BytesOut += bits;
do {
putchar(*bp++);
} while (--bits);
offset = 0;
}
/*
* If the next entry is going to be too big for the code size,
* then increase it, if possible.
*/
if (FreeEnt > CurMaxCode || Clear) {
/*
* Write the whole buffer, because the input side won't
* discover the size increase until after it has read it.
*/
if (offset > 0) {
if (fwrite(buf, 1, NBits, stdout) != NBits)
WriteErr();
BytesOut += NBits;
}
offset = 0;
if (Clear) {
CurMaxCode = MaxCode(NBits = INIT_BITS);
Clear = FALSE;
} else {
NBits++;
if (NBits == MaxBits)
CurMaxCode = CodeUpb;
else
CurMaxCode = MaxCode(NBits);
}
#ifdef DEBUG
if (debug) {
fprintf(stderr, "\nChange to %d bits\n", NBits);
col = 0;
}
#endif /* DEBUG */
}
} else {
/* At EOF, write the rest of the buffer. */
if (offset > 0)
fwrite(buf, 1, (offset + 7) / 8, stdout);
BytesOut += (offset + 7) / 8;
offset = 0;
fflush(stdout);
#ifdef DEBUG
if (verbose)
fprintf(stderr, "\n");
#endif /* DEBUG */
if (ferror(stdout))
WriteErr();
}
}
/*
* TAG( GetCode )
*
* Read one code from the standard input. If EOF, return -1.
* Inputs:
* stdin
* Outputs:
* code or -1 is returned.
*/
CodeInt
GetCode()
{
register CodeInt code;
static int offset = 0, size = 0;
static CharType buf[BITS];
register int r_off, bits;
register CharType *bp = buf;
if (Clear || offset >= size || FreeEnt > CurMaxCode) {
/*
* If the next entry will be too big for the current code
* size, then we must increase the size. This implies reading
* a new buffer full, too.
*/
if (FreeEnt > CurMaxCode) {
NBits++;
if (NBits == MaxBits)
CurMaxCode = CodeUpb; /* won't get any bigger now */
else
CurMaxCode = MaxCode(NBits);
}
if (Clear) {
CurMaxCode = MaxCode(NBits = INIT_BITS);
Clear = FALSE;
}
size = fread(buf, 1, NBits, stdin);
if (size <= 0)
return -1; /* end of file */
offset = 0;
/* Round size down to integral number of codes */
size = (size << 3) - (NBits - 1);
}
r_off = offset;
bits = NBits;
/* Get to the first byte. */
bp += (r_off >> 3);
r_off &= 7;
/* Get first part (low order bits) */
#ifdef NO_UCHAR
code = ((*bp++ >> r_off) & rmask[8 - r_off]) & 0xff;
#else
code = (*bp++ >> r_off);
#endif /* NO_UCHAR */
r_off = 8 - r_off; /* now, offset into code word */
bits -= r_off;
/* Get any 8 bit parts in the middle (<=1 for up to 16 bits). */
if (bits >= 8) {
#ifdef NO_UCHAR
code |= (*bp++ & 0xff) << r_off;
#else
code |= *bp++ << r_off;
#endif /* NO_UCHAR */
r_off += 8;
bits -= 8;
}
/* high order bits. */
code |= (*bp & rmask[bits]) << r_off;
offset += NBits;
return code;
}
#ifdef DEBUG
printcodes()
{
/* Just print out codes from input file. */
CodeInt code;
int col = 0, bits;
bits = NBits = INIT_BITS;
CurMaxCode = MaxCode(NBits);
FreeEnt = ((BlockCompress) ? FIRST : 256);
while ((code = GetCode()) >= 0) {
if ((code == CLEAR) && BlockCompress) {
FreeEnt = FIRST - 1;
Clear = TRUE;
} else if (FreeEnt < CodeUpb)
FreeEnt++;
if (bits != NBits) {
fprintf(stderr, "\nChange to %d bits\n", NBits );
bits = NBits;
col = 0;
}
fprintf(stderr, "%5d%c", code,
(col += 6) >= 74 ? (col = 0, '\n') : ' ' );
}
putc('\n', stderr);
exit(0);
}
CodeInt SortTab[1 << BITS]; /* sorted pointers into htab */
#define STACK_SIZE 15000
DumpTab() /* dump string table */
{
register int i, first, ent, c;
int StackTop = STACK_SIZE;
if (DoCompress) {
/* compressing */
register int flag = 1;
for (i = 0; i < hsize; i++) { /* build sort pointers */
if ((long) HTab(i) >= 0)
SortTab[CodeTab(i)] = i;
}
first = BlockCompress ? FIRST : 256;
for (i = first; i < FreeEnt; i++) {
fprintf(stderr, "%5d: \"", i);
OutputStack[--StackTop] = '\n';
OutputStack[--StackTop] = '"';
StackTop = InStack((HTab(SortTab[i]) >> MaxBits) & 0xff,
StackTop);
for (ent = HTab(SortTab[i]) & ((1 << MaxBits) - 1);
ent > 256;
ent = HTab(SortTab[ent]) & ((1 << MaxBits) - 1))
{
StackTop = InStack(HTab(SortTab[ent]) >> MaxBits, StackTop);
}
StackTop = InStack(ent, StackTop);
fwrite(&OutputStack[StackTop], 1, STACK_SIZE - StackTop, stderr);
StackTop = STACK_SIZE;
}
} else if (!debug) { /* decompressing */
for (i = 0; i < FreeEnt; i++) {
ent = i;
c = TabSuffix(ent);
if (isascii(c) && isprint(c))
fprintf(stderr, "%5d: %5d/'%c' \"",
ent, TabPrefix(ent), c);
else
fprintf(stderr, "%5d: %5d/\\%03o \"",
ent, TabPrefix(ent), c);
OutputStack[--StackTop] = '\n';
OutputStack[--StackTop] = '"';
for (;;) {
if (ent == NULL)
break;
StackTop = InStack(TabSuffix(ent), StackTop);
if (ent < FIRST)
break;
ent = TabPrefix(ent);
}
fwrite(&OutputStack[StackTop], 1, STACK_SIZE - StackTop, stderr);
StackTop = STACK_SIZE;
}
}
}
int
InStack(c, StackTop)
register int c, StackTop;
{
if ((isascii(c) && isprint(c) && c != '\\') || c == ' ') {
OutputStack[--StackTop] = c;
} else {
switch(c) {
case '\n':
OutputStack[--StackTop] = 'n';
break;
case '\t':
OutputStack[--StackTop] = 't';
break;
case '\b':
OutputStack[--StackTop] = 'b';
break;
case '\f':
OutputStack[--StackTop] = 'f';
break;
case '\l':
OutputStack[--StackTop] = 'l';
break;
case '\\':
OutputStack[--StackTop] = '\\';
break;
default:
OutputStack[--StackTop] = '0' + c % 8;
OutputStack[--StackTop] = '0' + (c / 8) % 8;
OutputStack[--StackTop] = '0' + c / 64;
break;
}
OutputStack[--StackTop] = '\\';
}
return StackTop;
}
#endif /* DEBUG */
WriteErr()
{
if (ofname != NULL) {
perror("writing", ofname);
unlink(ofname);
}
exit(1);
}
#define FD_SIZE sizeof(struct fildes)
CopyStat(ifname, ofname)
register char *ifname, *ofname;
{
struct fildes instat, outstat;
register int status;
int _gs_gfd(), _ss_pfd();
status = _gs_gfd(fileno(stdin), &instat, FD_SIZE);
fclose(stdin);
if (status == -1) {
fprintf(stderr, "%s -- can't getstat: unchanged",
Quiet ? ifname : "");
ExitStat = 1;
} else if (_gs_gfd(fileno(stdout), &outstat, FD_SIZE) == -1) {
fprintf(stderr, "%s -- can't getstat on output file: unchanged",
Quiet ? ifname : "");
ExitStat = 1;
} else if (ExitStat == 2 && !Force) {
/* No compression: remove file.Z */
if (!Quiet)
fprintf(stderr, " -- file unchanged");
} else {
/* ***** Successful Compression ***** */
ExitStat = 0;
#ifdef OSK
/* OS-9/68000 won't let you set attributes in SS_FD... */
if (_ss_attr(fileno(stdout), instat.fd_att) == -1)
perror("setting attributes", ofname);
#endif
_strass(outstat.fd_own, instat.fd_own, sizeof(outstat.fd_own));
_strass(outstat.fd_date, instat.fd_date, sizeof(outstat.fd_date));
if (_ss_pfd(fileno(stdout), &outstat) != -1) {
fclose(stdout);
/* Remove input file */
if (unlink(ifname))
perror("unlinking", ifname);
if (!Quiet)
fprintf(stderr, " -- replaced with %s", ofname);
return; /* Successful return */
}
}
/* Unsuccessful return -- one of the tests failed */
fclose(stdout);
if (unlink(ofname))
perror("unlinking", ofname);
}
catcher(signal)
int signal;
{
if (ofname != NULL) {
fclose(stdout);
unlink(ofname);
}
exit(signal);
}
ClearBlock() /* table clear for block compress */
{
register long int rat;
Checkpoint = InCount + CHECK_GAP;
#ifdef DEBUG
if (debug) {
fprintf (stderr, "count: %ld, ratio: ", InCount);
prratio (stderr, InCount, BytesOut);
fprintf (stderr, "\n");
}
#endif /* DEBUG */
if (InCount <= 0x007fffff)
rat = (InCount << 8) / BytesOut; /* 8 fractional bits */
else if ((rat = BytesOut >> 8) == 0)
rat = 0x7fffffff; /* Don't divide by zero */
else
rat = InCount / rat;
if (rat > ratio)
ratio = rat;
else {
ratio = 0;
#ifdef DEBUG
if (verbose)
DumpTab(); /* dump string table */
#endif
ClearHashTab((CountInt) hsize);
FreeEnt = FIRST;
Clear = TRUE;
output((CodeInt) CLEAR);
#ifdef DEBUG
if (debug)
fprintf(stderr, "clear\n");
#endif
}
}
ClearHashTab(hsize) /* reset code table */
register CountInt hsize;
{
register CountInt *HTabScan = htab + hsize;
register long i, m1 = -1;
i = hsize - 16;
do { /* might use Sys V memset(3) here */
*(HTabScan - 16) = m1;
*(HTabScan - 15) = m1;
*(HTabScan - 14) = m1;
*(HTabScan - 13) = m1;
*(HTabScan - 12) = m1;
*(HTabScan - 11) = m1;
*(HTabScan - 10) = m1;
*(HTabScan - 9) = m1;
*(HTabScan - 8) = m1;
*(HTabScan - 7) = m1;
*(HTabScan - 6) = m1;
*(HTabScan - 5) = m1;
*(HTabScan - 4) = m1;
*(HTabScan - 3) = m1;
*(HTabScan - 2) = m1;
*(HTabScan - 1) = m1;
HTabScan -= 16;
} while ((i -= 16) >= 0);
for (i += 16; i > 0; i--)
*--HTabScan = m1;
}
prratio(stream, num, den)
FILE *stream;
long int num, den;
{
register int q; /* Doesn't need to be long */
if (num > 214748L) /* 2147483647/10000 */
q = num / (den / 10000L);
else
q = 10000L * num / den; /* Long calculations, though */
if (q < 0) {
putc('-', stream);
q = -q;
}
fprintf(stream, "%d.%02d%%", q / 100, q % 100);
}
version()
{
fputs("OS-9 Lempel-Ziv compress\n", stderr);
fputs("Options: ", stderr);
#ifdef NO_UCHAR
fputs("NO_UCHAR, ", stderr);
#endif
#ifdef DEBUG
fputs("DEBUG, ", stderr);
#endif
fprintf(stderr, "BITS = %d\n", BITS);
}
perror(action, fname)
register char *action, *fname;
{
fprintf(stderr, "I/O error: %s file %s, error %d\n", action, fname, errno);
}
SHAR_EOF
cat << \SHAR_EOF > make.gsgfd
* Makefile for gs_gfd.c
AFILES = /dd/defs/os9defs.a
gs_gfd.r: gs_gfd.a $(AFILES)
rma gs_gfd.a -o=gs_gfd.r
* gs_gfd.a: gs_gfd.c
* cc1 gs_gfd.c -ac
* del c.com
SHAR_EOF
cat << \SHAR_EOF > gs_gfd.a
psect gs_gfd_c,0,0,0,0,0
nam gs_gfd_c
use /dd/defs/os9defs.a
* */* gs_gfd.c */
* *
* */* Impliments _gs_gfd() as per OS-9 68K
* * path = an open RBF file
* * buffer = target to copy file descriptor
* * count = max. bytes to copy (256)
* */
* *
* *#include <direct.h>
* *struct dirent {
* * char dir_name[29];
* * char dir_addr[3];
* * };
* *
* *struct fildes {
* * char fd_att;
* * unsigned fd_own;
* * char fd_date[5];
* * char fd_link;
* * long fd_fsize;
* * char fd_dcr[3];
* * struct {
* * char addr[3];
* * unsigned size;
* * } fdseg[48];
* * };
* *
* *struct ddsect {
* * char dd_tot[3];
* * char dd_tsk;
* * unsigned dd_map;
* * unsigned dd_bit;
* * char dd_dir[3];
* * unsigned dd_own;
* * char dd_att;
* * unsigned dd_dsk;
* * char dd_fmt;
* * unsigned dd_spt;
* * unsigned dd_res;
* * char dd_bt[3];
* * unsigned dd_bsz;
* * char dd_date[5];
* * char dd_name[32];
* * };
* *
* *int _gs_gfd(path,buffer,count)
* *
* *int path;
* *struct fildes *buffer;
* *int count;
* *
* *{
ttl _gs_gfd
_gs_gfd:
pshs u
ldd #_1
lbsr _stkcheck
* * int errflag;
* *
* */* Test variables for locations: */
* * path = path;
leas -2,s
ldd 6,s
std 6,s
* * buffer = buffer;
ldd 8,s
std 8,s
* * count = count;
ldd 10,s
std 10,s
* *
* * /* Call I$GetStt? */
lda 7,s /* Actually ldd 6,s, lsla 8 */
ldx 0
ldu 0
os9 I$Seek /* Position to start of RBF file */
bcs errexit
ldx 8,s /* buffer */
ldy 10,s /* count */
os9 I$Read
bcs errexit
ldd 0
bra cleanex
errexit ldd -1
std 0,s /* Not sure if this is necessary */
* *
* * return(errflag);
* ldd 0,s
cleanex leas 2,s
puls u,pc
* *}
_1 equ -66
* *
* */* End of gs_gfd.c */
endsect
SHAR_EOF
cat << \SHAR_EOF > cpr.decomp.c
/* cpr.decomp.c */
#include "compress.h"
/*
* Decompress stdin to stdout. This routine adapts to the codes in the
* file building the "string" table on-the-fly; requiring no table to
* be stored in the compressed file. The tables used herein are shared
* with those of the compress() routine. See the definitions above.
*/
decompress()
{
extern int BlockCompress;
extern int Clear;
extern unsigned short codetab[HSIZE];
extern CodeInt CodeUpb;
extern CodeInt CurMaxCode;
extern CodeInt FreeEnt;
extern CountInt htab[HSIZE];
extern int MaxBits;
extern int NBits;
extern int UseHeader;
extern char **fileptr;
extern CharType HeaderTag[];
register CharType *stackp;
register int finchar;
register CodeInt code, OldCode, InCode;
if (UseHeader)
{
if ((getchar() != (HeaderTag[0] & 0xFF))
|| (getchar() != (HeaderTag[1] & 0xFF)))
{
fprintf(stderr, "%s: not in compressed format\n", *fileptr);
return FALSE;
}
MaxBits = getchar(); /* set -b from file */
BlockCompress = MaxBits & BLOCK_MASK;
MaxBits &= BIT_MASK;
CodeUpb = 1 << MaxBits;
if (MaxBits > BITS) {
fprintf(stderr,
"%s: compressed with %d bits, can only handle %d bits\n",
*fileptr, MaxBits, BITS);
return FALSE;
}
}
/* As above, initialize the first 256 entries in the table. */
CurMaxCode = MaxCode(NBits = INIT_BITS);
for (code = 255; code >= 0; code--) {
TabPrefix(code) = 0;
TabSuffix(code) = (CharType) code;
}
FreeEnt = ((BlockCompress) ? FIRST : 256);
finchar = OldCode = GetCode();
if (OldCode == -1)
return TRUE; /* EOF already--get out of here */
putchar((char) finchar); /* first code must be 8 bits = char */
if (ferror(stdout)) /* Crash if can't write */
WriteErr();
stackp = OutputStack;
while ((code = GetCode()) > -1) {
if ((code == CLEAR) && BlockCompress) {
for (code = 255; code >= 0; code--)
TabPrefix(code) = 0;
Clear = TRUE;
FreeEnt = FIRST - 1;
if ((code = GetCode ()) == -1) /* O, untimely death! */
break;
}
InCode = code;
/* Special case for KwKwK string. */
if (code >= FreeEnt) {
*stackp++ = finchar;
code = OldCode;
}
/* Generate output characters in reverse order */
while (code >= 256) {
*stackp++ = TabSuffix(code);
code = TabPrefix(code);
}
*stackp++ = finchar = TabSuffix(code);
/* And put them out in forward order */
do {
putchar(*--stackp);
} while (stackp > OutputStack);
/* Generate the new entry. */
if ((code = FreeEnt) < CodeUpb) {
TabPrefix(code) = (unsigned short) OldCode;
TabSuffix(code) = finchar;
FreeEnt = code + 1;
}
/* Remember previous code. */
OldCode = InCode;
}
fflush(stdout);
if (ferror(stdout))
WriteErr();
return TRUE;
}
SHAR_EOF
cat << \SHAR_EOF > compress.h
/* compress.h */
/* Compress for OS-9 68000 and 6809
* Main porting work by James Jones
* Further porting to 6809 by Jim Omura
*/
#include <stdio.h>
#include <ctype.h>
#include <signal.h>
#include <direct.h>
#ifdef OS9
# define NO_UCHAR
# define short int
# include <os9.h>
#endif
/* End of compress.h */
#define min(a,b) ((a > b) ? b : a)
#define INIT_BITS 9 /* initial number of bits/code */
#define PBITS 12
#define BITS 12
#define HSIZE 5003 /* 80% occupancy */
/*
* a CodeInt must be able to hold 2**BITS values of type int,
* and also -1
*/
typedef short CodeInt;
typedef long int CountInt;
#ifdef NO_UCHAR
typedef char CharType;
#else
typedef unsigned char CharType;
#endif /* UCHAR */
/*
* defines for third byte of header
* Masks 0x40 and 0x20 are free. I think 0x20 should mean that there is
* a fourth header byte (for expansion).
*/
#define BIT_MASK 0x1f
#define BLOCK_MASK 0x80
#define MAXNAME 28 /* maximum file name length */
#define TRUE 1
#define FALSE 0
#define ARGVAL() (*++(*argv) || (--argc && *++argv))
#define MaxCode(NBits) ((1 << (NBits)) - 1)
#define HTab(i) htab[i]
#define CodeTab(i) codetab[i]
/*
* To save much memory, we overlay the table used by compress() with those
* used by decompress(). The TabPrefix table is the same size and type
* as the codetab. The TabSuffix table needs 2 ** BITS characters. We
* get this from the beginning of htab. The output stack uses the rest
* of htab, and contains characters. There is plenty of room for any
* possible stack (stack used to be 8000 characters).
*/
#define TabPrefix(i) CodeTab(i)
#define TabSuffix(i) ((CharType *) (htab))[i]
#define OutputStack ((CharType *) &TabSuffix(1 << BITS))
#define CHECK_GAP 10000 /* ratio check interval */
/*
* the next two codes should not be changed lightly, as they must not
* lie within the contiguous general code space.
*/
#define FIRST 257 /* first free entry */
#define CLEAR 256 /* table clear output code */
SHAR_EOF
# End of shell archive
exit 0
--
Jim Omura, 2A King George's Drive, Toronto, (416) 652-3880
ihnp4!utzoo!lsuc!jimomura
Byte Information eXchange: jimomurapete@wlbr.EATON.COM (Pete Lyall) (02/05/88)
Jim -
Speaking of tools, that copy of 'SHAR' that came through comp.os.os9 the other
day has caused me some grief - crashes. I altered the makefile a
little bit (I use a 'cc' instead of 'cc1', etc..) and it seemed to
'make' alright. I started to test it by invoking it on its own source
directory... I got about four lines ("#This is a Shell Archive" ...
etc.) before it just hammered the daylight out of the Gimix.
Thoughts??
--
Pete Lyall (OS9 Users Group VP)| DELPHI: OS9UGVP | Eaton Corp.(818)-706-5693
Compuserve: 76703,4230 (OS9 Sysop) OS9 (home): (805)-985-0632 (24hr./1200 baud)
Internet: pete@wlbr.eaton.com UUCP: {ihnp4,scgvax,jplgodo,voder}!wlbr!pete