[comp.os.minix] Comic

nall@sun8.scri.fsu.edu (John Nall) (07/10/90)

This may be of interest to some, so I'll post it.  (We have
a fairly aggressive version of Pnews here, which makes one
feel guilty if one posts something which may not be of general
interest.  It keeps asking questions like "Are you *sure* that
people want to see this?" and "This costs money, you know.  Are
you *sure* you should do it??")

The fairly significant increase in the compression ratio of
Comic is attractive, so I played with it some on our Cray Y-MP
to see how it does there.  It is horrible!  (In terms of speed.
Worked OK, though, with no changes, under Unicos).

The compression is good enough, however, so that it would bear
some further investigation.  Two possibilities arise (which are
not necessarily non-exclusive):  (a) vectorizing the code, and
(b) profiling it and then going to assembly language for those
parts which show up as chewing up all the execution time.

If anyone else is doing the same thing, I would be interested
in their results.


--
John W. Nall		| Supercomputation Computations Research Institute
nall@sun8.scri.fsu.edu  | Florida State University, Tallahassee, FL 32306
     "Los dioses eran computadores.  ?Que otra cosa podian ser??"

kirkenda@eecs.cs.pdx.edu (Steve Kirkendall) (07/11/90)

In article <228@sun13.scri.fsu.edu> nall@sun8.scri.fsu.edu (John Nall) writes:
>
>The fairly significant increase in the compression ratio of
>Comic is attractive, so I played with it some on our Cray Y-MP
>to see how it does there.  It is horrible!  (In terms of speed.
>Worked OK, though, with no changes, under Unicos).
					   ^^^^^^
Damn, I was hoping you had ported Minix to your Cray.   :-) of course.

The biggest bottleneck is the memrchr() function -- which has already been
converted to assembler for the intel crowd.  I wrote a version in 68000
assembler (included below) and this cut the compression time in half. It still
runs like a one-legged cow, but at least now its a HEALTHIER one-legged cow.

The next problem is that comic outputs the compressed data one bit at a time,
via the putbit() function.  It uses a 1024-byte output buffer, so we don't have
a huge system overhead, but still we invest a lot of computing time to the
processing of each individual bit.  There is also a function for outputting
several bits at once, but it just call putbit() repeatedly, in a loop.
I suspect that we could gain something by rewriting putnbits() (or whatever
its called) to do the bit stuffing directly instead of calling putbit().
-------------------------------------------------------------------------------
Steve Kirkendall    kirkenda@cs.pdx.edu    uunet!tektronix!psueea!eecs!kirkenda

echo x - memrchr68.s
sed '/^X/s///' > memrchr68.s << '/'
X	.sect	.text
X	.sect	.rom
X	.sect	.data
X	.sect	.bss
X	.list
X	.sect	.text
X
X!-------------------------------------------------------------------------!
X! Minix-68k version of the memrchr() function, in 68000 assembly language !
X! Written by Steve Kirkendall, 1990 July 10                               !
X!-------------------------------------------------------------------------!
X
XSTART	=	4		! stack offsets of arguments
XSTOP	=	8		!
XCH	=	13		!
X
X_memrchr:			! start function
X	.globl	_memrchr	!
X				!-
X	move.l	START(sp),a0	! a0 is "start"
X	move.l	STOP(sp),a1	! a1 is "stop"
X	move.b	CH(sp),d0	! d0 is "ch"
X				!-
X	move.b	(a1),d1		! d1 holds the value that the sentinal clobbers
X				!-
X	move.b	d0,(a1)		! install the sentinal
X				!-
X	add.w	#1,a0		! compensate for the initial predecrement
X				!-
Xloop:				!
X	cmp.b	-(a0),d0	! unrolled search loop
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	beq	found		!
X	cmp.b	-(a0),d0	!
X	bne	loop		!
X				!-
Xfound:				! we've found a match, or the sentinel
X				!-
X	move.b	d1,(a1)		! remove the sentinel
X				!-
X	cmp.l	a0,a1		! did we hit the sentinel?
X	beq	notfound	!
X				!-
X	move.l	a0,d0		! we found a match, so return a pointer to it
X	rts			!
X				!-
Xnotfound:			! we hit the sentinel, so return NULL
X	clr.l	d0		!
X	rts			!
/
-------------------------------------------------------------------------------
Steve Kirkendall    kirkenda@cs.pdx.edu    uunet!tektronix!psueea!eecs!kirkenda

nfs@cs.Princeton.EDU (Norbert Schlenker) (07/11/90)

In article <228@sun13.scri.fsu.edu> nall@sun8.scri.fsu.edu (John Nall) writes:
>The compression is good enough, however, so that it would bear
>some further investigation.  Two possibilities arise (which are
>not necessarily non-exclusive):  (a) vectorizing the code, and
>(b) profiling it and then going to assembly language for those
>parts which show up as chewing up all the execution time.

Bruce Evans has posted profiling data for 16 bit PC Minix.  I have
similar (but more outrageous) results from some Unix machines here.
The bottleneck is in memrchr(), which takes ~70% of the time used
by comic (this is the C version now, cmemrchr.c).  Since the PC
Minix code already has memrchr() in assembly, it's clear that
another approach is required entirely.

Kent Williams (I think, but I can't find the note now) has suggested
hashing to find strings, a la compress.  I suspect it is the only
chance that comic has.

On a related note, has anyone gotten lharc to compile on a Unix box?
Every one I've tried here fails to compile lhio.c without error, 
apparently because of an incompatibility with <dirent.h>.  Is there
an easy fix?

On a further related note, one can compile lzhuf.c with -DSELFMAIN
to get a (simpleminded) version which does essentially what compress
does, with much better compression.  The C version is about 1/3 as
fast as compress (using PC ACK); a few tweaks to the assembly code
allow it to run only about 50% slower.

Norbert

wayne@csri.toronto.edu (Wayne Hayes) (07/12/90)

In article <946@rossignol.Princeton.EDU> nfs@cs.Princeton.EDU (Norbert Schlenker) writes:
>In article <228@sun13.scri.fsu.edu> nall@sun8.scri.fsu.edu (John Nall) writes:
>On a further related note, one can compile lzhuf.c with -DSELFMAIN
>to get a (simpleminded) version which does essentially what compress
>does, with much better compression.  The C version is about 1/3 as
>fast as compress (using PC ACK); a few tweaks to the assembly code
>allow it to run only about 50% slower.

The LHARC I posted to Minix works also on Sun 4.3.

-- 
"The number of programs that can be done with the HST has always greatly
exceeded the time available for their execution, and this remains true even
with the telescope in its current state." -- HST Science Working Group and
User's Commitee Report, 1990 June 29.
Wayne Hayes	INTERNET: wayne@csri.utoronto.ca	CompuServe: 72401,3525

cgates@netcom.UUCP (Craig Gates) (07/12/90)

In article <946@rossignol.Princeton.EDU>, nfs@cs.Princeton.EDU (Norbert Schlenker) writes:
> 
> On a related note, has anyone gotten lharc to compile on a Unix box?
> Every one I've tried here fails to compile lhio.c without error, 
> apparently because of an incompatibility with <dirent.h>.  Is there
> an easy fix?
>
I compiled lharc using the file makefile.orig with no problem on a
Sun 110 running SUN OS 4.0.3. The other makefile didn't seem to work too
well on my system.  LHARC is none too fast bust it sure beats the pants
off of compress.
       
Craig

 

ast@cs.vu.nl (Andy Tanenbaum) (07/14/90)

In article <228@sun13.scri.fsu.edu> nall@sun8.scri.fsu.edu (John Nall) writes:
>Comic is attractive, so I played with it some on our Cray Y-MP
>to see how it does there.  It is horrible!  (In terms of speed.
>Worked OK, though, with no changes, under Unicos).
>
>The compression is good enough, however, so that it would bear
>some further investigation.  

I just got back from the UK UNIX Users' Group meeting in London where
I was a speaker, along with Ken Thompson, Dennis Ritchie, Brian Kernighan,
Rob Pike, Dave Presotto, Stu Feldman, and quite a few other stellar
personalities.  All of those guys are great speakers, so if you ever
get the chance to hear them at conferences, don't miss it. I definitely
feel hono(u)red to be invited to join in with those folks.

The conference gave a nice example of man being humbled by nature.  The
conference organi{z/s}ers thought it would be nice to have terminals
available with internet access, so attendees could telnet to their home
machines to read mail.  They rented about 20 X-terminals and set them up
in the conference hotel.  British Telecom renegd on its promise to provide
a leased line to Imperial College nearby, so the conference people
got an infrared link and installed it on the roof of the hotel, along
with another one at Imperial College 1 km away.  They tested it the night
before, and it worked perfectly.

However it refused to work in the day time.  Only at night.  It seems that
the heat from the sun striking the hotel's roof causes the air to shimmer,
distorting the infrared signal so badly that transmission is impossible.
The best laid plans of mice and joysticks ...

Anyway, if anyone has good ideas about how to fix comic to make it faster
without hurting the compression too much, I'm all for it.  The net postings
could be considerably reduced if comic were used instead of compress.
Furthermore, the number of disks in the distribution from P-H could also
be sharply reduced, which would greatly reduce the price.  I think 1.5
will cost $169.  A substantial reason for the increase from $116 (1.3
in slipcase form, with manual), is the jump from 10 disks to 17. (As I
mentioned before, there will be a discount for people who send in a P-H
boot disk from 1.1 - 1.3; more about that later when I find out).

I'm going away again tomorrow.  I'll be back July 31.

Andy Tanenbaum (ast@cs.vu.nl)