geof@imagen.UUCP (Geof Cooper) (10/07/87)
Here is one, undebugged, that illustrates the concept. It
also uses the trick that (if you have it) you can use 32-bit
two's complement addition and add all the carries in at the
end (another trick that is sometimes faster is to generate
a 32-bit one's complement sum and then add the top and bottom
halves together to get the 16-bit sum). Some C compilers
won't accept the weird syntax below; or maybe I should point
out, as you retch on the floor, that there is at least ONE
C compiler that DOES accept this syntax.
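The parenthetical trick (a 32-bit one's complement sum, folded at the end) might be sketched as follows. This is an illustration only, not code from the post: the names oc_add32 and fold_sum32 are made up here, and the buffer is assumed to be a whole number of 32-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 32-bit one's complement add (end-around carry). */
static uint32_t oc_add32(uint32_t a, uint32_t b)
{
    uint32_t s = a + b;
    return s + (s < a);     /* if the add wrapped, put the carry back in */
}

/* Hypothetical: sum nlongs 32-bit words, then add the top and bottom
 * halves together to get the 16-bit one's complement sum. */
uint16_t fold_sum32(const uint32_t *p, size_t nlongs)
{
    uint32_t sum = 0;

    while (nlongs-- > 0)
        sum = oc_add32(sum, *p++);

    sum = (sum >> 16) + (sum & 0xffff);   /* add the two halves */
    sum = (sum >> 16) + (sum & 0xffff);   /* absorb the carry from that add */
    return (uint16_t)sum;
}
```

This works because 2^16 is congruent to 1 modulo 0xffff, so the high half of each 32-bit word contributes to the sum exactly as if it had been added as a separate 16-bit word.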
It is trivial to code it for all C compilers -- but
what you really want to do is code the exact intent of the
following into assembly language. That makes it a lot faster
to add the two halves of a 32-bit word.
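For compilers that choke on case labels inside a loop, one way to get the same effect is to peel off the leftover words first and then run the unrolled loop. This is a sketch under assumptions, not the author's code: the name checksum_portable is invented here, n is taken to be a count of 16-bit words, and n is assumed small enough (under 65536) that the 32-bit accumulator cannot overflow.

```c
#include <stddef.h>

/* Hypothetical portable variant: same 32-bit accumulate-then-fold idea,
 * but with the remainder handled in its own loop instead of by jumping
 * into the middle of the unrolled loop. */
unsigned short checksum_portable(const unsigned short *p, size_t n)
{
    unsigned long sum = 0;
    size_t nrem = n & 7;        /* words left over after the blocks of 8 */
    size_t nloop = n >> 3;      /* number of full blocks of 8 words */

    while (nrem-- > 0)
        sum += *p++;

    while (nloop-- > 0) {
        sum += p[0]; sum += p[1]; sum += p[2]; sum += p[3];
        sum += p[4]; sum += p[5]; sum += p[6]; sum += p[7];
        p += 8;
    }

    /* add the carries that piled up in the top half back into the low 16 bits */
    sum = (sum >> 16) + (sum & 0xffff);
    sum = (sum >> 16) + (sum & 0xffff);
    return (unsigned short)sum;
}
```

Since one's complement addition is commutative and associative, summing the leftover words first or last gives the same result.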
These tricks don't work for XNS checksums. Our experience is
that this difference alone makes our XNS implementation a little
slower than our TCP implementation on a 68000.
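To see why deferring the carries fails there: as I understand the XNS definition, the checksum is a 16-bit one's complement add-and-left-cycle, so the running sum is rotated after every word and each step depends on the carry from the one before it. The sketch below assumes that definition; the name xns_style_checksum is invented here, and the final adjustment of an all-ones result is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of an add-and-left-cycle checksum over n 16-bit
 * words.  The carry has to be folded in before every rotate, so it
 * cannot be saved up in the top half of a 32-bit accumulator. */
uint16_t xns_style_checksum(const uint16_t *p, size_t n)
{
    uint32_t sum = 0;

    while (n-- > 0) {
        sum += *p++;
        if (sum > 0xffff)
            sum = (sum & 0xffff) + 1;               /* end-around carry */
        sum = ((sum << 1) | (sum >> 15)) & 0xffff;  /* rotate left one bit */
    }
    return (uint16_t)sum;
}
```

Unlike the TCP-style sum, the result depends on the order of the words, which is another way of seeing that the additions cannot be regrouped into 32-bit chunks.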
- Geof
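unsigned short
checksum(p, n)
unsigned short *p;	/* data, as 16-bit words */
short n;		/* number of 16-bit words */
{
	short nloop;
	short nrem;
	unsigned long sum;

	sum = 0;
	if ( n > 0 ) {
		nloop = (n >> 3) + 1;	/* passes, counting the partial first one */
		nrem = n & 7;		/* words added in the partial first pass */
		switch ( nrem ) {	/* jump into the middle of the unrolled loop */
			do {
				sum += *p++;
		case 7:
				sum += *p++;
		case 6:
				sum += *p++;
		case 5:
				sum += *p++;
		case 4:
				sum += *p++;
		case 3:
				sum += *p++;
		case 2:
				sum += *p++;
		case 1:
				sum += *p++;
		case 0:
				;	/* a label needs a statement to attach to */
			} while ( --nloop > 0 );
		}
	}
	/* add the carries collected in the top half back into the low 16 bits */
	sum = (sum >> 16) + (sum & 0xffff);
	sum = (sum >> 16) + (sum & 0xffff);
	return ( sum );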
}
henry@utzoo.UUCP (10/21/87)
> ... Some C compilers won't accept the weird syntax below; or maybe I
> should point out, as you retch on the floor, that there is at least ONE
> C compiler that DOES accept this syntax.
This particular piece of ugliness (switch labels inside a loop, for loop
unrolling) is known as Duff's Device, and is legitimate C that any correct
C compiler is supposed to accept.
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,decvax,pyramid}!utzoo!henry