[comp.lang.c] CRC-16 in C

djones@megatest.UUCP (Dave Jones) (10/31/90)

Recently, in connection with the topic of hashing, Ron Irvine
posted a C algorithm for calculating the CRC-16 number for
a null-terminated string. I decided to grab it and put it
in my utilities library. His routine was based on two sixteen-entry
tables, because, he said, they would fit in cache. Well one
machine I use has a much larger cache than that, and the other
has none, so I decided just for the fun of it to convert the algorithm
to one based on one 256-entry table. I did that, and testing
has verified that it produces the same results.

Now then, for purposes of documentation, and to verify the original
algorithm, (and for my own education I guess), I decided to figure
out what the thing actually does. I borrowed a book, _Computer Networks,
Second Edition_, by Andrew S. Tanenbaum, (Simon & Schuster) and dug
in. I coded up the algorithm they gave, using the example on page
210, fig. 4-7.(*) My code printed out intermediate results exactly
matching those in the example. So far so good. Now I changed the
two constants that in theory should change the example check-sum into the
CRC-16 checksum. The result was that I got different numbers from
my algorithm than from the posted one. Could this be because of some
Big-endian/Little-Endian kind of disagreement? Does the posted algorithm
in effect shift the bits out least-significant bit first, for example?
Also the book specifies that the input is first tagged at the end with
16 zero-bits before doing the calculation. Does the algorithm published
on the net effectively do that? It does not appear to, but I confess I
have not yet figured out why the posted algorithm does anything similar
to what the book describes.

I guess what I am asking for is the "official" definition of
a CRC-16 checksum as it applies to byte-streams rather than bit-streams,
and why I don't get the same numbers that the posted algorithm gets.

In another posting, you'll find a shar of the files under discussion
here, including the "bit-shifting" toy algorithm based on what I
read in the book.


(*) Actually I had to change it a little, because the algorithm described
    was obviously wrong. It says, "A divisor is said to 'go into'
    a dividend if the dividend has as many bits as the divisor."
    I interpreted that to mean instead, "if the dividend and divisor have
    the same base-two log," which is consistent with the example,
    and with the statement that the algorithm can be implemented with
    "a simple shift register circuit".

ron@scocan.sco.COM (Ron Irvine) (11/06/90)

> in my utilities library. His routine was based on two sixteen-entry
> tables, because, he said, they would fit in cache. Well one
> machine I use has a much larger cache than that, and the other
> has none, so I decided just for the fun of it to convert the algorithm
> to one based on one 256-entry table. I did that, and testing
> has verified that it produces the same results.

Yes a 256 table can be faster if it is in cache.  But if you
are hashing a string for lookup in a table you may boot out some
other useful data in the cache.  The nibble table (2 * 16) has
a better chance of cache hits than the sparse 256 table, and less
chance of booting out other useful data, but it takes more work
to calcutate a crc.  Your mileage may vary.

> Now then, for purposes of documentation, and to verify the original
> algorithm, (and for my own education I guess), I decided to figure
> out what the thing actually does.

There are two popular 16 bit crc calculations.  The one I
used is the crc16 used for DDCMP and Bisync communications.
Its polynomial is: x^16 + x^15 + x^2 + 1, with an initial
value of 0. I call it crc16().  The other common crc is used
in ADCCP, HDLC and SDLC; its polynomial is: x^16 + x^12 + x^5 + 1,
with an initial value of -1.  I call it ccitt(). If you are
using the crc to hash you may want to play with both or invent
your own.


-----------  Here is some more code  -----------------

/*	Computing a POLY number from the crc equation.
 *	Crc's are usually expressed as an polynomial expression such as:
 *
 *		x^16 + x^12 + x^5 + 1
 *
 *	Since the highest order term is to the power 16 this is a
 *	16 bit crc.  The POLY number is determined by setting bits
 *	corresponding to the power terms in the polynomial equation
 *	in an integer.  To do this we number the bits of the integer
 *	with the most significant bit = bit 0. CAUTION: this is the
 *	opposite to the some bit numbering schemes.  This is due to
 *	the least significant bit first convention of crc calculations.
 *	The above equation becomes:
 *
 *	  1   0   0   0   0   1   0   0   0   0   0   0   1   0   0   0
 *	|___.___.___.___|___.___.___.___|___.___.___.___|___.___.___.___|
 *	  0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
 *	msb                                                          lsb
 *
 *	Note: the "1" term is equivalent to "x^0" and the "x^16"
 *	term is ignored (it does determine the length to be 16 bits).
 *	Thus the POLY number is 0x8408 (ccitt crc).
 */

/* crc16 - DDCMP and Bisync (16 bit)
 *	x^16 + x^15 + x^2 + 1
 */
#define	POLY_CRC16	0xa001
#define	INIT_CRC16	0

/* CCITT - ADCCP, HDLC, SDLC (16 bit)
 *	x^16 + x^12 + x^5 + 1
 */
#define POLY_CCITT	0x8408
#define	INIT_CCITT	-1


/* a bit at a time */
unsigned crc_16(in)
	char *in;
{
	register unsigned short crc, bit;
	register int i;

	crc = INIT_CRC16;
	while (*in) {
		for (i=1; i <= 0x80; i<<=1) {
			bit = (((*in)&i)?0x8000:0);
			if (crc & 1) bit ^= 0x8000;
			crc >>= 1;
			if (bit) crc ^= (int)(POLY_CRC16);
  		}
		++in;
	}
	return crc;
}


/* For even more fun - ccitt() crc routine */

/* nibble table for CCITT crc */
unsigned int ccitt_h[] = {
0x0000, 0x1081, 0x2102, 0x3183, 0x4204, 0x5285, 0x6306, 0x7387,
0x8408, 0x9489, 0xa50a, 0xb58b, 0xc60c, 0xd68d, 0xe70e, 0xf78f, };

unsigned int ccitt_l[] = {
0x0000, 0x1189, 0x2312, 0x329b, 0x4624, 0x57ad, 0x6536, 0x74bf,
0x8c48, 0x9dc1, 0xaf5a, 0xbed3, 0xca6c, 0xdbe5, 0xe97e, 0xf8f7, };

unsigned short
ccitt(in, size)
	register unsigned char *in;
{
	register unsigned int	n, crc;

	crc = -1;
	while (*in) {
		n = *in++ ^ crc;
		crc = ccitt_l[n&0x0f] ^ ccitt_h[(n>>4)&0x0f] ^ (crc>>8);
	};
	return(crc);
}