[comp.protocols.tcp-ip] TCP checksums

fish@hpctdlb.HP.COM (Dave Fish - Marketing) (05/28/91)

I'm interested in how common it is for TCP implementations to use all zeros
for the TCP header checksum.  I know that some HP machines do this but 
how common is this in the real world?

henry@zoo.toronto.edu (Henry Spencer) (05/29/91)

In article <3270025@hpctdlb.HP.COM> fish@hpctdlb.HP.COM (Dave Fish - Marketing) writes:
>I'm interested in how common it is for TCP implementations to use all zeros
>for the TCP header checksum.  I know that some HP machines do this but 
>how common is this in the real world?

Not very, I hope.  It has never been legal.  UDP allows omission of the
checksum by this means; TCP does not.  RFC 1122:

         4.2.2.7  TCP Checksum: RFC-793 Section 3.1

            Unlike the UDP checksum (see Section 4.1.3.4), the TCP
            checksum is never optional.  The sender MUST generate it and
            the receiver MUST check it.
-- 
"We're thinking about upgrading from    | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 to SunOS 3.5."              |  henry@zoo.toronto.edu  utzoo!henry

barmar@think.com (Barry Margolin) (05/29/91)

In article <3270025@hpctdlb.HP.COM> fish@hpctdlb.HP.COM (Dave Fish - Marketing) writes:
>I'm interested in how common it is for TCP implementations to use all zeros
>for the TCP header checksum.  I know that some HP machines do this but 
>how common is this in the real world?

You must be talking about the *UDP* checksum, which is optional.  The TCP
checksum isn't optional.

Most BSD-derived Unix systems have a kernel variable that controls whether
UDP checksums are generated and checked (it's generally called something
like "udp_cksum", but I've seen variants without the underscore).  In
recent SunOS releases it's configurable in a header file at kernel build
time.  SunOS defaults to having checksums off.
-- 
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

zweig@parc.xerox.com (Jonathan M. Zweig) (05/30/91)

In 1's complement arithmetic there are two ways of writing "zero". In C they
are 0x0000 and 0xffff (16-bit).  I can't see any reason why the checksum would
need to always come out to be nonzero (i.e. 0x0000 could happen).

Consider the 16-bit ones complement of the 16-bit ones complement sum of a
bunch of numbers that happen to add up to 0xffff (such as, say, 0xff00 and
0x00ff with a bunch of 0x0000's too). Yikes! It is all zeroes.
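
A minimal C sketch of that example, assuming plain end-around-carry
addition on 16-bit words. The names oc_add16 and inet_cksum are made up
for illustration and are not taken from any particular TCP implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones-complement add of two 16-bit values with end-around carry. */
uint16_t oc_add16(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint16_t)((s & 0xffffu) + (s >> 16));
}

/* Bitwise complement of the ones-complement sum of n 16-bit words. */
uint16_t inet_cksum(const uint16_t *w, size_t n)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum = oc_add16(sum, w[i]);
    return (uint16_t)~sum;
}
```

With the words 0xff00, 0x00ff, 0x0000, 0x0000 the sum is 0xffff, so
inet_cksum() returns 0x0000.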

The TCP checksum is never optional (though you can use a different one if you
like, by supporting RFC1146-style checksum algorithm negotiation), but it is
not clear to me that it can never be all zeroes.

In fact, I can't figure out a way that the checksum would ever be 0xffff
("negative zero").  If you can think of a set of 16-bit values whose TCP
checksum is 0xffff let me know.  Since 0xffff + 0xffff = 0xffff (and not
0x0000), I can't figure out a sum that comes out to 0x0000 in order to get
complemented to 0xffff by the vanilla TCP checksum algorithm.

-Johnny Checksum

zweig@parc.xerox.com (Jonathan M. Zweig) (05/31/91)

In fact, I can prove that if the ones-complement arithmetic is done the way I
learned it at my mother's knee, then the TCP checksum field will never contain
0xffff, though it can contain 0x0000.  I think it was dumb for UDP to use
0x0000 as the "no checksum" value, when 0xffff is impossible (i.e. you could
implement a checksum algorithm that doesn't mess around with substituting -0
for 0 since it will never happen in a real datagram).

The proof is based on the fact that the ones complement (1C) sum of a set of
numbers is 0 (I will use 0 to mean 0x0000 and -0 to mean 0xffff) if and only
if all of the numbers are 0.

(<==) trivial.  Add 0 to 0 and you get 0.
(==>) observe that if the set of numbers to be added contains any nonzero
(i.e. not 0x0000) number, there will be a 1 in at least one of the bit
positions of one of the addends.  The only way the corresponding bit in the
sum could be 0 is if there are an even number (>=2) of 1's in that column of
the sum.  But wait! That means there will be a carry out of that position.
Since 1C addition has end-around carry (i.e. you can perform the addition
bitwise as long as you take the carry bit out of the leftmost single bit
addition and add it back to the sum), that carry bit can't "go away". That is,
the only way you will stop carrying is by adding the 1 to a bit of the sum
that is 0.  There is no way in 1C addition to add 1 to anything and have it
end up as 0 (it will be -0, possibly, but never 0).  This means that the carry
bit out the end of any pairwise addition results in a nonzero sum. By
induction, we see that there cannot exist a set of numbers not all 0 that add
up to 0.

Since RFC793 defines the checksum as the bitwise complement of the 1C sum of
all the 16-bit words in a datagram, we see that since the sum can never be 0,
the complement can never be -0.
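
The claim can also be checked by brute force. The sketch below is an
illustration rather than a substitute for the proof: it assumes
end-around-carry addition with no -0 normalization, and it takes the word
width as a parameter (the hypothetical names oc_add and count_zero_sums
are not from the original post) so that an exhaustive run over a narrow
width finishes quickly.

```c
#include <stdint.h>

/* Ones-complement add on `bits`-wide values (1 <= bits <= 16),
 * folding the carry out of the top bit back into the low end. */
uint32_t oc_add(uint32_t a, uint32_t b, unsigned bits)
{
    uint32_t mask = (1u << bits) - 1u;
    uint32_t s = a + b;
    return (s & mask) + (s >> bits);
}

/* Count pairs (a, b), not both zero, whose sum is +0 (all bits clear). */
unsigned long count_zero_sums(unsigned bits)
{
    uint32_t limit = 1u << bits;
    unsigned long hits = 0;

    for (uint32_t a = 0; a < limit; a++)
        for (uint32_t b = 0; b < limit; b++)
            if ((a | b) != 0 && oc_add(a, b, bits) == 0)
                hits++;
    return hits;
}
```

count_zero_sums(8) examines all 65536 pairs of 8-bit values and finds no
pair that sums to +0, which matches the pairwise step of the induction.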

I am told that there are brain-damaged (to my mind) ways of implementing 1C
arithmetic that automagically substitute 0 for -0 if it is the result of an
addition.  So we can blame this boneheadedness with the 0/-0 switcheroo on
them, I suppose.

Does anyone know why UDP chose 0 (rather than -0) to indicate no checksum? Was
someone smoking something?  It seems to me it has complicated (albeit by only
a couple of instructions) every TCP and UDP checksum calculation ever, at a
cost to society of millions of dollars (add up all those cycles for me,
please).

Of course, it's probably far too late to start making a fuss about it now.

-Johnny Checksum

dab@BERSERKLY.CRAY.COM (David Borman) (05/31/91)

> In 1's complement arithmetic there are two ways of writing "zero". In C they
> are 0x0000 and 0xffff (16-bit).  I can't see any reason why the checksum would
> need to always come out to be nonzero (i.e. 0x0000 could happen).
> 
> Consider the 16-bit ones complement of the 16-bit ones complement sum of a
> bunch of numbers that happen to add up to 0xffff (such as, say, 0xff00 and
> 0x00ff with a bunch of 0x0000's too). Yikes! It is all zeroes.
> 
> The TCP checksum is never optional (though you can use a different one if you
> like, by supporting RFC1146-style checksum algorithm negotiation), but it is
> not clear to me that it can never be all zeroes.
> 
> In fact, I can't figure out a way that the checksum would ever be 0xffff
> ("negative zero").  If you can think of a set of 16-bit values whose TCP
> checksum is 0xffff let me know.  Since 0xffff + 0xffff = 0xffff (and not
> 0x0000), I can't figure out a sum that comes out to 0x0000 in order to get
> complemented to 0xffff by the vanilla TCP checksum algorithm.
> 
> -Johnny Checksum

An interesting observation, and not very hard to prove.

If you look at the sequence of numbers incrementing by 1 at
the wrap around point, you have:

		... FFFE, FFFF, 0001, 0002 ...

Notice that 0000 is skipped, because:

		FFFF + 0001 = 10000 = 0001.

So, the only way that the 1's sum would be 0000 would be if each of
w1 through wn were equal to zero.
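
The wrap-around sequence can be reproduced with a few lines of C. This is
only a sketch, assuming end-around-carry addition; oc_inc16 and oc_count
are names invented here.

```c
#include <stddef.h>
#include <stdint.h>

/* Add 1 in 16-bit ones-complement arithmetic (end-around carry). */
uint16_t oc_inc16(uint16_t x)
{
    uint32_t s = (uint32_t)x + 1u;
    return (uint16_t)((s & 0xffffu) + (s >> 16));
}

/* Fill out[0..n-1] with start, start+1, start+2, ... */
void oc_count(uint16_t start, uint16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = start;
        start = oc_inc16(start);
    }
}
```

Starting oc_count() at FFFE yields FFFE, FFFF, 0001, 0002: the carry out
of FFFF + 0001 is folded back in, so 0000 never appears.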

If you take the 1's complement of the sequence, you have:

		... 0001, 0000, FFFE, FFFD

Which is just the sequence of numbers decrementing by 1, at the
wrap around point.

So, just as addition will never yield a value of 0000, subtraction
will never yield a value of FFFF.  Another way of stating it is that
when the 1's sum wraps, it will give you -0, when the 1's difference
wraps, it will give you +0.

If you wanted to, since 0000 and FFFF are both identity elements, if
the 1's sum was computed to be FFFF, you could replace it with 0000,
and send the 1's complement of it across as the checksum, and everything
will work just fine; on the destination machine the checksum will still
compute as valid.

In fact, if you look at the BSD udp_output() code, if the checksum
is calculated to 0000, it is replaced with FFFF to ensure that the
destination machine will verify the checksum rather than skip the
checksum.  Because the TCP checksum is required, it doesn't bother
to change 0000 into FFFF.
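
A rough sketch of that substitution and of why the receiver still accepts
the result. This is not the BSD source; oc_sum, udp_cksum_field and
udp_cksum_ok are hypothetical names, and w[] is assumed to hold the
pseudo-header, the UDP header with its checksum field set to zero, and the
data, already split into 16-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones-complement sum of n 16-bit words, carries folded at the end. */
uint16_t oc_sum(const uint16_t *w, size_t n)
{
    uint32_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += w[i];
    while (s >> 16)
        s = (s & 0xffffu) + (s >> 16);
    return (uint16_t)s;
}

/* Sender: value to place in the UDP checksum field. */
uint16_t udp_cksum_field(const uint16_t *w, size_t n)
{
    uint16_t ck = (uint16_t)~oc_sum(w, n);
    return ck == 0 ? 0xffff : ck;   /* 0000 would mean "no checksum" */
}

/* Receiver: the sum including the checksum field must come out as FFFF. */
int udp_cksum_ok(const uint16_t *w, size_t n, uint16_t field)
{
    uint32_t s = (uint32_t)oc_sum(w, n) + field;
    s = (s & 0xffffu) + (s >> 16);
    return s == 0xffffu;
}
```

For the words FF00 and 00FF the computed checksum is 0000, so the sender
transmits FFFF instead. Arithmetically the receiver's check passes with
either value, because adding 0000 or FFFF leaves a sum of FFFF unchanged.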

			-David Borman, dab@cray.com

sra@lcs.mit.edu (Rob Austein) (06/03/91)

One algorithm for computing the ones-complement checksum on a
twos-complement machine works by explicitly computing a sum modulo
0xFFFF.  The actual summation can be done in a larger word, and the
properties of the ones-complement sum can be preserved by occasionally
subtracting some large multiple of 0xFFFF.  The final step of this
algorithm returns the ones-complement of the quantity (sum mod
0xFFFF).  Thus, this algorithm will always return 0xFFFF instead of
0x0000.  So, at least in this case, making 0x0000 the excluded value
makes sense.
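
On a machine with 64-bit integers the same idea can be sketched as
follows. This is an illustrative version only: the name cksum_mod is
invented here, and the wide accumulator stands in for the "occasional
subtraction" step, which is not needed for inputs this small.

```c
#include <stddef.h>
#include <stdint.h>

/* Checksum computed as the complement of (sum mod 0xFFFF).
 * The remainder lies in 0..0xFFFE, so the result is never 0x0000. */
uint16_t cksum_mod(const uint16_t *w, size_t n)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += w[i];
    return (uint16_t)((sum % 0xFFFFu) ^ 0xFFFFu);
}
```

For the words 0xff00 and 0x00ff the sum is 0xFFFF, the remainder is 0, and
cksum_mod() returns 0xFFFF, where an end-around-carry implementation would
return 0x0000.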

On certain machines (e.g., the PDP-10, where normal fixed-point
arithmetic operates on 36-bit words) this is by far the easiest way to
compute the checksum, because the loop can sum 32-bit quantities
instead of 16-bit quantities without changing the result.

For those interested in the nitty-gritty, here's the UDP checksum code
from the CHIVES domain resolver.  The macro L32INT() obtains a 32-bit
integer from a 36-bit word; this involves some bit-shifting, but you
can think of it as a simple array reference.  ZERO_MOD_0xFFFF is the
largest multiple of 0xFFFF which can be represented as a positive
36-bit twos-complement quantity.

The basic algorithm is from the checksum routine in the ITS monitor.

  int udp_chksum(pkt)
      char *pkt;
  {
      int n, sum, *u;
      struct ip_header *ip_h;
      struct udp_header *udp_h;

      /* Initialize pointers, compute data length */
      ip_h = IP_HEADER(pkt);
      u = (int *) (udp_h = UDP_HEADER(pkt));
      n = ip_h->len - (ip_h->ihl * 4);

      /*
       * Initial sum is pseudo-header:
       *  IP source and destination addresses
       *  IP protocol number
       *  UDP data length (which gets added again as part of the UDP header!)
       * plus whatever bytes are in the last word of the packet buffer.
       */
      sum = ip_h->sh + ip_h->dh + ip_h->pro + udp_h->ln
	  + (L32INT(u[n/4]) & (~0 << (8 * (4 - (n % 4)))));

      /* Sum everything else, folding when necessary. */
      n /= 4;
      while(--n >= 0)
	  if((sum += L32INT(*u++)) < 0)	/* Carried into the sign bit? */
	      sum -= ZERO_MOD_0xFFFF;	/* Yeah, fix that */

      /* Final folding, return complement. */
      return((sum % 0xFFFF) ^ 0xFFFF);
  }

--Rob Austein

tad@wrq.com (Tad Marshall) (06/03/91)

In article <1991May28.221045.27724@Think.COM> barmar@think.com writes:
>>I'm interested in how common it is for TCP implementations to use all zeros
>>for the TCP header checksum.  I know that some HP machines do this but 
>>how common is this in the real world?
>
>You must be talking about the *UDP* checksum, which is optional.  The TCP
>checksum isn't optional.

  Actually, on HP 3000s (at least) the TCP checksum *IS* optional.  Not that
  this makes it "legal", but in the "real world" we do want to work with
  existing implementations.  Such HP implementations definitely exist.

  My understanding of the optional UDP checksum is that a bug in BSD 4.2
  would prevent UDP packets with checksums from being understood (I forget
  if the bug was on the sending or the receiving side).  In any case, most
  implementors would turn off UDP checksumming (i.e. send 0000) in order to
  not hit this bug.

  In answer to the original question, I think that HP is unique in permitting
  TCP checksumming to be turned off.  On *incoming* TCP sessions, a 3000 will
  decide to use real checksumming if the incoming SYN packet has a real
  checksum and will use a checksum of zero if the incoming SYN packet has a
  zero checksum (if checksumming has been turned off on the 3000).
  On *outgoing* sessions, the 3000 will use a zero checksum when checksumming
  is off.  As expected, this can cause interoperability problems with non-HP
  systems.

Tad Marshall -- software developer -- Walker Richer & Quinn, Inc.  Seattle, WA

louie@sayshell.umd.edu (Louis A. Mamakos) (06/06/91)

In article <9105311628.AA24660@berserkly.cray.com> dab@BERSERKLY.CRAY.COM (David Borman) writes:
>So, just as addition will never yield a value of 0000, subtraction
>will never yield a value of FFFF.  Another way of stating it is that
>when the 1's sum wraps, it will give you -0, when the 1's difference
>wraps, it will give you +0.

Err.., excuse me.  Those of us with hardware that actually does one's
complement (horrors!) arithmetic can depend on the hardware never
generating a -0 as the result of any arithmetic operation.  So, in
fact, I do get checksums computed with a value of 0x0000, which might
be turned into a 0xffff when you take the one's complement.
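
To make the difference concrete, here is a small C model of an adder that
never produces -0. It is an assumption-laden sketch (the name
oc_add16_norm is invented, and real ones-complement hardware may differ in
detail); it simply replaces a 0xffff result with 0x0000.

```c
#include <stdint.h>

/* 16-bit ones-complement add that normalizes a -0 result to +0. */
uint16_t oc_add16_norm(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    uint16_t r = (uint16_t)((s & 0xffffu) + (s >> 16));

    return r == 0xffffu ? 0x0000 : r;
}
```

Under this model 0xff00 + 0x00ff gives 0x0000 rather than 0xffff, so the
complemented checksum is 0xffff.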

Of course, my hardware also has 9-bit bytes...

louie