adnan@sgtech.UUCP (Adnan Yaqub) (09/27/89)
I have a question about the byte order of TCP/IP checksums. I was looking through the Berkeley code, and it seems to me that the byte order of the checksum is not adjusted for depending on the "endianish" of the host. For example, I see the following code: ip->ip_len = htons((u_short)ip->ip_len); ip->ip_off = htons((u_short)ip->ip_off); ip->ip_sum = 0; ip->ip_sum = in_cksum(m, hlen); error = (*ifp->if_output)(ifp, m, (struct sockaddr *)dst); Since the checksum is a 16-bit quantity, it would see to me that one would get a different value depending on whether the host is big or little endian. I looked at in_cksum_c (I don't have a copy of in_cksum), and it doesn't seem to be endian sensitive. (I may be wrong, because the code is quite tricky.) On the experimental side, I wrote some code to send out UDP datagram with the checksum passed through htons. The result was that the receiving host (an Excelan network analyzer) said the checksum was wrong. I sent XXYY and it wanted YYXX. So, how should the checksum be sent? -- Adnan Yaqub Star Gate Technologies, 29300 Aurora Rd., Solon, OH, USA, +1 216 349 1860 ...cwjcc!ncoast!sgtech!adnan ...uunet!abvax!sgtech!adnan
barmar@kulla (Barry Margolin) (09/28/89)
In article <ADNAN.89Sep27101203@sgtech.UUCP> adnan@sgtech.UUCP (Adnan Yaqub) writes: > ip->ip_sum = in_cksum(m, hlen); >Since the checksum is a 16-bit quantity, it would see to me that one >would get a different value depending on whether the host is big or >little endian. I looked at in_cksum_c (I don't have a copy of >in_cksum), and it doesn't seem to be endian sensitive. (I may be >wrong, because the code is quite tricky.) This is due to an interesting feature of 16-bit one's-complement checksums with end-around carry: it automatically does the right thing! It's because the checksum routine is reading the packet in 16-bit chunks in host byte order. Since it isn't converting from network order to host order before summing, it doesn't have to convert back to network order when storing the sum. There's a rigorous proof (which I don't know) that this always produces the right result. I think the end-around carry is the part of the checksum algorithm that does it; it means that carries out of the high-order byte act just like carries out of the low-order byte, being added into the other byte, so the byte order is not significant (an implementation has to be consistent, though). Barry Margolin, Thinking Machines Corp. barmar@think.com {uunet,harvard}!think!barmar
postel@VENERA.ISI.EDU (09/28/89)
Adnan Yaqub: Please see RFC-1071 on "Computing the Internet Checksum". --jon.
ihm@NRC.COM (Ian H. Merritt) (09/29/89)
adnan@sgtech.UUCP (Adnan Yaqub) says: >I have a question about the byte order of TCP/IP checksums. I was >looking through the Berkeley code, and it seems to me that the byte >order of the checksum is not adjusted for depending on the "endianish" >of the host. For example, I see the following code: > > ip->ip_len = htons((u_short)ip->ip_len); > ip->ip_off = htons((u_short)ip->ip_off); > ip->ip_sum = 0; > ip->ip_sum = in_cksum(m, hlen); > error = (*ifp->if_output)(ifp, m, (struct sockaddr *)dst); > >Since the checksum is a 16-bit quantity, it would see to me that one >would get a different value depending on whether the host is big or >little endian. I looked at in_cksum_c (I don't have a copy of >in_cksum), and it doesn't seem to be endian sensitive. (I may be >wrong, because the code is quite tricky.) > >On the experimental side, I wrote some code to send out UDP datagram >with the checksum passed through htons. The result was that the >receiving host (an Excelan network analyzer) said the checksum was >wrong. I sent XXYY and it wanted YYXX. So, how should the checksum >be sent? >-- >Adnan Yaqub >Star Gate Technologies, 29300 Aurora Rd., Solon, OH, USA, +1 216 349 1860 >...cwjcc!ncoast!sgtech!adnan ...uunet!abvax!sgtech!adnan The definitive document on this is RFC1071. In case you don't have it, excerpts follow: In outline, the Internet checksum algorithm is very simple: (1) Adjacent octets to be checksummed are paired to form 16-bit integers, and the 1's complement sum of these 16-bit integers is formed. (2) To generate a checksum, the checksum field itself is cleared, the 16-bit 1's complement sum is computed over the octets concerned, and the 1's complement of this sum is placed in the checksum field. (3) To check a checksum, the 1's complement sum is computed over the same set of octets, including the checksum field. If the result is all 1 bits (-0 in 1's complement arithmetic), the check succeeds. Suppose a checksum is to be computed over the sequence of octets A, B, C, D, ... , Y, Z. Using the notation [a,b] for the 16-bit integer a*256+b, where a and b are bytes, then the 16-bit 1's complement sum of these bytes is given by one of the following: [A,B] +' [C,D] +' ... +' [Y,Z] [1] [A,B] +' [C,D] +' ... +' [Z,0] [2] where +' indicates 1's complement addition. These cases correspond to an even or odd count of bytes, respectively. On a 2's complement machine, the 1's complement sum must be computed by means of an "end around carry", i.e., any overflows from the most significant bits are added into the least significant bits. See the examples below. . . . 2. Calculating the Checksum This simple checksum has a number of wonderful mathematical properties that may be exploited to speed its calculation, as we will now discuss. (A) Commutative and Associative As long as the even/odd assignment of bytes is respected, the sum can be done in any order, and it can be arbitrarily split into groups. For example, the sum [1] could be split into: ( [A,B] +' [C,D] +' ... +' [J,0] ) +' ( [0,K] +' ... +' [Y,Z] ) [3] (B) Byte Order Independence The sum of 16-bit integers can be computed in either byte order. Thus, if we calculate the swapped sum: [B,A] +' [D,C] +' ... +' [Z,Y] [4] the result is the same as [1], except the bytes are swapped in the sum! To see why this is so, observe that in both orders the carries are the same: from bit 15 to bit 0 and from bit 7 to bit 8. In other words, consistently swapping bytes simply rotates the bits within the sum, but does not affect their internal ordering. Therefore, the sum may be calculated in exactly the same way regardless of the byte order ("big-endian" or "little-endian") of the underlaying hardware. For example, assume a "little- endian" machine summing data that is stored in memory in network ("big-endian") order. Fetching each 16-bit word will swap bytes, resulting in the sum [4]; however, storing the result back into memory will swap the sum back into network byte order. Byte swapping may also be used explicitly to handle boundary alignment problems. For example, the second group in [3] can be calculated without concern to its odd/even origin, as: [K,L] +' ... +' [Z,0] if this sum is byte-swapped before it is added to the first group. Better still would be to obtain a copy of the whole document. Good luck. -i -- US Snail: 2380 Rose Avenue; Oxnard, CA 93030 U.S.A. tel. 805-485-2700 InterNet: ihm@NRC.COM
chris@cs.UMD.EDU (Chris Torek) (10/02/89)
Barry Margolin notes that >There's a rigorous proof (which I don't know) that this always produces >the right result. I think the end-around carry is the part of the >checksum algorithm that does it ... It is. In fact, the whole thing becomes obvious (hence not needing anything other than rigorous handwaving :-) ) if you view the bits as being wrapped around a 16-segment cylinder. When carries out of slot 15 are added back to slot 0, it becomes clear that all the bits are treated identically, and the whole thing is isomorphic under arbitrary rotations (as long as the input and output have the same rotation, that is). Chris