gordan@maccs.UUCP (gordan) (03/23/88)
In article <123@heart-of-gold> jc@heart-of-gold (John M Chambers x7780 1E342) writes: -Does anyone have a PD version of ping? How about a C-coded routine that -does the IP checksum calculation? We have one that is written in VAX -assembly language, which is OK for a VAX, but it doesn't work too well -on a 68020.... -------------------------------------------------------------------------- -- One's complement of the one's complement sum checksum in TCP/IP -- In the RFC documents, the "one's complement of the one's complement sum" checksums are mentioned in a single paragraph, and are never even described, an omission that seems incredible. Although the algorithm must be well known to many people, a written description seems to be lacking. So here's an attempt to provide a short description (with examples). The only authority I can cite for the following is myself (but it seems to work, judging from actual TCP/IP packets). Someone kindly let me know if this is horribly wrong. Basically, IP, TCP, and UDP require doing one's complement sums of 16-bit words. That is, you must take a bunch of 16-bit words and sum them (ignoring overflows) _as if their bit patterns represented one's complement numbers_. The trick, then, is doing one's complement arithmetic on a two's complement machine. Without going into any arithmetical justification, here's how to do a one's complement sum on a two's complement machine, in pseudocode: INT16 sum; INT16 *word; /* pointer to start of 16-bit words to be summed */ sum = 0; for (i = 0; i < `number of 16-bit words to be summed'; i++) { `byte-swap word[i], if necessary (see comment on byte-order)' sum += word[i]; /* do NOT combine these two lines ... */ sum += `CARRY'; /* ... into sum += word[i] + CARRY !!!! */ } where CARRY is the value of the hardware carry bit (0 or 1), as set by the addition in the previous line (note you mustn't do sum += word[i] + CARRY as one line, since a high-level language could rearrange the order of addition and add the value of the carry bit before it was set). Of course, the value of the carry bit is not accessible from a higher-level language like C. A perfectly equivalent method (very suitable if your machine has 32-bit integers) is: INT32 sum32, word32 INT16 *word; /* pointer to start of 16-bit words to be summed */ sum32 = 0; for (i = 0; i < `number of 16-bit words to be summed'; i++) { `byte-swap word[i], if necessary (see comments on byte-order)' `copy word[i] to word32, zero-extended (NOT sign-extended)' /* (e.g., 0xedcb -> 0x0000edcb, not 0xffffedcb) */ sum32 += word32; } sum = `add the two 16-bit halves of sum32 to each other' This works, since the carry bit values for 16-bit addition of the least significant 16-bit word accumulate in the most significant 16-bit word of the 32-bit sum. (This is probably what you would use on a 68020 -- and you can forget about byte-swapping on a 68020 as well). After calculating a one's complement sum, you have to take its one's complement (invert all the bits) to get the actual checksum used in IP, TCP, and UDP (but note that UDP treats a calculated checksum of 0x0000 as a special case -- see the RFC). It is of course necessary to take byte-order into account. (Byte-order: if adjacent memory locations on a machine contain the following bytes: X : 0x12 X+1 : 0x34 then what is the value of the 16-bit word whose address is X? (assuming a byte-addressable machine and valid alignment for X to be read as a 16-bit value). If the 16-bit value is 0x1234, the machine is said to be ``big-endian;'' if it is 0x3412, the machine is ``little-endian.'' Some machine architectures (Motorola 680x0, etc.) are big-endian, others (Intel 80x86, VAX) are little-endian. TCP/IP headers use big-endian byte-order. Thus, life is easier on a Sun than on a VAX. Some examples follow, using actual packets (see the appropriate RFC docs for IP, UDP, and TCP, and ignore the Ethernet stuff). In case anyone's curious, the IP addresses here are used on a LAN unconnected to any outside network (they do not respect the class A/B/C Internet naming scheme). An Ethernet UDP/IP packet --------------------------------------------------- 1: ff-ff-ff-ff-ff-ff 02-60-8c-09-58-97 08-00 2: 45 00 3: 00-24 00-01 00-00 ff 11 61-31 01-00-58-97 01-00- 4: -00-00 5: 09-46 00-2a 00-10 c9-ca 6: 01 06 4a 48 45 56 7: 41 58 8: 00 00 00 00 00 00 00 00 00 00 --------------------------------------------------- Line 1: Ethernet header Lines 2-8: Ethernet data Lines 2-4: IP header Lines 5-7: IP data Line 5: UDP header Lines 6-7: UDP data Line 8: Garbage padding to satisfy Ethernet minimum packet size (Ethernet header + data >= 60 bytes). --------------------------------------------------- An Ethernet TCP/IP packet ---------------------------------------------------- 1: 08-00-2b-02-d2-67 08-00-02-00-51-23 08-00 2: 45 00 3: 00-4b 44-46 00-00 1e 06 56-3a 01-00-00-0b 01-00- 4: -00-23 5: 00-17 07-a8 06-14-56-f0 d3-1d-aa-a4 50 18 6: 00-68 b1-d0 00-00 7: 0d 0a 0d 0a 4d 63 4d 61 73 74 8: 65 72 20 55 6e 69 76 65 72 73 69 74 79 20 56 41 9: 58 20 38 36 30 30 0d 0a 0d 10: 00 --------------------------------------------------- Line 1: Ethernet header Lines 2-10: Ethernet data Lines 2-4: IP header Lines 5-9: IP data Lines 5-6: TCP header Lines 7-9: TCP data Line 10: Garbage Ethernet padding (to send an even number of bytes) ---------------------------------------------------- In the first packet, the IP Checksum field is 0x6131 (in the middle of line 3). The IP checksum is calculated over all 16-bit words in the header (except the checksum field itself is taken to be zero, prior to actually calculating it). Thus the 16-bit words that go into calculating the IP checksum are (from lines 2,3,4): 0x4500, 0x0024, 0x0001, 0x0000, 0xff11, 0x0000, 0x0100, 0x5897, 0x0100, 0x0000. The 32-bit sum of zero-extended words is 0x 0001 9ecd, so the one's complement sum is 0x9ece. The one's complement of this is the checksum, 0x6131. The UDP Checksum field in the same packet is 0xc9ca (at the end of line 5). Unlike IP, the UDP checksum is calculated not only over the UDP header, but also over the UDP data, and over a pseudo-header consisting of the IP source and destination addresses, the IP Protocol field zero-extended to 16-bits, and a UDP length word. Again the checksum field itself is taken to be zero during the actual calculation, since we can't know its value before actually computing it. Thus the 16-bit words that go into calculating the UDP checksum are (from lines 5,6,7): 0x0946, 0x002a, 0x0010, 0x0000, 0x0106, 0x4a48, 0x4556, 0x4158; (and from the pseudo-header): 0x0100, 0x5897, 0x0100, 0x0000, 0x0011 (UDP protocol number = 0x11 or 17 decimal), and 0x0010 (the UDP length). The 32-bit sum of zero-extended words is 0x 0001 3634, so the 16-bit one's complement sum is 0x3635 and the checksum is 0xc9ca as required. In the second packet shown, the IP checksum is 0x563a (in the middle of line 3). The 16-bit words that go into calculating the IP checksum are (from lines 2,3,4): 0x4500, 0x004b, 0x4446, 0x0000, 0x1e06, 0x0000, 0x0100, 0x000b, 0x0100, 0x0023. The 32-bit sum of zero-extended words is 0x 0000 a9c5, so the one's complement sum is 0xa9c5. The one's complement of this is the checksum, 0x563a. The TCP Checksum field in the same packet is 0xb1d0 (the second word in line 6). Just as for UDP, the TCP checksum is calculated over all 16-bit words in the TCP header, data, and pseudo-header. The 16-bit words that go into calculating the checksum are: From the TCP header: 0x0017, 0x07a8, 0x0614, 0x56f0, 0xd31d, 0xaaa4, 0x5018, 0x0068, 0x0000 (checksum field itself is initially zero), 0x0000. From the TCP data: 0x0d0a, 0x0d0a, 0x4d63, 0x4d61, 0x7374, 0x6572, 0x2055, 0x6e69, 0x7665, 0x7273, 0x6974, 0x7920, 0x5641, 0x5820, 0x3836, 0x3030, 0x0d0a, 0x0d00 (we have an odd number of data bytes, so the last byte is zero-filled on the right to form a 16-bit word). From the pseudo-header: 0x0100, 0x000b (from the source IP address), 0x0100, 0x0023 (from the destination IP address), 0x0006 (zero-extended IP Protocol word, 0x6 = TCP), 0x0037 (the TCP Length, i.e. the length of the TCP header and data). Here, the 32-bit sum of zero-extended words is 0x 0007 4e28, so the 16-bit one's complement sum is 0x4e2f and the checksum is 0xb1d0, as required. Note the TCP length must be calculated as the total IP Length (0x4b in this case) minus the length of the IP header (5 32-bit words in this case, or 0x14 (decimal 20) bytes). The TCP header itself does not store the number of bytes of TCP data, so the TCP layer relies on the IP layer to supply it with this information. This describes how the sender of a packet calculates the checksum. The receiver, on the other hand, can verify the checksum quickly in the following manner: it simply one's complement sums the 16-bit words and checks if the result is 0xffff (except UDP's special case behavior must again be taken into account) -- if it is, the checksum is correct and the packet data is valid. A moment's thought should show why this works. Recall that when the sender calculated the checksum, the checksum field itself was zero; when the receiver looks at the header, however, the field has been filled in. The checksum is the one's complement of the one's complement sum, and whenever a number and its one's complement are added together, the result is 0xffff. -- Many Americans work side by side with space aliens who look human -- but you can spot these visitors by looking for certain Gordan Palameta tip-offs, say experts. mnetor!maccs!gordan
brian@wb6rqn.UUCP (Brian Lloyd) (03/23/88)
The advantage to using checksums with tcp and ip is that generating a checksum is a relatively low-cost operation (the addition instruction in most processors requires relatively few t-cycles). When you begin to talk about extracting pattern information from a bit stream you begin to talk about an increase in processing overhead that is several orders of magnitude greater. To me the greatest feature of the Internet Protocol Suite is that it tends to espouse the KISS principle: elegant simplicity. I would be loath to bog down the protocol with a lot of additional algorithmic baggage. I can see the point that it would be nice to determine that someone is using an off-by-1 counter or has a pointer misalignment problem, but that is not what goes on most of the time. The things you really want to know in an operational network are link quality, throughput, packet loss rate, and delays. What we REALLY need is a protocol that allows us to gather information from the LLPs for transmission so a network control center (either centralized or distributed). We may also want to discuss the use of forward error correction over some links so that you can extract additional information from the bitstream but we would probably want to do that at the link layer where it would be sensible to put it in a front-end processor. I guess that I am responding to a suggestion that smacks of making things more complex. What we really need to do is to make things simpler so that we can make things faster and more reliable. Additional complexity should be compartmentalized in such a way that it can be safely ignored :-). Brian Lloyd, President Sirius Systems, Inc. (301) 540-2066 {bellcore, syscad, cp1, irs3, n3dmc}!wb6rqn!brian Share and enjoy!
karn@thumper.bellcore.com (Phil R. Karn) (03/24/88)
Thanks for an excellent tutorial on checksumming. Some additional notes: The Internet checksum algorithm has an interesting property that is very useful when working on little-endian machines: the sum of the byte-swapped words is equal to the byte-swapped sum of the words. In other words, there is no need to byteswap each word on a little-endian machine before summing it; you can sum the words in machine order and then byteswap the result. If you write an "assist" routine in assembler you can exploit the "add with carry" instruction found in many machines. E.g., on the 808x family, you can accumulate a sum as follows: doit: lodsw adc dx,ax loop doit (once you've set everything up, of course). This can speed up checksumming enormously on 16-bit machines like the PC since the long (32-bit) arithmetic you would normally use to accumulate the carries is much slower. The actual code I use goes further, in that I unwound the loop and used "Duff's Device" to handle blocks that are not integral multiples of the loop length. Phil
cpw%sneezy@LANL.GOV (C. Philip Wood) (03/24/88)
> What we REALLY need is a ... What we could have used was complience, by the many "rush to networking" Vendor's out there, with the Internet IP / ICMP protocol suite. Using the IP options, and various ICMP functions, it could be possible to figure out: > "...link quality, throughput, packet loss rate, and delays..." One can get a glimmer about the above now, cause 'some' gateways and hosts provide 'some' of the features, that work 'some' of the time. So, before you/we go off an' invent another protocol which any number of universities will jump on as a weekend project, and then any number of vendors will grab at sometime in its life cycle and purvey to the public as there own with out support, ... FIX IP AND ICMP! Cornett Philip Wood cpw@lanl.gov Los Alamos National Laboratory
dab@oliver.CRAY.COM (Dave Borman) (03/26/88)
I feel compelled to respond to the recent explanation of the TCP/IP checksum algorithm, because the explanation makes things harder than they really are. So how do you calculate the checksum? You add up all the 16 bit words over the data that you want to checksum, and add all the carrys back into the sum. When you are all done, you take the one's complement of the final sum. For outgoing packets, put 0 into the checksum field, calculate the checksum, and fill it in. For incoming packets, the checksum will calculate to 0 if the data is correct. That is it. Sweet and simple. What about machine byte order? You can ignore it. It doesn't matter. That's one of the great thing about this algorithm. Look at it this way: as you pull the words from memory, they get byte swapped into the register. So, all of our sum is done byte swapped. But when you write the sum back out to memory, it swaps it back for you. Another way of looking at the checksum: it sums up all the even bytes into one sum, and all the odd bytes into another. All of the carries from the even byte sum get added into the odd byte sum, and all of the carries from the odd byte sum get added into the even byte sum. The following table explicitly shows that the calculated checksum would be calculated the same by bytes, Big-endian, and Little-endian byte order, and in 16 bits or 32 bits at a time. Big-endian Little-endian byte 0/1 0x00 0x01 0x0001 0x0100 byte 2/3 0xf2 0x03 0xf203 0x03f2 byte 4/5 0xf4 0xf5 0xf4f5 0xf5f4 byte 6/7 0xf6 0xf7 0xf6f7 0xf7f6 ----- ----- ------- ------- sum1: 0x2dc 0x1f0 0x2ddf0 0x1F2DC 0xdc 0xf0 0xddf0 0xf2dc carrys: 1 2 2 1 ---- ---- ------ ------ sum2: 0xdd 0xf2 0xddf2 0xf2dd 1's comp: 0x22 0x0d 0x220d 0x0d22 cksum back to memory: 0x22 0x0d 0x22 0x0d 0x22 0x0d For doing it 32 bits at a time, (like 4.3BSD does): byte 0/1/2/3 0x0001f203 0x010003f2 byte 4/5/6/7 0xf4f5f6f7 0xf5f4f7f6 ---------- ---------- sum1: 0xf4f7e8fa 0xf6f4fbe8 top half: 0xf4f7 0xf6f4 bottom half: 0xe8fa 0xfbe8 ------- ------- sum2: 0x1ddf1 0x1f2dc 0xddf1 0xf2dc carrys: 1 1 ------ ------ sum3: 0xddf2 0xf2dd 1's comp: 0x220d 0x0d22 back to memory: 0x22 0x0d 0x22 0x0d Since I'm explaining all this, let me explain a tricky point for when your data is not contiguous. Assume your data that you are checksuming is in two pieces, and further that the the first chunk ends on an odd byte boundry, and the second chunk begins on an even byte boundry. You can calculate this in two pieces: one checksum for each piece of data. Then, you swap the bytes of the second piece, and add it into the first piece to get your final sum. (Or swap bytes on the first piece, do the sum, and swap the bytes again. Depending on the implementation, one or the other may be faster...) This gives you the speed of doing word aligned operations, and you only need to add a minimal amount of processing. (note: when the byte count is odd, you fill in the missing part with zeros) 0/1 0x00 0x01 0x0001 2/ 0xf2 0xf200 ------ sum1: 0xf201 4/5 0x03 0xf4 0x03f4 6/7 0xf5 0xf6 0xf5f6 8/ 0xf7 0xf700 ------- sum2: 0x1f0ea sum2: 0xf0ea carry: 1 ------ sum3: 0xf0eb sum1: 0xf201 sum3 byte swapped: 0xebf0 ------- sum4: 0x1ddf1 sum4: 0xddf1 carry: 1 sum5: 0xddf2 This got a little longer than I expected, but hopefully this should clear things up some more. -Dave Borman CRAY Research, Inc.
braden@venera.isi.EDU (03/26/88)
The TCP/IP checksum properties were discussed early in the TCP/IP development. See "TCP Checksum Function Design", IEN45, by Bill Plummer of BBN; the data was almost 10 years ago-- June 1978. Bob Braden
VAF@SCORE.STANFORD.EDU (Vince Fuller) (03/29/88)
The TCP/IP checksum properties were discussed early in the TCP/IP development. See "TCP Checksum Function Design", IEN45, by Bill Plummer of BBN; the data was almost 10 years ago-- June 1978. Bob Braden Are there references to IEN45 in the IP/TCP/UDP specs? If not, there probably should be. As someone who once was looking for the definition of the checksum algorithm, I can say that a pointer to IEN45 would have been useful. Vince Fuller, Stanford Networking -------
braden@VENERA.ISI.EDU (03/29/88)
Several have pointed out that IEN45 on checksums is not online. It is true. Jon Postel and I are getting that fixed, and as soon as it is fixed, I will send a short note to this list. IEN45, while interesting, does not contain the nifty fact that Phil Karn pointed out: the checksum is invariant to arbitrary byte permutations. I don't know where that is documented; it was pointed out to me in 1983 by Peter Higginson of University College London, and enabled me to speed up the IBM MVS ACP checksum routine by about a factor of 2. This is probably one of the great underground tricks in TCP implementations (or do they teach it to undergraduates these days??). We should turn some of the recent messages on this topic into an RFC, or include them as an Appendix to the Host Requirements RFC which is now being drafted. Any volunteers to write about checksum implementations? Bob Braden
Mills@UDEL.EDU (03/30/88)
Bob, Fuzzballs have a long memory. IEN45.TXT is now on faithful fuzzball udel2.udel.edu in the file du3:ien45.txt and is accessable via anonymous FTP with password guest. You probably wouldn't even be surprised about the incredible dusty archives that are still living on du3: after then ten years of Internet wars. Snarfers are welcome, but please be gentle, since there is only one 9600-bps line accessing that fuzzy at my home and it is used for lots of other things. Dave
braden@VENERA.ISI.EDU (03/31/88)
Dave, Yes, indeed, IEN45 is alive and well on udel2 as claimed! I'm impressed at what long memories Fuzzballs have. And big disks, too, apparently. We will make sure the NIC gets a copy of IEN45 for their public files, which will relieve your 9.6bps path. Bob Braden
dab@oliver.CRAY.COM (Dave Borman) (03/31/88)
> Having come in at only at the tail end of this discussion, I don't know > if anyone has pointed out that, assuming a packet length of < 2**16-1 > words, the checksum algorithm can wait to add back all the carry bits > until *after* the checksum loop is completed. I.e.: > > while (computing-the-checksum) { > checksum += buf[i++]; > } > checksum = checksum && 0xFFFF + (checksum >> 16); > checksum = checksum && 0xFFFF + (checksum >> 16); /* sic - the first carry */ > /* "add back" may cause */ > /* a second carry out */ You can only wait to do the folding in of checksum overflow if you are adding 16 bit quantites in a 32 bit sum. The BSD implementation adds 32 bits at a time in a 32 bit sum, so you have to add in the carry bit as you go. On just about any machine it will probaboy be faster to do one memory reference to get the 32 bit quantity, and then add in the carry, rather than doing two 16 bit memory references. (As a side note, on a CRAY computer you do not have a carry bit, so we do the sum by reading up 64 bit quantities, splitting them into two 32 bit quantites, and summing into a 64 bit sum. We then add in all the carry bits at the end.) -Dave Borman CRAY Research, Inc.
CLYNN@G.BBN.COM (03/31/88)
Dave, I've never used a CRAY, but assuming it has unsigned arithmetic, would it be faster to read & split 64-bit quantities similar to sum += (((unsigned) DATA) >> 32) + (0xFFFFFFFF & DATA); or to if ( (sum += DATA) < DATA ) sum++; ? Charlie
droms@BKNLVMS.BITNET (04/01/88)
Having come in at only at the tail end of this discussion, I don't know
if anyone has pointed out that, assuming a packet length of < 2**16-1
words, the checksum algorithm can wait to add back all the carry bits
until *after* the checksum loop is completed. I.e.:
while (computing-the-checksum) {
checksum += buf[i++];
}
checksum = checksum && 0xFFFF + (checksum >> 16);
checksum = checksum && 0xFFFF + (checksum >> 16); /* sic - the first carry */
/* "add back" may cause */
/* a second carry out */
Using this technique, and unrolling the loop to do 32 additions in line,
I was able to reduce the average number of instructions per short word
from ~10 to 3+ on an IBM RT PC (since the RT is a register-to-register
RISC CPU, 3 instructions per short word is the minimum to load the word,
add to the checksum and increment the pointer). This optimization
reduced the time spent in the RT 4.2BSD checksum routine from 10% to
about 1% on large TCP data transfers, with a resulting increase in
throughput of about 10%.
- Ralph Droms
Department of Computer Science
Bucknell University
droms@bknlvms.bitnet
kline@UXC.CSO.UIUC.EDU (Charley Kline) (04/02/88)
When I first started the CTSS TCP/IP project, Dave Clark suggested that I publish the checksum algorithm. His contention was that most checksummers are dreadfully machine-dependent, incomprehensible, and artful things. Well, I didn't put much effort into art, but I thought I'd oblige him. What I do to compute checksums on a Cray is a vector operation in assembler of all things. I can do up to 512 bytes of packet at a time. The core of the algorithm looks as follows. A1 holds the address of the 512-byte block of memory to checksum (I'm leaving out many details having to do with short blocks, because they are ugly). EBM A0 A1 VL 64 use full vectors S1 <32 form 32-bit mask from the right. A2 32 V1 ,A0,1 load packet into V1 V2 S1&V1 Form right-hand 32-bits in V2. V3 V1>A2 Form left-hand 32-bits in V3. V1 V2+V3 Add the two together. A2 63 Prepare to collapse into a scalar. S1 0 S4 <16 Form 16-bit mask from the right. A4 16 CK$LOOP S2 V1,A2 A2 A2-1 A0 A2 S1 S1+S2 JAN CK$LOOP S2 S1&S4 Form right-hand 16-bits in S2 S1 S1>A4 Form left-hand 16-bits in S1 S1 S1+S2 S2 S1&S4 Form right-hand 16-bits in S2 S1 S1>A4 Form left-hand 16-bits in S1 S1 S1+S2 S1 #S1 Take one's complement CMR At this point, S1 contains the checksum. Note that this takes full advantage of the fact that the one's complement sum is an Abelian Group--I can add 32-bit pieces instead of 16-bit pieces, as has been mentioned. I can also take advantage of the fact that the sum of the checksums of several blocks is the same as the checksum of the sum by carrying out this checksum on several blocks and adding up the answers. First I load two copies of the packet into two vector registers. One gets vector-shifted right 32 bits; the other gets vector-ANDED with a 32 bit mask. Then the two are added together. Since all these operations chain, I can produce one result per clock cycle. Then I collapse the results in the vector by adding each element to a scalar register. Finally, the end-around carry and conversion to 16-bits. It's all terribly fast. Now if only context-switching and IPC under CTSS were as fast... ----- Charley Kline, University of Illinois Computing Services kline@uxc.cso.uiuc.edu {ihnp4,uunet,pur-ee,convex}!uiucuxc!kline "Birth. School. Work. Death."
philipp@LARRY.MCRCIM.MCGILL.EDU (Philip Prindeville [CC]) (04/04/88)
[ My apologies for following this up several days late. I am only now starting to catch up on my mail. ] It seems to me that a lot of the discussion over the last couple of weeks has been about designing an error detection that does protocol verification (or aids in debugging, anyway) by doing index-off-by-one detection. Now this is a neat idea (on paper), but I wouldn't want to implement it (at least not entirely in software). Or if I did, I would do it in a protocol analyzer, not in a production implementation... I still believe the greatest aid in finding index-off-by-one and similiar problems is not done by building elaborate debugging into an implementation in an on-line environment (eg. the Internet), but in the TCP/IP bake-offs. I just wish more implementors would perform complete tests (options and everything) against a real variety of implementations before going to market. "Do I work against a Sun? Check. Do I work against a Wollongong? Check..." In a perfect world... -Philip
CERF@A.ISI.EDU (04/05/88)
Philip: I introduced the "off by one" idea, tongue in cheek, in a private message to Jack Haverty who then took up the challenge... actually, I don't think we are ready for clairvoyant software, unless maybe Jack has some ideas I haven't heard yet... Vint
Ariel@en-c06.Prime.COM (04/07/88)
Forwarded article follows:
----------------------------------------------------------------
Hi,
I noted your article about the TCP and IP checksum algorithm
on Usenet. Feel free to post this if you feel it is of interest;
I only have CSnet/ARPAnet access.
The algorithm used by Prime's TCP/IP/X.25 software is very
similar to your "psuedo-code". Note that it is not necessary
to add the carries back in until the end. (Prime 50-series
machines have network byte order)
----------------------------------------------------------------
int checksumIP(ip)
IP_header *ip;
{
uns16 *word;
uns32 sum;
int count;
ip->IP_checksum = 0;
count = ip->IP_IHL * 2;
sum = 0;
word = (uns16 *)ip;
/* magic occurs ... */
while (count--) sum += *word++;
while (sum > 0xFFFF) sum = (sum & 0xFFFF) + (sum >> 16L);
sum = (~sum & 0xFFFF);
ip->IP_checksum = sum;
return;
}
----------------------------------------------------------------
Note that most of the work is in the single instruction while
loop, with a non-complex termination test. The second while
loop is executed at most twice.
A 9955 or 6350 running this in CIX mode blows the doors off of
any poor controller micro-processor.
Robert Ullmann
Prime.COM zone/domain adminitrator
Ariel@en-c06.Prime.COM