mark@umcp-cs.UUCP (05/22/84)
We have a 4.2BSD Unix machine and a 4.1BSD Unix machine on the same ethernet. The 4.1 machine runs BBN IP/TCP with Chris Kent's gatewaying code, and it also talks through an LHDH to the arpanet.

The problem is that when transmitting a large amount of data from the 4.2 machine to the 4.1 machine, the 4.1 machine prints out lots and lots of "tcp checksum" errors. This only occurs when attempting to transmit lots of data at once, such as ftp'ing a multi-megabyte file (or sometimes just doing an 'ls' of a large directory over telnet!). The quantity of errors is large enough that tcp usually drops the connection. Transfers the other way (from 4.1 to 4.2) work fine.

It used to be that the 4.1 machine would crash completely after these tcp checksum errors, but we found a place where running out of mbufs was not being handled properly, and we now have it down to where it only drops the connection.

But why the tcp checksum errors? (Not ip, mind you, but tcp. Packets routed out the gateway always work fine.) Trailers are turned off on our 4.2 machine, so it can't be that. We thought optional fields in ip headers might not be properly handled by the 4.1 code, but careful reading seems to indicate that they are. (At least, they are properly ignored and stripped before checksumming, as near as we can tell.) These checksum errors are LARGE: many bits wrong at once (unlike the single-bit ethernet errors we very, very occasionally get).

Looks like it has to be a software bug of some kind, but we are stumped so far. Suggestions???

--
Spoken: Mark Weiser
ARPA: mark@maryland
CSNet: mark@umcp-cs
UUCP: {seismo,allegra}!umcp-cs!mark
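
For reference, here is a rough sketch of what the receiver has to get right for the tcp checksum (as opposed to the ip one) to come out clean. This is illustration only, written in Python, and is not the BBN or 4.2 code; the function names are made up for the example. It assumes the standard definition: a one's-complement sum of 16-bit words over a pseudo-header (source address, destination address, protocol 6, tcp length) followed by the tcp header and data. Note that the tcp length is not carried in the tcp header; it has to be derived from the ip total length minus the ip header length (options included), so a wrong length or a wrong starting offset would give errors of many bits, not one.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement of the one's-complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_segment_from_ip(packet: bytes) -> bytes:
    """Strip the ip header (including options) and any link-level padding."""
    header_len = (packet[0] & 0x0F) * 4            # IHL field, in 32-bit words
    total_len = struct.unpack("!H", packet[2:4])[0]  # ip total length field
    return packet[header_len:total_len]


def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """Checksum to store in the segment; its checksum field must be zero."""
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return internet_checksum(pseudo + segment)


def tcp_checksum_ok(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bool:
    """Receiver check: with the checksum field in place the result is zero."""
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return internet_checksum(pseudo + segment) == 0
```

The ip header checksum covers only the ip header, which would fit with packets being forwarded through the gateway without complaint while the tcp check at the final destination fails.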