[net.unix-wizards] 4.1/4.2 ip/tcp incompatibility?

mark@umcp-cs.UUCP (05/22/84)

We have a 4.2 BSD Unix machine and a 4.1 BSD Unix machine on the same
Ethernet.  The 4.1 machine runs BBN IP/TCP with Chris Kent's gatewaying
code, and also connects to the ARPANET through an LHDH.

The problem is that when we transmit a large amount of data from
the 4.2 machine to the 4.1 machine, the 4.1 machine prints out
lots and lots of "tcp checksum" errors.  This only occurs when
a lot of data is sent at once, such as when ftp'ing a multi-megabyte
file (or sometimes just when doing an 'ls' of a large directory over
telnet!).  There are enough errors that tcp usually drops the connection.
Transfers the other way (from 4.1 to 4.2) work fine.

The 4.1 machine used to crash completely after these tcp checksum
errors, but we found a place where running out of mbufs was not being
handled properly, and fixing that got it down to only dropping
the connection.  But why the tcp checksum errors?  (Not ip, mind you,
but tcp.  Packets routed out the gateway always work fine.)

Trailers are turned off on our 4.2 machine, so it can't be that.
We thought the 4.1 code might be mishandling options in ip headers,
but careful reading seems to indicate that it handles them properly.
(At least, they are properly being ignored and stripped before
checksumming, as near as we can tell.)
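
For reference, here is a minimal sketch in C of the receive-side check
as we understand it.  It is not the BBN or the 4.2 code; the names
tcp_cksum_ok, sum16 and fold are made up for illustration.  The start
of the TCP header is found from the IP header length field, so any IP
options are skipped, and the TCP length comes from the IP total-length
field rather than from the size of the received frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add len bytes to acc as big-endian 16-bit words (hypothetical helper).
 * An odd trailing byte is padded on the right with zero. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t acc)
{
    while (len > 1) {
        acc += (uint32_t)p[0] << 8 | p[1];
        p += 2;
        len -= 2;
    }
    if (len)
        acc += (uint32_t)p[0] << 8;
    return acc;
}

/* Fold the carries back in (ones-complement addition). */
static uint16_t fold(uint32_t acc)
{
    while (acc >> 16)
        acc = (acc & 0xffff) + (acc >> 16);
    return (uint16_t)acc;
}

/* pkt points at the IP header, len is the number of bytes received.
 * Returns 1 if the TCP checksum verifies, 0 otherwise. */
int tcp_cksum_ok(const uint8_t *pkt, size_t len)
{
    size_t ihl, total, tcplen;
    uint32_t acc;

    assert(pkt != NULL);
    if (len < 20)
        return 0;
    ihl = (size_t)(pkt[0] & 0x0f) * 4;        /* IP header incl. options */
    total = (size_t)pkt[2] << 8 | pkt[3];     /* IP total length */
    if (ihl < 20 || total < ihl || total > len)
        return 0;
    tcplen = total - ihl;                     /* TCP header + data */

    /* Pseudo-header: source, destination, protocol (6), TCP length. */
    acc = sum16(pkt + 12, 8, 0);
    acc += 6;
    acc += (uint32_t)tcplen;

    /* The segment itself, checksum field included. */
    acc = sum16(pkt + ihl, tcplen, acc);
    return fold(acc) == 0xffff;
}
```

A receiver that gets the starting offset or the TCP length wrong would
compute a sum that is off by many bits, not one.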

These checksum errors are LARGE: many bits wrong at once (unlike the
single-bit Ethernet errors we very occasionally get).  It looks like it
has to be a software bug of some kind, but we are stumped so far.
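
One way to tell the two cases apart, as a sketch only: in
ones-complement arithmetic a single flipped bit anywhere in the segment
moves the verification sum by plus or minus a power of two.  The
function below assumes "residual" is the folded 16-bit sum over the
pseudo-header and the segment as received (0xffff when the checksum is
good); the names single_bit_residual and is_pow2 are hypothetical.

```c
#include <stdint.h>

/* True if exactly one bit of x is set. */
static int is_pow2(uint16_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Returns 1 if residual is consistent with exactly one flipped bit,
 * 0 if the checksum is good or the damage is larger than that. */
int single_bit_residual(uint16_t residual)
{
    uint16_t up = residual;               /* 0 -> 1 flip: residual == 2^k */
    uint16_t down = (uint16_t)~residual;  /* 1 -> 0 flip: residual == 0xffff - 2^k */

    return is_pow2(up) || is_pow2(down);
}
```

Anything this rejects (other than a good 0xffff) means more than one
bit was disturbed, which is what one would expect from a software
problem rather than line noise.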

Suggestions???

-- 
Spoken: Mark Weiser 	ARPA:	mark@maryland
CSNet:	mark@umcp-cs 	UUCP:	{seismo,allegra}!umcp-cs!mark