dove@MIT-BUGS-BUNNY.ARPA (Web Dove) (02/24/85)
The problem was that the IP layer was breaking up 512-byte TCP packets into uncontrolled 488-byte Chaos packets. For big correspondence, 4 TCP packets (2048 bytes) would be queued at a time, so each time a retransmit was tried they would be broken up into 5 uncontrolled Chaos packets and sent out back to back. Since the packets were uncontrolled, and possibly because the gateway was a lowly LSI-11, 5 packets back to back were too much to accept at once.

Unfortunately, if one of the early packets is lost (say the second), then no TCP packets can be rebuilt from the ones that do arrive. Thus, the connection suddenly drops to 0 throughput. If the second packet is consistently dropped, then eventually the connection times out. It is unfortunate that the system error for such an event is EBADF as opposed to ECONNABORTED (which would have led to a more easily deciphered error message). I don't know where that error was generated.

The solution was to change tcp_{input,output,subr} to enforce a 400-byte maxseg size. This means that the loss of an individual Chaos packet leads to the loss of a single TCP packet, rather than many of them.
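
For anyone who wants to check the arithmetic, here is a rough sketch in Python. It is not the kernel change itself, only a toy model. The function names are made up for illustration. The model treats the queued burst as one contiguous run of bytes carved into 488-byte pieces (which is how the "5 packets" figure above comes out), and the 40-byte header figure assumes minimal IP and TCP headers with no options, carried inside the Chaos packet.

```python
from math import ceil

CHAOS_PAYLOAD = 488   # bytes per uncontrolled Chaos packet (from the post)
HEADER_BYTES = 40     # assumption: 20-byte IP + 20-byte TCP header, no options


def fragment_count(total_bytes, frag_size=CHAOS_PAYLOAD):
    """Number of Chaos packets needed to carry total_bytes."""
    return ceil(total_bytes / frag_size)


def segments_damaged(seg_size, n_segs, lost_frag, frag_size=CHAOS_PAYLOAD):
    """0-based indices of TCP segments that overlap one lost fragment.

    Simplified model: the burst of n_segs segments is treated as a single
    contiguous byte run cut every frag_size bytes, headers ignored.
    """
    total = seg_size * n_segs
    start = lost_frag * frag_size
    if start >= total:
        return []
    end = min(start + frag_size, total)
    # A segment is damaged if its byte range intersects [start, end).
    return [i for i in range(n_segs)
            if i * seg_size < end and (i + 1) * seg_size > start]


def fits_in_one_packet(maxseg, header_bytes=HEADER_BYTES,
                       frag_size=CHAOS_PAYLOAD):
    """True if one segment plus its headers needs no fragmentation."""
    return maxseg + header_bytes <= frag_size


if __name__ == "__main__":
    print(fragment_count(4 * 512))          # Chaos packets for the 2048-byte burst
    print(segments_damaged(512, 4, 1))      # segments hit when the second packet is lost
    print(fits_in_one_packet(512), fits_in_one_packet(400))
```

Under these assumptions a 512-byte segment always spills across a Chaos packet boundary, while a 400-byte segment plus headers fits inside one, so a dropped Chaos packet costs exactly one TCP packet.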