[comp.sys.next] TCP layer fails to ACK occasionally

abe@mace.cc.purdue.edu (Vic Abell) (04/01/89)

The TCP layer of the NeXT OS occasionally fails to ACK data packets.  This
is most visible when using a pager via a remote login.  The pager display
will stall in the middle of the page, sometimes for as long as several
seconds.  The remote peer has at various times been a 4.3BSD-Tahoe host, an
ULTRIX 2.2 host, and a DYNIX 3.0.14 host.  The problem may persist for 30
minutes and then vanish for hours, on no schedule that seems to be related
to host or network load.

We have used an Ethernet packet sniffer to trace the problem to a missing
NeXT ACK.  The problem is exacerbated by the use of the pager in the remote
peer on files that can be transmitted in more than one, but less than two,
minimum size (512 byte) segments.  The peer sends a cursor movement packet
of 7 bytes, the first 512 byte segment, and then waits for an ACK, because
it does not have enough data for another full segment and because it has
pending, unACKed data.
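
The sender behavior described above resembles the small-packet avoidance rule
of RFC 896 (Nagle's algorithm).  The following is only a sketch of that rule,
not code taken from any of the peers; the function names and the 700 byte file
size are made up for illustration.

```python
MSS = 512  # minimum segment size mentioned in the post, in bytes


def can_send_now(queued_bytes, unacked_bytes, mss=MSS):
    """Nagle-style rule: send if a full segment is ready, or if nothing
    previously sent is still waiting for an ACK."""
    return queued_bytes >= mss or unacked_bytes == 0


def segments_sent_before_stall(writes, mss=MSS):
    """Apply the rule to a series of application writes, assuming no ACK
    ever arrives.  Returns (sizes of segments sent, bytes left queued)."""
    sent = []
    queued = 0
    unacked = 0
    for n in writes:
        queued += n
        while queued > 0 and can_send_now(queued, unacked, mss):
            size = min(queued, mss)
            sent.append(size)
            queued -= size
            unacked += size
    return sent, queued


# Hypothetical case: a 7 byte cursor movement, then a 700 byte file.
print(segments_sent_before_stall([7, 700]))
```

Under these assumptions the sender emits the 7 byte packet and one 512 byte
segment, then holds the remaining 188 bytes until an ACK arrives.  If that
ACK is never sent, the display stalls in mid-page.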

It is entirely possible that the NeXT TCP layer is using delayed ACKs, which
can be piggybacked with segments returning from the NeXT to the peer, or, if
there are none, which should be sent no later than 200 milliseconds after the
data segment has been received from the peer.  We suspect that the timer
handling is sometimes incorrect, perhaps because of precision, modulus, or
range errors.
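
As an illustration of the kind of range error we have in mind, here is a
purely hypothetical sketch.  We have not seen the NeXT sources; the 16 bit
tick counter, the 1 millisecond tick, and all names below are assumptions
chosen only to show how a delayed-ACK deadline could be missed.

```python
COUNTER_BITS = 16                  # assumed width of the tick counter
MASK = (1 << COUNTER_BITS) - 1
DELAY_TICKS = 200                  # 200 ms delayed-ACK limit, 1 ms ticks assumed


def tick(counter, elapsed):
    """Advance a counter that wraps at 2**COUNTER_BITS."""
    return (counter + elapsed) & MASK


def expired_naive(now, armed_at):
    """Buggy check: the deadline is computed without wrapping, so a timer
    armed near the top of the counter range never appears to expire."""
    return now >= armed_at + DELAY_TICKS


def expired_wrap_safe(now, armed_at):
    """Compare elapsed ticks modulo the counter width instead.
    Only valid while the true elapsed time is under 2**COUNTER_BITS ticks."""
    return ((now - armed_at) & MASK) >= DELAY_TICKS


armed_at = 65500                   # data segment arrives just before wraparound
now = tick(armed_at, DELAY_TICKS)  # 200 ticks later the counter reads 164
print(expired_naive(now, armed_at), expired_wrap_safe(now, armed_at))
```

A fault of this shape would appear only when a segment arrives close to the
counter's wraparound point, which would be consistent with an intermittent
failure that is unrelated to host or network load.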

The absence of source code cost us excessive and unnecessary time in
confirming that this problem was in the NeXT software and not in our
Ethernet hardware.