[comp.protocols.tcp-ip] A TCP Bug?

subbu@TOVE.UMD.EDU (MCV Subramaniam) (01/18/89)

Hi,

I am working with TCP/IP for HP UNIX systems. We seem to have hit a
case in which TCP goes into an infinite loop. The case is described below.

Consider TCP_A and TCP_B communicating. TCP_B has dropped some data sent
from TCP_A. Suppose TCP_A's state variables <snd_una, snd_nxt, snd_max>
are <100, 300, 300>. When the retransmit timer pops, the segment <100,200>
is retransmitted (assuming, for simplicity, that the segment size is 100).
In the meantime, TCP_B sends some data over to TCP_A. This data arrives
after the retransmission is done. At this stage, TCP_A's state variables
are <100, 200, 300>. When TCP_A sends an ACK for the data, the sequence
number in the packet is going to be 200 (at least, 4.3BSD does this).

TCP_B, in the meantime, has received the retransmitted data and has
updated its rcv_nxt to 300 (it had a hole in the reassembly queue). Now,
when the ACK from TCP_A comes in with sequence number 200, it is
dropped, because 200 lies below TCP_B's rcv_nxt of 300!
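
To make the drop concrete, here is a minimal sketch of the segment
acceptability test in the style of RFC 793. It is not the 4.3BSD
tcp_input() code; the function name seg_acceptable() and the 1000-byte
receive window used in the note below are assumptions made for
illustration.

```c
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Sequence-space comparisons, modulo 2^32. */
#define SEQ_LT(a, b)  ((int32_t)((a) - (b)) < 0)
#define SEQ_GEQ(a, b) ((int32_t)((a) - (b)) >= 0)

/* Nonzero if seq lies in the receive window [rcv_nxt, rcv_nxt + rcv_wnd). */
static int in_window(tcp_seq seq, tcp_seq rcv_nxt, uint32_t rcv_wnd)
{
    return SEQ_GEQ(seq, rcv_nxt) && SEQ_LT(seq, rcv_nxt + rcv_wnd);
}

/*
 * RFC 793 style acceptability test (hypothetical helper, not BSD code).
 * Returns 1 if the segment may be processed, 0 if it must be dropped
 * (in which case the receiver answers with an ACK of its own).
 */
static int seg_acceptable(tcp_seq rcv_nxt, uint32_t rcv_wnd,
                          tcp_seq seg_seq, uint32_t seg_len)
{
    if (seg_len == 0) {
        /* Pure ACK: only its sequence number is tested. */
        if (rcv_wnd == 0)
            return seg_seq == rcv_nxt;
        return in_window(seg_seq, rcv_nxt, rcv_wnd);
    }
    if (rcv_wnd == 0)
        return 0;               /* no room for any data */
    /* Data segment: acceptable if its first or last byte is in window. */
    return in_window(seg_seq, rcv_nxt, rcv_wnd) ||
           in_window(seg_seq + seg_len - 1, rcv_nxt, rcv_wnd);
}
```

With TCP_B at rcv_nxt = 300 and the assumed window of 1000,
seg_acceptable(300, 1000, 200, 0) returns 0, which is the dropped ACK
described above, while a pure ACK carrying sequence number 300 would
return 1.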

Now consider the scenario when both TCP_A and TCP_B have dropped each
other's data, and have retransmitted at the same time. Both their ACKs are
going to be dropped, and counter ACKs are going to be generated for
both of them (in tcp_input(), control is transferred to dropafterack!).

Thus there will only be these ACKs going back and forth, and no useful
work is done.
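
The ping-pong can be illustrated with a toy model. This is only a
sketch under simplifying assumptions: each side's pure ACK carries its
pulled-back snd_nxt, a segment below the receiver's rcv_nxt is dropped
before its ACK field is looked at, and the drop triggers a reply ACK.
The struct and function names are hypothetical, and retransmit timers
are ignored.

```c
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Sequence-space "less than", modulo 2^32. */
#define SEQ_LT(a, b) ((int32_t)((a) - (b)) < 0)

/* Only the two variables that matter for the loop (toy model). */
struct endpoint {
    tcp_seq snd_nxt;   /* pulled back by a retransmission */
    tcp_seq rcv_nxt;   /* next sequence number expected from the peer */
};

/*
 * Exchange pure ACKs, starting with one from a to b. Returns the number
 * of ACKs sent up to and including the first one that is accepted, or
 * -1 if none is accepted within max_acks.
 */
static int ack_ping_pong(const struct endpoint *a, const struct endpoint *b,
                         int max_acks)
{
    const struct endpoint *from = a, *to = b;

    for (int sent = 1; sent <= max_acks; sent++) {
        /* The pure ACK carries the sender's (stale) snd_nxt. */
        if (!SEQ_LT(from->snd_nxt, to->rcv_nxt))
            return sent;        /* accepted: ACK field gets processed */

        /* Dropped: nothing is updated, and the receiver replies
         * with a pure ACK of its own, so the roles swap. */
        const struct endpoint *tmp = from;
        from = to;
        to = tmp;
    }
    return -1;                  /* still bouncing after max_acks */
}
```

For example, with TCP_A at snd_nxt = 200 while TCP_B expects 300, and
(as an assumed mirror case) TCP_B at snd_nxt = 1200 while TCP_A expects
1300, neither ACK is ever accepted and the function returns -1 for any
cap, because no state changes between rounds.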

This is a rare possibility, but it occurred on one of our machines
(with the two TCPs being on the same machine) and caused our system
to hang. We have since found out that we got into this situation
because of another bug (we were dropping packets unnecessarily), but
this *can* still happen in reality.

A possible workaround is for the retransmit code to save the value of
snd_nxt before calling tcp_output(), and then restore it on return.
But this has performance problems when communicating with a TCP that
does not have reassembly queues (we have tried it out).
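
A rough sketch of this workaround follows. It uses a toy control block
and a stand-in for tcp_output() rather than the real 4.3BSD structures;
toy_tcpcb, toy_output() and toy_rexmt_timeout() are hypothetical names,
and the stand-in sends exactly one segment per call.

```c
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Sequence-space "greater than", modulo 2^32. */
#define SEQ_GT(a, b) ((int32_t)((a) - (b)) > 0)

/* Toy control block holding only the three send variables. */
struct toy_tcpcb {
    tcp_seq snd_una;   /* oldest unacknowledged byte */
    tcp_seq snd_nxt;   /* next byte to send */
    tcp_seq snd_max;   /* highest sequence number sent so far */
};

/* Stand-in for tcp_output(): sends one segment starting at snd_nxt. */
static void toy_output(struct toy_tcpcb *tp, uint32_t seg_size)
{
    tp->snd_nxt += seg_size;
    if (SEQ_GT(tp->snd_nxt, tp->snd_max))
        tp->snd_max = tp->snd_nxt;
}

/*
 * Retransmit timeout. With restore_snd_nxt == 0 this mimics the
 * behaviour described in the post (snd_nxt stays pulled back); with
 * restore_snd_nxt != 0 it applies the save/restore workaround.
 */
static void toy_rexmt_timeout(struct toy_tcpcb *tp, uint32_t seg_size,
                              int restore_snd_nxt)
{
    tcp_seq saved_nxt = tp->snd_nxt;    /* save before retransmitting */

    tp->snd_nxt = tp->snd_una;          /* go back to the oldest unacked byte */
    toy_output(tp, seg_size);           /* retransmit one segment */

    if (restore_snd_nxt)
        tp->snd_nxt = saved_nxt;        /* later pure ACKs use the old value */
}
```

Starting from <100, 300, 300> with a segment size of 100, the plain
timeout leaves <100, 200, 300>, while the workaround leaves
<100, 300, 300>, so a later pure ACK carries sequence number 300.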

A better solution may be to put a better (?) sequence number value
in a pure ACK packet.
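
One candidate for such a value is snd_max. That choice is an assumption
made here for illustration, not something worked out above. A minimal
sketch, using a toy control block and a hypothetical helper
seq_for_segment():

```c
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Toy control block holding only the three send variables. */
struct toy_tcpcb {
    tcp_seq snd_una;   /* oldest unacknowledged byte */
    tcp_seq snd_nxt;   /* next byte to (re)send */
    tcp_seq snd_max;   /* highest sequence number sent so far */
};

/*
 * Pick the sequence number for an outgoing segment (hypothetical helper).
 * Segments that occupy sequence space (data, SYN or FIN) must use
 * snd_nxt. A pure ACK occupies none, so it can carry snd_max instead,
 * which is never below what the peer has already received.
 */
static tcp_seq seq_for_segment(const struct toy_tcpcb *tp,
                               uint32_t len, int syn_or_fin)
{
    if (len > 0 || syn_or_fin)
        return tp->snd_nxt;
    return tp->snd_max;
}
```

With TCP_A at <100, 200, 300>, a pure ACK would then carry 300 rather
than 200, while the retransmitted data segment still starts at snd_nxt.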

Do other flavors of TCP have this problem? Or is it only on 4.3?
I would be interested in getting comments in this regard.

Thanks.

-Subbu (subbu@hpindlm.hp.com)
(408)447-2693.