[comp.protocols.tcp-ip] Help with broken TCP

cpw%sneezy@LANL.GOV (C. Philip Wood) (09/01/87)

What follows is a TCP scenario observed on our local ethernet.  The ftp
application (A) has just pushed out the last data block to (B) and is
closing the connection.  The first FIN from (A) is lost.  An ACK from (B)
for the last data block arrives. Then a conversation begins of FIN(A)
followed by ACK(B), with longer and longer intervals between each
FIN(A)/ACK(B) pair, as if the ACK were being dropped and a retransmit
timer were backing off, firing, and sending another FIN.  Notice that the
sequence number in the retransmitted FIN packets is one less than that in
the lost FIN packet.  Are both peers broken, or just the sender of the FINs?
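
The back-off can be read off the FIN timestamps in the trace that follows.
As a rough check, the sketch below takes the five FIN send times (the lost
FIN plus the four retransmissions shown), written as centiseconds past
15:31:00, and computes the gaps between them.  The array and function names
are made up for this illustration.

```c
#include <stddef.h>

/* FIN send times from the trace, in centiseconds past 15:31:00:
 * 37.12 (lost), 39.12, 41.12, 45.12, 53.12. */
static const long fin_times_cs[] = { 3712, 3912, 4112, 4512, 5312 };

/* Write the n-1 gaps between n consecutive timestamps into out[]. */
void retransmit_gaps(const long *t_cs, size_t n, long *out)
{
    for (size_t i = 1; i < n; i++)
        out[i - 1] = t_cs[i] - t_cs[i - 1];
}

/* Return 1 if no gap is shorter than the one before it, else 0. */
int gaps_non_decreasing(const long *gaps, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (gaps[i] < gaps[i - 1])
            return 0;
    return 1;
}
```

For this trace the gaps come out as 2, 2, 4 and 8 seconds, which fits a
retransmit timer that is backing off.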

This is a tcpdump.  I added an * to indicate the lost packet.  (In case
you're curious, the packet is lost because B cannot handle back-to-back
packets.)

15:31:36.86  A > B: S -1063845887:-1063845887(0) win 4096 <mss 1024>
15:31:36.86  B > A: S 253698321:253698321(0) ack -1063845886 win 384 <mss 1024>
15:31:36.86  A > B: . ack 1 win 4096
15:31:37.12  A > B: P 1:146(145) ack 1 win 4096
15:31:37.12  A >* B: F 146:146(0) ack 1 win 4096
15:31:37.38  B > A: . ack 146 win 384
15:31:39.12  A > B: F 145:145(0) ack 1 win 4096
15:31:39.14  B > A: . ack 146 win 384
15:31:41.12  A > B: F 145:145(0) ack 1 win 4096 urg 1
15:31:41.14  B > A: . ack 146 win 384
15:31:45.12  A > B: F 145:145(0) ack 1 win 4096 urg 1
15:31:45.14  B > A: . ack 146 win 384
15:31:53.12  A > B: F 145:145(0) ack 1 win 4096 urg 1
 
and so on...
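
One way to see why B keeps answering with "ack 146": under the segment
acceptability test in RFC 793, a FIN counts as one unit of sequence space,
so a FIN at 145 with no data ends before B's next expected number (146).
Such a segment is not acceptable, and the RFC has the receiver reply with an
ACK.  The sketch below is an illustrative rendering of that test, not code
from either host; the function names are hypothetical, and the numbers are
the relative values tcpdump prints.

```c
#include <stdint.h>

/* True if rcv_nxt <= s < rcv_nxt + wnd, computed modulo 2^32. */
static int seq_in_window(uint32_t s, uint32_t rcv_nxt, uint32_t wnd)
{
    return (uint32_t)(s - rcv_nxt) < wnd;
}

/* RFC 793 acceptability test.  SYN and FIN each occupy one sequence number. */
int segment_acceptable(uint32_t seg_seq, uint32_t data_len, int syn, int fin,
                       uint32_t rcv_nxt, uint32_t rcv_wnd)
{
    uint32_t seg_len = data_len + (syn ? 1u : 0u) + (fin ? 1u : 0u);

    if (seg_len == 0) {
        if (rcv_wnd == 0)
            return seg_seq == rcv_nxt;
        return seq_in_window(seg_seq, rcv_nxt, rcv_wnd);
    }
    if (rcv_wnd == 0)
        return 0;               /* no room for any sequence space */

    /* Acceptable if the first or the last occupied number is in the window. */
    return seq_in_window(seg_seq, rcv_nxt, rcv_wnd) ||
           seq_in_window(seg_seq + seg_len - 1, rcv_nxt, rcv_wnd);
}
```

With B expecting 146 and advertising win 384,
segment_acceptable(145, 0, 0, 1, 146, 384) is 0, while the same FIN sent
at 146 would be accepted.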

Phil Wood  (cpw@lanl.gov)

cpw%sneezy@LANL.GOV (C. Philip Wood) (09/01/87)

4.3BSD network bug (#9, tcp_output) had a fix for an undetected data
loss during connection closing.  This may well have fixed the data loss
due to lost data segments, but apparently it causes the symptom
I reported if the data segment with FIN is lost.  If the test:

	if (flags & TH_FIN && tp->t_flags & TF_SENTFIN && len == 0)
	
succeeds, the #9 code decrements tp->snd_nxt by one.  Instead, I set
tp->snd_nxt = tp->snd_una, and the symptom went away.  I'm not saying
this is a fix, but it may point more closely to the problem.
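
To make the difference concrete, here is a small stand-alone model of the
two behaviours.  It is not the 4.3BSD source: struct snd_vars and both
functions are hypothetical stand-ins, and it assumes that by the time the
test runs the retransmit timer has already pulled snd_nxt back to snd_una
(146 in the trace's relative numbering), which is consistent with the
retransmitted FINs showing up at 145.

```c
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for a few sender fields of the tcpcb. */
struct snd_vars {
    uint32_t snd_una;   /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;   /* next sequence number to send */
    int      sent_fin;  /* stands in for TF_SENTFIN */
};

/* Behaviour described for the #9 code: step back one when resending a
 * FIN that carries no data.  Returns the sequence number the FIN gets. */
uint32_t fin_seq_bug9(struct snd_vars s, int fin, uint32_t len)
{
    if (fin && s.sent_fin && len == 0)
        s.snd_nxt--;
    return s.snd_nxt;
}

/* Workaround from the post: restart from snd_una instead. */
uint32_t fin_seq_workaround(struct snd_vars s, int fin, uint32_t len)
{
    if (fin && s.sent_fin && len == 0)
        s.snd_nxt = s.snd_una;
    return s.snd_nxt;
}
```

Starting from snd_una = snd_nxt = 146 with the FIN already sent once, the
first function yields 145 (as in the trace) and the second yields 146.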

Phil Wood  (cpw@lanl.gov)

karels%okeeffe@UCBVAX.BERKELEY.EDU.UUCP (09/02/87)

Right you are!  I knew that we had such a problem at one time, but didn't
think it had made it out of Berkeley.  Your fix is fine; our current version
does:
	if (flags & TH_FIN && tp->t_flags & TF_SENTFIN &&
	    tp->snd_nxt == tp->snd_max)
		tp->snd_nxt--;

which should be equivalent.
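
As a sanity check on that claim, the same kind of toy model can be applied
to the snd_max test.  Again this is a hypothetical sketch, not the Berkeley
code; struct snd_vars_max and the function name are invented here, and the
two starting states are assumptions about what the sender looks like after
a timer-driven rewind and without one.

```c
#include <stdint.h>

/* Hypothetical stand-in for the sender fields used by the test above. */
struct snd_vars_max {
    uint32_t snd_una;   /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;   /* next sequence number to send */
    uint32_t snd_max;   /* highest sequence number sent so far, plus one */
    int      sent_fin;  /* stands in for TF_SENTFIN */
};

/* Back up one only when snd_nxt already sits past the FIN, i.e. when it
 * equals snd_max.  Returns the sequence number the resent FIN gets. */
uint32_t fin_seq_sndmax(struct snd_vars_max s, int fin)
{
    if (fin && s.sent_fin && s.snd_nxt == s.snd_max)
        s.snd_nxt--;
    return s.snd_nxt;
}
```

In the trace's relative numbering the original FIN was at 146, so snd_max
is 147.  After a rewind (snd_nxt = 146) the test does not fire and the FIN
goes out at 146; with no rewind (snd_nxt = 147) it fires and the FIN again
goes out at 146.  In both of these cases the result matches setting snd_nxt
to snd_una once the data has been acknowledged.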

		Mike