cpw%sneezy@LANL.GOV (C. Philip Wood) (09/01/87)
What follows is a TCP scenario observed on our local ethernet. The ftp application (A) has just pushed out the last data block to (B) and is closing the connection. The first FIN from (A) is lost. An ACK from (B) for the last data block arrives. Then there begins a conversation of FIN(A) then ACK(B), with longer and longer time intervals between each FIN(A)/ACK(B) pair, as if the ACK were being dropped and a retransmit timer were backing off, firing, and sending another FIN.

Notice that the sequence number set in the retransmitted FIN packets is one less than that set in the lost FIN packet. Are both peers broken or just the sender of the FINs?

This is a tcpdump. I added an * to indicate the lost packet. (In case you're curious, the packet is lost because B is unable to handle back-to-back packets.)

    15:31:36.86 A > B: S -1063845887:-1063845887(0) win 4096 <mss 1024>
    15:31:36.86 B > A: S 253698321:253698321(0) ack -1063845886 win 384 <mss 1024>
    15:31:36.86 A > B: . ack 1 win 4096
    15:31:37.12 A > B: P 1:146(145) ack 1 win 4096
    15:31:37.12 A >* B: F 146:146(0) ack 1 win 4096
    15:31:37.38 B > A: . ack 146 win 384
    15:31:39.12 A > B: F 145:145(0) ack 1 win 4096
    15:31:39.14 B > A: . ack 146 win 384
    15:31:41.12 A > B: F 145:145(0) ack 1 win 4096 urg 1
    15:31:41.14 B > A: . ack 146 win 384
    15:31:45.12 A > B: F 145:145(0) ack 1 win 4096 urg 1
    15:31:45.14 B > A: . ack 146 win 384
    15:31:53.12 A > B: F 145:145(0) ack 1 win 4096 urg 1

and so on...

Phil Wood (cpw@lanl.gov)
cpw%sneezy@LANL.GOV (C. Philip Wood) (09/01/87)
4.3BSD network bug (#9, tcp_output) had a fix for an undetected data loss during connection closing. This may well have fixed the data loss due to lost data segments, but apparently it will cause the symptom I reported if the segment carrying the FIN is lost. If the test:

    if (flags & TH_FIN && tp->t_flags & TF_SENTFIN && len == 0)

succeeds, the #9 code decrements tp->snd_nxt by one. Instead, I set tp->snd_nxt = tp->snd_una, and the symptom went away. I'm not saying this is a fix, but it may point more closely to the problem.

Phil Wood (cpw@lanl.gov)
karels%okeeffe@UCBVAX.BERKELEY.EDU.UUCP (09/02/87)
Right you are! I knew that we had such a problem at one time, but didn't think it had made it out of Berkeley. Your fix is fine; our current version does:

    if (flags & TH_FIN && tp->t_flags & TF_SENTFIN &&
        tp->snd_nxt == tp->snd_max)
            tp->snd_nxt--;

which should be equivalent.

Mike
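
To make the sequence-number arithmetic in this thread concrete, here is a minimal sketch in C of the three variants of the test applied to the state seen in the trace. It is a toy model, not the 4.3BSD source: struct tcb_model, after_rexmt_timeout(), the fin_seq_* functions and check_trace() are hypothetical names, the flag values are illustrative, and it assumes that the retransmit timer rewinds snd_nxt to snd_una before tcp_output runs.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag values; only their distinctness matters here. */
#define TH_FIN     0x01
#define TF_SENTFIN 0x10

/* Hypothetical, stripped-down stand-in for the sender's control block. */
struct tcb_model {
    uint32_t snd_una;   /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;   /* next sequence number to send */
    uint32_t snd_max;   /* highest sequence number sent, plus one */
    int      t_flags;
};

/* State of A after B's "ack 146" and a retransmit timeout (assumed model). */
struct tcb_model after_rexmt_timeout(void)
{
    struct tcb_model tp;
    tp.snd_una = 146;          /* data bytes 1..145 are acknowledged */
    tp.snd_max = 147;          /* the lost FIN occupied sequence 146 */
    tp.snd_nxt = tp.snd_una;   /* assumption: timer rewinds to snd_una */
    tp.t_flags = TF_SENTFIN;
    return tp;
}

/* Test as quoted from bug fix #9: decrement whenever len == 0. */
uint32_t fin_seq_bug9(struct tcb_model tp, int flags, int len)
{
    if ((flags & TH_FIN) && (tp.t_flags & TF_SENTFIN) && len == 0)
        tp.snd_nxt--;
    return tp.snd_nxt;
}

/* Phil Wood's change: same test, but reset snd_nxt to snd_una. */
uint32_t fin_seq_wood(struct tcb_model tp, int flags, int len)
{
    if ((flags & TH_FIN) && (tp.t_flags & TF_SENTFIN) && len == 0)
        tp.snd_nxt = tp.snd_una;
    return tp.snd_nxt;
}

/* Mike Karels' version: decrement only if snd_nxt is already past the FIN. */
uint32_t fin_seq_karels(struct tcb_model tp, int flags)
{
    if ((flags & TH_FIN) && (tp.t_flags & TF_SENTFIN) &&
        tp.snd_nxt == tp.snd_max)
        tp.snd_nxt--;
    return tp.snd_nxt;
}

/* Compare the variants on the retransmission case from the trace. */
void check_trace(void)
{
    struct tcb_model tp = after_rexmt_timeout();
    assert(fin_seq_bug9(tp, TH_FIN, 0) == 145);  /* the bad "F 145:145(0)" */
    assert(fin_seq_wood(tp, TH_FIN, 0) == 146);  /* FIN back at 146 */
    assert(fin_seq_karels(tp, TH_FIN) == 146);   /* FIN back at 146 */
}
```

Under these assumptions the #9 test yields 145, matching the retransmitted "F 145:145(0)" segments, while both replacements yield 146, the sequence number of the lost FIN. When snd_nxt is still equal to snd_max (147, no timeout yet), all three variants in this model give 146, which is consistent with Mike's remark that the two fixes should be equivalent.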