tcp-ip@ucbvax.ARPA (08/08/85)
From: Ken Harrenstien <KLH@SRI-NIC.ARPA>

Here is another TOPS-20 TCP/IP bug fix, thanks to Charlie Lynn.  We have been running the fixes at SRI-NIC for a couple days and I haven't seen any instances yet of the lossage that previously plagued us - sometimes duplicate data would be received on a TCP connection.  This is annoying for TELNET, and highly dangerous for FTP!

I believe this bug also explains the phenomenon of duplicate text being received on a telnet connection to 4.2BSD systems, which some time ago was mentioned and was blamed (wrongly, it appears) on a defective 4.2BSD server telnet implementation.  Proof of TOPS-20 lossage came from testing connections to MIT-MC while using the ITS packet logging features.

--KLH
---------------
Return-Path: <CLYNN@BBNA.ARPA>
Received: from BBNA.ARPA by SRI-NIC.ARPA with TCP; Thu 1 Aug 85 10:46:33-PDT
Date: 1 Aug 1985 13:49-EDT
Sender: CLYNN@BBNA.ARPA
Subject: Re: [Ken Harrenstien <KLH@SRI-NIC.ARPA>: It's definite - TOP...
From: CLYNN@BBNA.ARPA
To: KLH@SRI-NIC.ARPA
Cc: clynn@BBNA.ARPA
Message-ID: <[BBNA.ARPA] 1-Aug-85 13:49:21.CLYNN>
In-Reply-To: The message of Sun 28 Jul 85 05:05:02-PDT from Ken Harrenstien <KLH@SRI-NIC.ARPA>

Ken,

Enough of the pieces fit for me to conclude that the duplication problem you documented is an instance of an old bug (it was fixed by the time BBN gave DEC updated files in October '84, but they haven't been distributed).

The scenario of what is happening is that when a packet (e.g., ID 77475 in your example) is emptied and a buffer is simultaneously filled (e.g., PUSH was set) and another packet is available (77476) but there are no available buffers, then the "partial packet" flag (TRPP) is being set with a "processed byte count" (TRCBY) of zero.
Then, when no buffer has yet been made available (e.g., slow net/output to a terminal in TN) by the time a retransmission arrives which includes "old" data before the sequence number in the partial packet (e.g., 77500), PRCPKT replaces the original packet with the larger retransmitted packet.  Unfortunately, it fails to modify TRCBY, so that when a buffer finally arrives, data is removed beginning at the wrong offset.  (The comment 'N.B. It works to replace a "partial packet" with a bigger one' is a lie.)

The solution is to forget about TRCBY.  Patches to fix the problem are:

                        REASM3: LOAD RCVLFT,TRLFT,(TCB)
                                JE TRPP,(TCB),REASM4    ; Jump if not continuing a p
(jfcl)                          LOAD BYTNUM,TRCBY,(TCB) ; Where to resume in this pa
jrst reas11                     JRST REAS13             ; Go process the remainder
-----------
                        ; Setup BYTNUM to be the byte number within the packet where
                        ; handling should start.

                        reas11: LOAD RCVLFT,TRLFT,(TCB) ; Get updated copy
                                MOVE BYTNUM,RCVLFT      ; Next to be reassembled
                                LOAD T1,PSEQ,(TPKT)     ; Start of packet
                                SUB BYTNUM,T1           ; Offset into data
                                JUMPLE BYTNUM,REAS12    ; No control to worry about
                                LOAD T1,PSYN,(TPKT)     ; Get value of SYN bit
                                SUBI BYTNUM,0(1)        ; Discount space taken by SY

                        REAS12:
                        ; Setup XFRCNT to be the number of bytes to transfer out of
                        ; packet into the user buffer.

jrst pat                REAS13: LOAD XFRCNT,PIPL,(PKT)  ; Get total length
                                LOAD T1,PIDO,(PKT)      ; Number of words in Interne

pat:    tlz q1,740000   (=MODSEQ BYTNUM , which should be at reas12, but here is ok)
        LOAD XFRCNT,PIPL,(PKT)                          ; Get total length
        jrst reas13+1
---------------
                        REAS19:
                        ; Save the partial packet for the next time through.

                                SETONE TRPP,(TCB)       ; Set the partial packet wai
                                ADD RCVLFT,XFRCNT
                                MOVE T1,XFRCNT          ; Number transferred
                                STOR RCVLFT,TRLFT,(TCB)
                                ADD T1,BYTNUM           ; Where the transfer started
jfcl                            STOR T1,TRCBY,(TCB)     ; Is where to resume in the
                                JUMPN BYTNUM,REAS20     ; First time we have
                                JE PSYN,(TPKT),REAS20   ; Seen a packet with a SYN i
jfcl                            ADD RCVLFT,XFRCNT       ; Yes.  Update Left
jfcl                            STOR RCVLFT,TRLFT,(TCB)
                                MOVX T1,^D500
-----------

The sources could be cleaned up to eliminate now unused instructions, etc.

While you are making changes, check the SNDTVT routine in TTANDV.MAC as the initial value of PKTPTR was computed incorrectly, it should look something like

SNDTVT::
        ACVAR <XFRCNT,LINBLK,PKTPTR,CNT>
        PUSH P,[-1]                     ; Last octet
        DMOVEM T1,XFRCNT                ; T1,2 to XFRCNT and LINBLK
        MNTM5 AOS CELL(TCVST,0,,TCV)    ; SNDTVT calls
**      LOAD PKTPTR,PIPL,(PKT)          ; Current packet length is next byte to insert
**      ADJBP PKTPTR,[POINT 8,PKTELI(PKT)] ; Byte pointer there
        MOVEI CNT,0                     ; Init number moved to packet

where the ** instructions were (incorrectly)

        LOAD PKTPTR,PTDO,(TPKT)         ; Get TCP data offset
        HRLI PKTPTR,(<POINT 8,.-.(TPKT)>) ; Pointer to data area
----------------------------------------
also, after

TVTDTT: PUSH P,P2                       ; BFR
        PUSH P,FR
**      SETZB FR,T3                     ; No flags, nor JCN
        XMOVEI T1,TCBLCK(TCB)           ; Lock to lock
        XMOVEI T2,CLOSE1                ; Function to call
        CALL LCKCAL                     ; Do a cross-job close
        POP P,FR
        POP P,P2                        ; BFR

where the ** was

        SETZ FR,                        ; No flags

Charlie
-------
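
For readers who do not read MACRO-10, the duplication scenario described in Charlie's message can be sketched as a small simulation. This is only an illustration under stated assumptions, not the TOPS-20 code: the `Packet` class, the function names, and the sequence numbers (100 to 109) are all made up, and SYN accounting and sequence-number wraparound (the MODSEQ step in the patch) are left out. The "stale" variant trusts a saved byte count the way TRCBY was trusted; the "recomputed" variant derives the offset from the receive-left value minus the packet's starting sequence number, which is the idea behind the `reas11` path.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int     # sequence number of the first data byte (role of PSEQ)
    data: bytes  # payload carried by the packet


def resume_offset_stale(saved_offset: int, rcv_left: int, pkt: Packet) -> int:
    """Hypothetical pre-patch behaviour: trust the saved byte count (role of TRCBY)."""
    return saved_offset


def resume_offset_recomputed(saved_offset: int, rcv_left: int, pkt: Packet) -> int:
    """Hypothetical post-patch behaviour: ignore the saved count and
    compute the offset as receive-left minus the packet's start."""
    return max(rcv_left - pkt.seq, 0)


def simulate(resume_offset) -> bytes:
    stream = b"ABCDEFGHIJ"   # bytes at made-up sequence numbers 100..109
    delivered = bytearray()  # what the user program ends up seeing
    rcv_left = 100           # next sequence number expected (role of TRLFT)

    # The first packet is emptied and fills the only available buffer.
    first = Packet(100, stream[0:5])
    delivered += first.data
    rcv_left += len(first.data)

    # The next packet is ready but no buffer exists: it is held as a
    # "partial packet" with a processed byte count of zero.
    held = Packet(105, stream[5:10])
    saved_offset = 0

    # A larger retransmission that also contains old data replaces the
    # held packet; the saved byte count is left untouched.
    retrans = Packet(100, stream[0:10])
    covers_held = (retrans.seq <= held.seq and
                   retrans.seq + len(retrans.data) >= held.seq + len(held.data))
    if covers_held:
        held = retrans

    # A buffer finally arrives; copy out starting at the chosen offset.
    start = resume_offset(saved_offset, rcv_left, held)
    delivered += held.data[start:]
    return bytes(delivered)


if __name__ == "__main__":
    print(simulate(resume_offset_stale))
    print(simulate(resume_offset_recomputed))
```

With the stale offset the first five bytes are handed to the user twice; with the recomputed offset each byte is delivered once.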