netcoor@NCS.DND.CA (DRENET Coordinator) (02/23/89)
Has anyone seen, or can anyone explain, this problem?

We have users on network 128.43 who have reported trouble retrieving files from several hosts in the Internet. The FTP connection is opened, the user and password are exchanged, and the login completed message is received. After this, problems occur for any command for which a data connection is to be opened. Commands not needing the data connection (e.g. cd, ascii, binary) work as expected, but commands like get and dir fail. The usual message received is:

< 425 Can't build data connection: Connection timed out

although the message:

< 425 Can't build data connection: Network is unreachable.

is also common. After this, cd and other commands not needing data connections still work.

This problem is variable in that sometimes it strikes and sometimes it doesn't. Sometimes it will interrupt an already started data transfer (the above messages don't apply in this case).

Network 128.43 is gatewayed onto the ARPANET through a Butterfly gateway (10.1.0.15). I am on network 192.12.98, which is also gatewayed through the same Butterfly. I have yet to see the problem affect my host (Ultrix). The host affected on net 128.43 is a DEC 2065 running TOPS-20.

What confuses me is that packets are being transferred between the two systems over the command connection throughout, yet any attempt to establish a new connection for data fails.

Can anyone explain this? I would sure like to understand what is going on to create this situation; then I can try to do something about it.

Thanks.

Bob Bradford netcoor@ncs.dnd.ca
DREnet Coordinator (613) 998-2520
mrc@SUMEX-AIM.STANFORD.EDU (Mark Crispin) (02/24/89)
I've seen this problem in other forms. Apparently there are a lot of "ICMP Destination Network Unreachable" messages getting sent in instances where network connectivity is broken for only a brief duration (perhaps due merely to congestion at a gateway). Some versions of BSD Unix nuke the connection when this or "host unreachable" occurs; it is reputed that patching location _inetctlerrmap+8 in the kernel from 0x3341 to 0x0 remedies this problem, but I haven't been able to verify it.

Since the "425 Can't build data connection" message is coming from the remote server, it suggests that the problem is occurring when the remote server tries to open a connection to the FTP client on the 128.43 TOPS-20 host. Because of this, I'm inclined to absolve the TOPS-20 host of guilt (particularly in the "Network is unreachable" case), and more likely to blame network routing.

Question: could an ICMP Destination Network Unreachable happen when something like an X.25 virtual circuit limitation is reached?
-------
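As a small worked check of the value quoted above, the sketch below splits the 16-bit word 0x3341 into its two bytes. The helper name split_patch_word is made up for illustration, and putting the high byte first is an assumption; which byte lands at offset 8 in memory depends on the machine's byte order.

```c
#include <assert.h>
#include <stddef.h>

/* BSD errno values, as quoted from errno.h elsewhere in this thread. */
#define DEMO_ENETUNREACH  51  /* Network is unreachable */
#define DEMO_EHOSTUNREACH 65  /* No route to host */

/*
 * Hypothetical helper: split a 16-bit word into two bytes,
 * high byte first. This is only arithmetic on the quoted value;
 * it does not read or patch any kernel memory.
 */
static void split_patch_word(unsigned int word, unsigned char out[2])
{
    assert(out != NULL);
    out[0] = (unsigned char)((word >> 8) & 0xff);  /* 0x33 for 0x3341 */
    out[1] = (unsigned char)(word & 0xff);         /* 0x41 for 0x3341 */
}
```

Splitting 0x3341 this way gives 0x33 (51 decimal) and 0x41 (65 decimal), which match the two errno values above, so the word appears to hold the "net unreachable" and "host unreachable" table entries side by side.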
reschly@BRL.MIL ("Robert J. Reschly Jr.") (02/24/89)
Mark,

A couple of weeks ago BRL noted severe difficulties with connectivity. We were able to trace this to ICMP Network Unreachables (we're a BSD shop), which appeared to be the result of core route "flopping". At the end of this message, I'll tack on the message I sent to BBN on the subject. The raw data file mentioned in that message is still available if anyone is masochistic enough to want to look at it.

When we spoke to BBN the afternoon before sending that message, they told us they had identified a problem with routing, and a message received the next afternoon confirmed that a fix to the problem alluded to in the phone conversation would be fielded in the next few days. By mid-week the following week, connectivity did indeed appear to be better than previously. Since this last change, we still see more routing variability than we feel should be present, though it does look better than before the change.

One curious thing: has anyone else noticed the EGP peers bouncing in and out? We peer with BMILDCEC, BMILBBN, and BMILMTR in that order (though we only exchange updates with one at any given time), and we are continually having to re-acquire one or more of these beasties (as I write this, the gateway is trying to acquire BMILMTR). Have these gateways been bouncing up and down a lot? We have also started looking at the EGP information we are getting a little more closely, and have seen hopcounts as high as 62(!).

In the last few days, our PSN insufficient resource (type 4) messages are haunting us again. We had earlier reported these and BBN reconfigured our PSN with more space allocated to buffers to lessen the severity of that problem. I suppose we'll have to complain about this again.

Has anyone else noted any interesting behavior since the change?

Later,
Bob
--------
Phone: (301)278-6678 AV: 298-6678 FTS: 939-6678
Arpa: reschly@BRL.MIL (or BRL.ARPA) UUCP: ...!brl-smoke!reschly
Postal: Robert J. Reschly Jr. U.S.
Army Ballistic Research Laboratory
Systems Engineering and Concepts Analysis Division
Advanced Computer Systems Team
ATTN: SLCBR-SE (Reschly)
APG, MD 21005-5066 (Hey, *I* don't make 'em up!)
**** For a good time, call: (303) 499-7111. Seriously! ****

================

Date: Thu, 9 Feb 89 5:51:55 EST
From: "Robert J. Reschly Jr." <reschly@brl.mil>
To: meason@wash.bbn.com, amalis@bbn.com
cc: jcst@BRL.MIL
Subject: More Node 29 troubles.

Mike,

Here is a summary of our recent experience and a copy of Phil's message.

First, the incompletes are still with us, though they appear to be at the reduced level we noted after the PSN buffer configuration changes. The only note here is that these messages are still coming in at a much greater rate than before our switching to EGP peering with the Buttergates. We are currently seeing these 5 to 10 (on average) times an hour, rather than 5 to 10 times a day.

Second, as Phil notes in the enclosed message, we have been suffering from what looks like significant routing instability since switching to EGP peering with the Buttergates. The variability in numbers of reported routes was noted as soon as we switched, but we did not notice any actual reachability problems until a while later.

A typical sequence would be:

  Establish a connection (e.g. FTP, TELNET, rlogin); everything appears fine, connectivity is good and round trip times are reasonable.
  After a few minutes of operation, suddenly the connection freezes. The connection usually closes at this time.
  Attempt to restart the connection -- this usually fails.
  Wait a few minutes, then attempt to restart the connection. This usually succeeds as if there was never any problem.
  At this point the cycle repeats.

Running an experiment with ping shows that the loss of communication coincides with the receipt of ICMP Network Unreachable messages.

I ran a ping experiment against louie.udel.edu to see if I could duplicate and record the symptoms today.
I'll include a summary from the first part of that at the end of this message, and will put the raw data (roughly 1.3MB collected over 4 hours between 1800 EST and 2200 EST 8 Feb 1989) in the public FTP area of vgr.brl.mil. Note that since this is a script of a terminal session, there are a few control characters and escape sequences buried in this file.

We currently EGP peer with the buttergates at DCEC and CAMBRIDGE as our primary and fallback. I have also made some changes to the gateway software to extract a bit more information but have nothing to present at this time.

The raw data is the composite of a 15 second timestamp loop, the ping, and the gateway console all smashed together and intertwingled. The ping generates the "xx bytes" messages as well as the verbose dumps of most other ICMP messages. Much of the gateway console output is prepended by "<process_name>: ", though there are a few messages which are different (e.g. "ICMP redirect" and "UPTIME" messages). The gateway software is of local origin. If you have any questions about any of it, get in touch with us and we will clarify.

Finally, you will find a number of "milr: msg with link 27 from 4/48" followed by an equal number of "milr: pack len <value1>, format 15, illen <value2>" messages. The values range over a small set for each. We only started noticing these today, but had not been closely watching the gateway for the few days prior to today. The "link" parameter is the link type from the IMP leader -- we are 1822 connected.

I hope this stuff helps.

Later,
Bob
--------
Phone: (301)278-6678 AV: 298-6678 FTS: 939-6678
Arpa: reschly@BRL.MIL (or BRL.ARPA) UUCP: ...!brl-smoke!reschly
Postal: Robert J. Reschly Jr.
U.S. Army Ballistic Research Laboratory
Systems Engineering and Concepts Analysis Division
Advanced Computer Systems Team
ATTN: SLCBR-SE (Reschly)
APG, MD 21005-5066 (Hey, *I* don't make 'em up!)
**** For a good time, call: (303) 499-7111. Seriously!
****

----- Forwarded message # 1:

Received: from smoke.brl.mil by SEM.BRL.MIL id aa07207; 2 Feb 89 7:56 EST
Received: from SMOKE.BRL.MIL by SMOKE.BRL.MIL id aa12789; 2 Feb 89 7:52 EST
Received: from SRI-NIC.ARPA by SMOKE.BRL.MIL id aa12653; 2 Feb 89 7:45 EST
Received: from vgr.brl.mil by SRI-NIC.ARPA with TCP; Thu, 2 Feb 89 01:47:18 PST
Date: Thu, 2 Feb 89 4:41:04 EST
From: Phil Dykstra <phil@BRL.MIL>
To: tcp-ip@sri-nic.arpa
Subject: Instability in the Core
Message-ID: <8902020441.aa16937@VGR.BRL.MIL>

Tonight I was trying to talk to some machines on XEROX-NET (net 13), and once again was hit with oscillating Net-Up/Net-Unreachable. This has been happening to me for the past several days for net 13 as well as several other nets (FYI, I'm 26.2.0.29).

We have been getting EGP info from the RESTON-DCEC Butterfly (26.21.0.104). I started watching tonight to see why these routes kept appearing and disappearing and found major unrest in the routing information we were getting. Here are nine consecutive EGP routing updates (taken at three minute intervals). They span 0400 EST.

Int  Ext  Routes  (~A    B    C)
  5   95     479
  6   85     536
  5   95     401
  6   86     598    17  333  263
  6   84     507    15  266  241
  5   94     456     8  270  193
  6   91     599    16  335  263
  4   93     453     8  266  194
  6   87     580    17  321  257

The fields are number of internal and external EGP gateways, total number of routes, and the approximate number of class A, B, and C (approx because this includes a few of our fixed routes). I have complete EGP dumps for the last six updates if anyone wishes to study the changes.

It really bothers me that the number of class A networks could double/halve every three minutes! There is also a 10% to 50% change in the total number of routes every three minutes. One wouldn't expect the number of internal EGP gateways to change so fast either [though the LSI-11's used to flop like that too]. It is nearly impossible to get data through when the routes come and go this fast.
I realize that the Butterfly folks are probably working on this, but I wasn't sure everyone was aware how bad things are right now (I recall one other TCP-IP note about it). Is there anything we can do to help diagnose this? - Phil <phil@brl.mil> uunet!brl!phil ----- End of forwarded messages ================ Script started on Wed Feb 8 18:11:57 1989 PING louie.udel.edu (128.175.1.3): 56 data bytes 64 bytes from 128.175.1.3: icmp_seq=0 time=466 ms ... through ... 64 bytes from 128.175.1.3: icmp_seq=95 time=433 ms 64 bytes from 128.175.1.3: icmp_seq=96 time=981 ms Wed Feb 8 18:14:15 EST 1989 64 bytes from 128.175.1.3: icmp_seq=96 time=1948 ms <<<DUPLICATE! 64 bytes from 128.175.1.3: icmp_seq=97 time=1084 ms 64 bytes from 128.175.1.3: icmp_seq=98 time=451 ms 64 bytes from 128.175.1.3: icmp_seq=99 time=514 ms ... through ... 64 bytes from 128.175.1.3: icmp_seq=175 time=566 ms Wed Feb 8 18:15:36 EST 1989 64 bytes from 128.175.1.3: icmp_seq=176 time=414 ms 64 bytes from 128.175.1.3: icmp_seq=177 time=448 ms 64 bytes from 128.175.1.3: icmp_seq=178 time=414 ms 64 bytes from 128.175.1.3: icmp_seq=179 time=499 ms 64 bytes from 128.175.1.3: icmp_seq=180 time=481 ms 64 bytes from 128.175.1.3: icmp_seq=181 time=481 ms 64 bytes from 128.175.1.3: icmp_seq=182 time=499 ms 64 bytes from 128.175.1.3: icmp_seq=183 time=599 ms 36 bytes from MCLEAN-MB.DDN.MIL (26.20.0.17): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 a304 0 0000 fb 01 c3e4 c0051708 80af0103 36 bytes from MCLEAN-MB.DDN.MIL (26.20.0.17): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 a30f 0 0000 fb 01 c3d9 c0051708 80af0103 ... 21 more net unreachables deleted ... egp: default of 26.1.0.49 with 293 routes <<<GATEWAY EGP UPDATE egp: 87 gwys, 6 int, 81 ext (565 routes). ip: 587 routes, 15 A, 306 B, 259 C, 7 S, 0 O. 
36 bytes from MCLEAN-MB.DDN.MIL (26.20.0.17): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 a3a4 0 0000 fb 01 c344 c0051708 80af0103 Wed Feb 8 18:16:09 EST 1989 36 bytes from MCLEAN-MB.DDN.MIL (26.20.0.17): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 a3aa 0 0000 fb 01 c33e c0051708 80af0103 ... 45 more net unreachables deleted ... milr: msg with link 27 from 4/48 <<<FUNNY AFWL MESSAGES milr: pack len 2352, format 15, illen 28681 <<<"4/48" IS PORT/NODE 36 bytes from MCLEAN-MB.DDN.MIL (26.20.0.17): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 a4c1 0 0000 fb 01 c227 c0051708 80af0103 36 bytes from MCLEAN-MB.DDN.MIL (26.20.0.17): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 a4c6 0 0000 fb 01 c222 c0051708 80af0103 Wed Feb 8 18:16:58 EST 1989 ... 42 more net unreachables deleted ... 36 bytes from MCLEAN-MB.DDN.MIL (26.20.0.17): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 a5f3 0 0000 fb 01 c0f5 c0051708 80af0103 64 bytes from 128.175.1.3: icmp_seq=300 time=633 ms <<< ONLY DROPPED 3PKTS 64 bytes from 128.175.1.3: icmp_seq=301 time=733 ms Wed Feb 8 18:17:47 EST 1989 64 bytes from 128.175.1.3: icmp_seq=302 time=666 ms 64 bytes from 128.175.1.3: icmp_seq=303 time=881 ms 92 bytes from BRL.ARPA (26.2.0.29): Source Quench Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 a605 0 0000 fc 01 bfe3 c0051708 80af0103 64 bytes from 128.175.1.3: icmp_seq=304 time=1248 ms 64 bytes from 128.175.1.3: icmp_seq=306 time=633 ms 64 bytes from 128.175.1.3: icmp_seq=307 time=748 ms ... through ... 64 bytes from 128.175.1.3: icmp_seq=331 time=1381 ms Wed Feb 8 18:18:19 EST 1989 64 bytes from 128.175.1.3: icmp_seq=333 time=933 ms milr: msg with link 27 from 4/48 milr: pack len 2352, format 15, illen 28681 64 bytes from 128.175.1.3: icmp_seq=334 time=1281 ms ... 
through ... 64 bytes from 128.175.1.3: icmp_seq=384 time=784 ms 64 bytes from 128.175.1.3: icmp_seq=386 time=566 ms 64 bytes from 128.175.1.3: icmp_seq=387 time=651 ms ... through ... 64 bytes from 128.175.1.3: icmp_seq=410 time=766 ms Wed Feb 8 18:19:40 EST 1989 64 bytes from 128.175.1.3: icmp_seq=411 time=833 ms 64 bytes from 128.175.1.3: icmp_seq=412 time=633 ms 64 bytes from 128.175.1.3: icmp_seq=413 time=548 ms 64 bytes from 128.175.1.3: icmp_seq=414 time=766 ms 64 bytes from 128.175.1.3: icmp_seq=415 time=848 ms 64 bytes from 128.175.1.3: icmp_seq=416 time=633 ms egp: default of 26.1.0.49 with 232 routes <<< GATEWAY EGP UPDATE egp: 86 gwys, 6 int, 80 ext (434 routes). ip: 456 routes, 14 A, 243 B, 192 C, 7 S, 0 O. 64 bytes from 128.175.1.3: icmp_seq=417 time=848 ms <<< 1 MORE PACKET THEN Wed Feb 8 18:19:56 EST 1989 <<< NOTHING UNTIL Wed Feb 8 18:20:12 EST 1989 Wed Feb 8 18:20:28 EST 1989 milr: msg with link 27 from 4/48 milr: pack len 2352, format 15, illen 28681 Wed Feb 8 18:20:44 EST 1989 Wed Feb 8 18:21:00 EST 1989 milr: incomplete 15/115 3 Wed Feb 8 18:21:16 EST 1989 64 bytes from 128.175.1.3: icmp_seq=514 time=418 ms <<< HERE ... through ... 64 bytes from 128.175.1.3: icmp_seq=534 time=499 ms 92 bytes from BRL.ARPA (26.2.0.29): Source Quench Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 a836 0 0000 fc 01 bdb2 c0051708 80af0103 64 bytes from 128.175.1.3: icmp_seq=536 time=514 ms Wed Feb 8 18:21:49 EST 1989 64 bytes from 128.175.1.3: icmp_seq=537 time=533 ms 36 bytes from localhost (127.0.0.1): Destination Port Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 003d a842 0 0000 1e 11 0000 7f000001 7f000001 UDP: from port 53, to port 3500 (decimal) 64 bytes from 128.175.1.3: icmp_seq=538 time=433 ms 64 bytes from 128.175.1.3: icmp_seq=539 time=448 ms ... through ... 
64 bytes from 128.175.1.3: icmp_seq=546 time=448 ms 92 bytes from BRL.ARPA (26.2.0.29): Source Quench Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 a882 0 0000 fc 01 bd66 c0051708 80af0103 64 bytes from 128.175.1.3: icmp_seq=547 time=5048 ms <<< UNUSUAL DELAY 64 bytes from 128.175.1.3: icmp_seq=548 time=4181 ms Wed Feb 8 18:22:05 EST 1989 64 bytes from 128.175.1.3: icmp_seq=552 time=2381 ms 64 bytes from 128.175.1.3: icmp_seq=553 time=2266 ms 64 bytes from 128.175.1.3: icmp_seq=554 time=2099 ms 64 bytes from 128.175.1.3: icmp_seq=555 time=1866 ms 64 bytes from 128.175.1.3: icmp_seq=556 time=1448 ms 64 bytes from 128.175.1.3: icmp_seq=557 time=999 ms 64 bytes from 128.175.1.3: icmp_seq=558 time=433 ms 64 bytes from 128.175.1.3: icmp_seq=559 time=533 ms ... through ... Wed Feb 8 18:23:58 EST 1989 64 bytes from 128.175.1.3: icmp_seq=662 time=518 ms 64 bytes from 128.175.1.3: icmp_seq=663 time=518 ms 64 bytes from 128.175.1.3: icmp_seq=664 time=499 ms 64 bytes from 128.175.1.3: icmp_seq=665 time=551 ms 64 bytes from 128.175.1.3: icmp_seq=666 time=451 ms 64 bytes from 128.175.1.3: icmp_seq=667 time=418 ms 64 bytes from 128.175.1.3: icmp_seq=668 time=599 ms 64 bytes from 128.175.1.3: icmp_seq=669 time=466 ms 64 bytes from 128.175.1.3: icmp_seq=670 time=566 ms 64 bytes from 128.175.1.3: icmp_seq=671 time=633 ms 64 bytes from 128.175.1.3: icmp_seq=672 time=433 ms 36 bytes from MCLEAN-MB.DDN.MIL (26.20.0.17): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 aa58 0 0000 fb 01 bc90 c0051708 80af0103 36 bytes from MCLEAN-MB.DDN.MIL (26.20.0.17): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 aa64 0 0000 fb 01 bc84 c0051708 80af0103 ... 174 more net unreachables deleted ... 64 bytes from 128.175.1.3: icmp_seq=848 time=666 ms 64 bytes from 128.175.1.3: icmp_seq=849 time=833 ms Wed Feb 8 18:27:14 EST 1989 64 bytes from 128.175.1.3: icmp_seq=850 time=751 ms ... through ... 
64 bytes from 128.175.1.3: icmp_seq=878 time=833 ms egp: default of 26.1.0.49 with 318 routes egp: 83 gwys, 6 int, 77 ext (569 routes). ip: 591 routes, 15 A, 311 B, 258 C, 7 S, 0 O. 64 bytes from 128.175.1.3: icmp_seq=879 time=918 ms 64 bytes from 128.175.1.3: icmp_seq=880 time=1051 ms Wed Feb 8 18:27:46 EST 1989 64 bytes from 128.175.1.3: icmp_seq=881 time=818 ms ... through ... 64 bytes from 128.175.1.3: icmp_seq=895 time=551 ms 64 bytes from 128.175.1.3: icmp_seq=896 time=851 ms Wed Feb 8 18:28:02 EST 1989 64 bytes from 128.175.1.3: icmp_seq=897 time=1151 ms 64 bytes from 128.175.1.3: icmp_seq=895 time=3833 ms <<< DUPLICATE 64 bytes from 128.175.1.3: icmp_seq=898 time=799 ms 64 bytes from 128.175.1.3: icmp_seq=899 time=933 ms 64 bytes from 128.175.1.3: icmp_seq=900 time=584 ms ... through ... 64 bytes from 128.175.1.3: icmp_seq=1039 time=566 ms 64 bytes from 128.175.1.3: icmp_seq=1039 time=766 ms <<< DUPLICATE 64 bytes from 128.175.1.3: icmp_seq=1040 time=748 ms ... though ... 64 bytes from 128.175.1.3: icmp_seq=1115 time=748 ms 64 bytes from 128.175.1.3: icmp_seq=1116 time=499 ms Wed Feb 8 18:31:49 EST 1989 64 bytes from 128.175.1.3: icmp_seq=1117 time=599 ms 64 bytes from 128.175.1.3: icmp_seq=1118 time=799 ms 64 bytes from 128.175.1.3: icmp_seq=1119 time=533 ms 64 bytes from 128.175.1.3: icmp_seq=1120 time=699 ms 64 bytes from 128.175.1.3: icmp_seq=1121 time=533 ms 64 bytes from 128.175.1.3: icmp_seq=1121 time=781 ms <<< DUPLICATE 64 bytes from 128.175.1.3: icmp_seq=1122 time=648 ms 64 bytes from 128.175.1.3: icmp_seq=1124 time=981 ms <<< MISSING 1123 64 bytes from 128.175.1.3: icmp_seq=1125 time=799 ms 64 bytes from 128.175.1.3: icmp_seq=1126 time=548 ms 64 bytes from 128.175.1.3: icmp_seq=1127 time=799 ms 64 bytes from 128.175.1.3: icmp_seq=1128 time=614 ms 64 bytes from 128.175.1.3: icmp_seq=1129 time=448 ms 64 bytes from 128.175.1.3: icmp_seq=1130 time=666 ms 64 bytes from 128.175.1.3: icmp_seq=1131 time=681 ms 64 bytes from 128.175.1.3: icmp_seq=1132 
time=681 ms Wed Feb 8 18:32:06 EST 1989 64 bytes from 128.175.1.3: icmp_seq=1133 time=614 ms 64 bytes from 128.175.1.3: icmp_seq=1134 time=514 ms egp: default of 26.1.0.49 with 276 routes <<< GATEWAY EGP UPDATE egp: 88 gwys, 6 int, 82 ext (492 routes). ip: 514 routes, 16 A, 267 B, 224 C, 7 S, 0 O. 64 bytes from 128.175.1.3: icmp_seq=1135 time=814 ms 36 bytes from RESTON-DCEC-MB.DDN.MIL (26.21.0.104): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 b1ae 0 0000 fb 01 b53a c0051708 80af0103 36 bytes from RESTON-DCEC-MB.DDN.MIL (26.21.0.104): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 b1b3 0 0000 fb 01 b535 c0051708 80af0103 ... 73 more net unreachables deleted ... milr: msg with link 27 from 4/48 <<< ANOTHER LINK MESSAGE milr: pack len 2370, format 15, illen 10 Wed Feb 8 18:33:29 EST 1989 milr: incomplete 3/13 4 <<< THESE SEEM MORE COMMON milr: incomplete 3/13 4 <<< WHEN RESTON IS COMPLAINING milr: incomplete 3/13 3 milr: msg with link 27 from 4/48 milr: pack len 2378, format 15, illen 16394 Wed Feb 8 18:33:45 EST 1989 <<< A BUNCH MORE MISSING Wed Feb 8 18:34:01 EST 1989 milr: msg with link 27 from 4/48 milr: pack len 2352, format 15, illen 28681 milr: msg with link 27 from 4/48 milr: pack len 2378, format 15, illen 16394 milr: msg with link 27 from 4/48 milr: pack len 2378, format 15, illen 16394 milr: msg with link 27 from 4/48 milr: pack len 2370, format 15, illen 10 milr: msg with link 27 from 4/48 milr: pack len 2378, format 15, illen 16394 milr: msg with link 27 from 4/48 milr: pack len 2370, format 15, illen 10 milr: msg with link 27 from 4/48 milr: pack len 2378, format 15, illen 16394 milr: msg with link 27 from 4/48 milr: pack len 2370, format 15, illen 10 Wed Feb 8 18:34:17 EST 1989 milr: msg with link 27 from 4/48 milr: pack len 2370, format 15, illen 10 milr: msg with link 27 from 4/48 milr: pack len 2378, format 15, illen 16394 Wed Feb 8 18:34:34 EST 1989 Wed Feb 8 
18:34:50 EST 1989 Wed Feb 8 18:35:06 EST 1989 Wed Feb 8 18:35:22 EST 1989 64 bytes from 128.175.1.3: icmp_seq=1330 time=581 ms 64 bytes from 128.175.1.3: icmp_seq=1331 time=699 ms 64 bytes from 128.175.1.3: icmp_seq=1332 time=614 ms 64 bytes from 128.175.1.3: icmp_seq=1333 time=481 ms 64 bytes from 128.175.1.3: icmp_seq=1334 time=748 ms 64 bytes from 128.175.1.3: icmp_seq=1335 time=448 ms 64 bytes from 128.175.1.3: icmp_seq=1336 time=599 ms 64 bytes from 128.175.1.3: icmp_seq=1337 time=448 ms 64 bytes from 128.175.1.3: icmp_seq=1338 time=499 ms Wed Feb 8 18:35:38 EST 1989 64 bytes from 128.175.1.3: icmp_seq=1339 time=566 ms ... through ... 64 bytes from 128.175.1.3: icmp_seq=1401 time=818 ms Wed Feb 8 18:36:44 EST 1989 egp: default of 26.1.0.49 with 228 routes egp: 88 gwys, 6 int, 82 ext (529 routes). ip: 551 routes, 14 A, 299 B, 231 C, 7 S, 0 O. 64 bytes from 128.175.1.3: icmp_seq=1402 time=848 ms 64 bytes from 128.175.1.3: icmp_seq=1403 time=1199 ms 64 bytes from 128.175.1.3: icmp_seq=1404 time=681 ms 64 bytes from 128.175.1.3: icmp_seq=1404 time=699 ms <<< DUPLICATE 64 bytes from 128.175.1.3: icmp_seq=1404 time=1166 ms <<< DUPLICATE 64 bytes from 128.175.1.3: icmp_seq=1405 time=514 ms 64 bytes from 128.175.1.3: icmp_seq=1406 time=699 ms 64 bytes from 128.175.1.3: icmp_seq=1407 time=766 ms 64 bytes from 128.175.1.3: icmp_seq=1408 time=881 ms 64 bytes from 128.175.1.3: icmp_seq=1409 time=648 ms 64 bytes from 128.175.1.3: icmp_seq=1410 time=648 ms 64 bytes from 128.175.1.3: icmp_seq=1410 time=714 ms <<< DUPLICATE 64 bytes from 128.175.1.3: icmp_seq=1411 time=648 ms 64 bytes from 128.175.1.3: icmp_seq=1412 time=1248 ms 64 bytes from 128.175.1.3: icmp_seq=1413 time=1148 ms 64 bytes from 128.175.1.3: icmp_seq=1414 time=814 ms 64 bytes from 128.175.1.3: icmp_seq=1415 time=933 ms ... through ... 
64 bytes from 128.175.1.3: icmp_seq=1493 time=884 ms 64 bytes from 128.175.1.3: icmp_seq=1494 time=866 ms 64 bytes from 128.175.1.3: icmp_seq=1494 time=1833 ms <<< DUPLICATE 64 bytes from 128.175.1.3: icmp_seq=1495 time=1051 ms 64 bytes from 128.175.1.3: icmp_seq=1496 time=799 ms 64 bytes from 128.175.1.3: icmp_seq=1497 time=866 ms Wed Feb 8 18:38:23 EST 1989 64 bytes from 128.175.1.3: icmp_seq=1498 time=618 ms ... through ... 64 bytes from 128.175.1.3: icmp_seq=1749 time=718 ms Wed Feb 8 18:42:44 EST 1989 64 bytes from 128.175.1.3: icmp_seq=1750 time=818 ms 64 bytes from 128.175.1.3: icmp_seq=1751 time=1133 ms 64 bytes from 128.175.1.3: icmp_seq=1752 time=818 ms 64 bytes from 128.175.1.3: icmp_seq=1753 time=1318 ms 36 bytes from MCLEAN-MB.DDN.MIL (26.20.0.17): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 bba1 0 0000 fb 01 ab47 c0051708 80af0103 36 bytes from MCLEAN-MB.DDN.MIL (26.20.0.17): Destination Net Unreachable Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data 4 5 00 0054 bba6 0 0000 fb 01 ab42 c0051708 80af0103 ...and the data goes on for roughly three more hours ....
jas@proteon.com (John A. Shriver) (02/25/89)
Well Mark, that indeed sounds like a great way to improve the antisocial behaviour of 4.[23]bsd in the face of ICMP host and net unreachables. I looked at the source (ip_input.c, tcp_subr.c, protosw.h), and it certainly seems that it will have the desired result.

The u_char array inetctlerrmap (in ip_input.c) maps the generic error types from protosw.h to error numbers from errno.h. Offsets 8 and 9 are (protosw.h):

    #define PRC_UNREACH_NET  8  /* no route to network */
    #define PRC_UNREACH_HOST 9  /* no route to host */

which are mapped respectively to (errno.h):

    #define ENETUNREACH  51  /* Network is unreachable */
    #define EHOSTUNREACH 65  /* No route to host */

Entries in that array which are 0 do not cause user errors. Patching 8 & 9 to zero should do this. Here's an example. Be careful to use lower case `w', which writes a 16-bit word and so covers both one-byte entries.

    # adb -w /vmunix /dev/kmem
    inetctlerrmap+8?w0      patches disk
    inetctlerrmap+8/w0      patches memory

I have not tried this, no guarantees. It looks like this works in 4.3bsd, Ultrix 2.2, and SunOS 3.5.

It still is not the optimal solution, which would be to pass a warning to the user layer, so they could decide what to do. I suspect that is why there is a /* XXX */ at the end of tcp_ctlinput() in tcp_subr.c. (/* XXX */ is Berkeley shorthand for "kludge, should be fixed".) There are no comments on the entire subroutine.

Any 4.3bsd gurus out there like to verify this?
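The table lookup described above can be sketched in user space. This is only an illustration under stated assumptions: the table size DEMO_MAP_SIZE, the helper names, and the all-zero contents of the other entries are invented here, and only offsets 8 and 9 and their errno values come from the post. It does not reproduce the kernel's real table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PRC_UNREACH_NET   8   /* no route to network (protosw.h, as quoted) */
#define PRC_UNREACH_HOST  9   /* no route to host (protosw.h, as quoted) */
#define DEMO_ENETUNREACH  51  /* errno.h value, as quoted */
#define DEMO_EHOSTUNREACH 65  /* errno.h value, as quoted */
#define DEMO_MAP_SIZE     16  /* hypothetical size, for illustration only */

/* Fill a demo table: everything zero except the two unreachable entries. */
static void init_demo_map(unsigned char map[DEMO_MAP_SIZE])
{
    memset(map, 0, DEMO_MAP_SIZE);
    map[PRC_UNREACH_NET]  = DEMO_ENETUNREACH;
    map[PRC_UNREACH_HOST] = DEMO_EHOSTUNREACH;
}

/* Look up the error for a control command; 0 means "report nothing". */
static int map_ctl_error(const unsigned char *map, size_t n, int cmd)
{
    assert(map != NULL);
    if (cmd < 0 || (size_t)cmd >= n)
        return 0;
    return map[cmd];
}

/* The effect of the suggested patch: zero entries 8 and 9. */
static void zero_unreachables(unsigned char *map)
{
    assert(map != NULL);
    map[PRC_UNREACH_NET]  = 0;
    map[PRC_UNREACH_HOST] = 0;
}
```

Before zero_unreachables() the two lookups return 51 and 65; afterwards both return 0, which is the "do not cause user errors" case. Note that the zeroing applies to every caller of the table, not only to established TCP connections.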
edb@fai.UUCP (Edward Bunch) (02/25/89)
In article <8902221936.AA10826@ncs.dnd.ca> netcoor@NCS.DND.CA (DRENET) writes:
>We have users on network 128.43 who have reported trouble retrieving
>files from several hosts in the Internet.
> < 425 Can't build data connection: Connection timed out
>Can anyone explain this? I would sure like to understand what is going on
>to create this situation, then I can try to do something about it.
>
>Bob Bradford netcoor@ncs.dnd.ca

I saw this same problem here on our WAN. The problem was this: we were trying to talk to an Ethernet interface that was on the far side of the machine.

This is a little difficult to explain. Picture a machine with two interfaces that we wish to contact to ftp something off. When we start the FTP we specify a host address of the interface on the far side. That is, packets must pass through the first interface and then through loop-back before arriving at ftpd. When FTP tries to build the data connection the reverse way, it fails, i.e. Loop-Back --> Other Interface --> Me. I suppose FTPD wasn't smart enough to avoid the loop-back network on the return trip.

Solution: Use the interface address on the near side.

--------------------------------------------------------------------------------
Edward A. Bunch UUCP: {uunet,amdahl,sun}!fai!edb
Fujitsu America, Inc. DOMAIN: edb@fai.com
Computer Support and Administration.
--------------------------------------------------------------------------------
narten@PURDUE.EDU (Thomas Narten) (02/26/89)
[ Stuff about using adb to zero out errors in inetctlerrmap ]

The suggested fix causes 4.3 to ignore ICMP unreachable errors in all cases, something that one probably does not want to do. For instance, I *much* prefer to have telnet attempts abort quickly with a "network unreachable" than with a "connection timed out" some 60 seconds later. On the other hand, once a connection has been established, I'd prefer stray ICMP errors not break a connection.

Moreover, nuking ICMP unreachable errors weakens utilities like ping that understand such messages. One of ping's useful features is the printing of ICMP errors it receives.

The following patch (perhaps not pretty, but precise) treats ICMP unreachable errors as before, except that they won't break established connections.

Thomas

*** /tmp/,RCSt1025442 Sat Feb 25 13:46:58 1989
--- /tmp/,RCSt2025442 Sat Feb 25 13:46:59 1989
***************
*** 258,264 ****
  tcp_notify(inp)
      register struct inpcb *inp;
  {
!
      wakeup((caddr_t) &inp->inp_socket->so_timeo);
      sorwakeup(inp->inp_socket);
      sowwakeup(inp->inp_socket);
--- 258,271 ----
  tcp_notify(inp)
      register struct inpcb *inp;
  {
!     if (inp->inp_socket->so_state != SS_ISCONNECTING) {
!         register int error = inp->inp_socket->so_error;
!         if ((error == EHOSTUNREACH) || (error == ENETUNREACH)
!             || (error == EHOSTDOWN)) {
!             inp->inp_socket->so_error = 0; /* clear error */
!             return;
!         }
!     }
      wakeup((caddr_t) &inp->inp_socket->so_timeo);
      sorwakeup(inp->inp_socket);
      sowwakeup(inp->inp_socket);
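To make the decision in the patch easier to follow, here is a user-space sketch of the same test. Everything prefixed demo_ or DEMO_ is a stand-in invented for illustration (the struct, the flag values, and the return convention); it mirrors the posted logic, including the != comparison, and is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in constants; the numeric values are assumptions for this demo. */
#define DEMO_SS_ISCONNECTED  0x002
#define DEMO_SS_ISCONNECTING 0x004
#define DEMO_ENETUNREACH     51
#define DEMO_EHOSTDOWN       64
#define DEMO_EHOSTUNREACH    65

/* Minimal stand-in for the two socket fields the patch looks at. */
struct demo_socket {
    int so_state;
    int so_error;
};

/*
 * Mirrors the patched tcp_notify() decision.
 * Returns 0 when the error is cleared and the notification is dropped,
 * 1 when the waiting process would be woken with the error still set.
 */
static int demo_tcp_notify(struct demo_socket *so)
{
    assert(so != NULL);
    if (so->so_state != DEMO_SS_ISCONNECTING) {
        int error = so->so_error;
        if (error == DEMO_EHOSTUNREACH || error == DEMO_ENETUNREACH ||
            error == DEMO_EHOSTDOWN) {
            so->so_error = 0;   /* clear error, keep the connection */
            return 0;
        }
    }
    return 1;                   /* fall through to the wakeup calls */
}
```

With so_state equal to the connecting value, an unreachable error still gets through, so a connection attempt can fail fast; in any other state the three unreachable-type errors are cleared. One thing worth checking before relying on the patch: so_state in BSD holds SS_* flag bits, so a connecting socket that also has another flag set would not compare equal, and a bitwise test may be closer to the intent. That is a reading of the posted patch, not a verified change.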
Mills@UDEL.EDU (02/27/89)
Robert, The Fuzzball logs on net 128.4 and various other places near the NSFNET Bluebone are also showing unstable routes, sometimes intermittent ICMP unretchables and other times ICMP time exceededs. The problem has been growing slowly worse over the last few weeks. Seen from here one or more of the Fuzzball time servers drops off the Earth only to replanet a few minutes or hours later. Also, there is a rising tide of ICMP unmentionables coming from distant gateways, rather than nearby EGP gateways on ARPANET, so things are certainly unstable somewhere in space. Finally, the rate of ARPANET error messages, especially to a few gateways, is growing steadily worse. I conclude PSNs for those gateways may be sinking slowly in the muck. This report is certainly much less specific than yours; however, you may find whatever corroboration useful. Dave
Hampton@DOCKMASTER.ARPA ("David R. Hampton") (02/27/89)
Are the hosts that you are having problems with Berkeley 4.2 systems? There is a problem with the Berkeley 4.2 FTP server code. When the 4.2 server opens a data connection, it should perform several internal steps to get the data connection set up, and then announce the port number to the client. In the distributed kernel, it actually announces the data port before it performs the final step of setup. If this final step fails, a 425 error is returned.

David Hampton
Hampton @ Dockmaster.ARPA
abe@mace.cc.purdue.edu (Vic Abell) (02/27/89)
There is another BSD ftpd problem in the very latest post-worm release that can cause data connection failure. The connection failure occurs when there are two incoming ftpd calls from the same remote peer. If the two receiving ftpd processes both try to open a data connection at the same time, one can fail with an EADDRINUSE error.

We have fixed this problem locally by adding a retry loop in ftpd.c's getdatasock() function. The released ftpd lacks this loop. It also may be reporting the cause of data connection failures improperly when they result from an error in the bind() call within getdatasock(). There are several function calls between the bind() and the reply() call that can change the value of errno.
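A retry loop of the kind described might look like the sketch below. This is not the Purdue fix and not ftpd source; the function name bind_with_retry, its parameters, and the retry policy are assumptions made for illustration. It also shows one way to keep the bind() error from being overwritten by later calls: copy errno immediately.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: try bind() up to max_tries times, retrying only
 * when the failure is EADDRINUSE. The errno from the last bind() is
 * saved right away so later calls cannot change what gets reported.
 * Returns 0 on success, -1 on failure.
 */
static int bind_with_retry(int fd, const struct sockaddr *addr, socklen_t len,
                           int max_tries, unsigned int delay_sec,
                           int *tries_out, int *errno_out)
{
    int tries = 0;
    int saved = 0;
    int rc = -1;

    assert(max_tries > 0);
    while (tries < max_tries) {
        tries++;
        rc = bind(fd, addr, len);
        if (rc == 0) {
            saved = 0;
            break;
        }
        saved = errno;              /* capture before anything else runs */
        if (saved != EADDRINUSE)
            break;                  /* other errors are not worth retrying */
        if (tries < max_tries && delay_sec > 0)
            sleep(delay_sec);       /* give the other process time to finish */
    }
    if (tries_out)
        *tries_out = tries;
    if (errno_out)
        *errno_out = saved;
    errno = saved;                  /* restore for callers that read errno */
    return rc;
}
```

A caller would report the value returned through errno_out rather than reading errno after intervening calls. The number of tries and the delay are tuning choices that the post does not specify.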