dorl@vms.macc.wisc.edu (Michael Dorl - MACC) (10/19/88)
We have been having a rash of recent problems that would best be explained by the fact that DEC LAN-100 Bridges stop forwarding all 1-s broadcast traffic. Problems include tcp/ip sites that do not see routed routing traffic and tcp/ip sites that can not initiate a session to a machine on the other side of a bridge.

I'd be interested in hearing from anyone who has experienced similar problems or who has any insight into the problem.

Michael Dorl (608) 262-0466
dorl@vms.macc.wisc.edu
dorl@wiscmacc.bitnet
chris@wucfua.wustl.edu (Chris Myers) (10/19/88)
In article <773@dogie.edu> dorl@vms.macc.wisc.edu (Michael Dorl - MACC) writes:
>We have been having a rash of recent problems that would best
>be explained by the fact that DEC LAN-100 Bridges stop forwarding
>all 1-s broadcast traffic. Problems include tcp/ip sites that
>do not see routed routing traffic and tcp/ip sites that can not
>initiate a session to a machine on the other side of a bridge.
>
>I'd be interested in hearing from anyone who has experienced
>similar problems or who has any insight into the problem.
>
>Michael Dorl (608) 262-0466

Washington University has experienced this problem in its network several times in the past few months. It seems largely attributable to one of two causes:

1) A LAN Bridge 100 with version 1 ROMs. There is a bug that causes a PERMANENT MANAGEMENT entry to be placed in the forwarding database that blocks the FF-FF-FF-FF-FF-FF address whenever the bridge is initialized. Get the version 2.0 ROM field upgrade kits to fix the problem.

2) There are (apparently) some confused nodes on the network that will send a packet with a source address of FF-FF-FF-FF-FF-FF when they boot. The LAN Bridges learn this source address, of course, and proceed to block most of the broadcast traffic on the network. For some unknown reason DEC decided that the all FF's address should not be expired from the bridging database, unlike all other entries. The only way to fix this is to use the Remote Bridge Management Software (RBMS) from DEC and remove the forwarding entry for FF-FF-FF-FF-FF-FF.

We have decided that the best way to deal with the problem is to run a batch job every 15 minutes or so that checks a bridge to see whether it has learned the all FF's address and, if so, removes the entry on all working bridges. There are better solutions (routers, smart bridges) that we are looking at seriously.

Chris Myers
Software Engineer
Washington University
Office of the Network Coordinator
davew@gvgpsa.GVG.TEK.COM (David C. White) (10/20/88)
In article <773@dogie.edu> dorl@vms.macc.wisc.edu (Michael Dorl - MACC) writes:
>We have been having a rash of recent problems that would best
>be explained by the fact that DEC LAN-100 Bridges stop forwarding
>all 1-s broadcast traffic. Problems include tcp/ip sites that
>do not see routed routing traffic and tcp/ip sites that can not
>initiate a session to a machine on the other side of a bridge.

This is caused by the bridge seeing a src address of ff-ff-ff-ff-ff-ff on one side or the other of the bridge. Once it sees this src address, from then on it will not forward broadcast traffic, depending on which side of the bridge you see it on. One very easy way to lock up the DEC bridge is to ping the broadcast address for the network.

There is a fix on the way from DEC to get around this problem. Pester your service engineer to get the updated firmware for the bridge. It isn't released yet, but he may be able to get it. In the meantime you will be stuck reinitializing the bridge(s) whenever this happens.

I would also question what the real cause of the problem is; in other words, what device on your network is sending out messages with ff-ff-ff-ff-ff-ff as the source address? Find the culprit(s) and fix them also.

--
Dave White
Grass Valley Group, Inc.
P.O. Box 1114
Grass Valley, CA 95945
PHONE: +1 916.478.3052
davew@gvgpsa.gvg.tek.com
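One way to hunt for the culprit is to scan captured frames for source addresses that can never be legal. The following is a minimal Python sketch, assuming the frames are already available as raw bytes starting at the destination address; the function names are invented for illustration and are not part of any capture tool. It relies on the Ethernet rule that a source address must be unicast, i.e. the low-order bit of its first octet must be 0, which the all FF's address violates.

```python
BROADCAST = bytes([0xFF] * 6)


def parse_mac(text):
    """Convert 'ff-ff-ff-ff-ff-ff' (or colon-separated) text into 6 raw bytes."""
    parts = text.replace(":", "-").split("-")
    if len(parts) != 6:
        raise ValueError(f"expected 6 octets, got {len(parts)}")
    return bytes(int(p, 16) for p in parts)


def source_is_legal(frame):
    """Return True if the frame's source address is a unicast address.

    Header layout: bytes 0-5 destination, bytes 6-11 source, bytes 12-13 type.
    The low-order bit of the first source octet is the group (multicast) bit
    and must be 0 in any source address.
    """
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    src = frame[6:12]
    return (src[0] & 0x01) == 0


def suspicious_frames(frames):
    """Return the indexes of frames whose source address is not unicast."""
    return [i for i, f in enumerate(frames) if not source_is_legal(f)]
```

As the rest of the thread notes, a frame that is all 1's carries no usable clue about its sender, so a check like this only tells you when a bogus frame appeared, which you can then correlate with reboots or reinstalls.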
alan@cunixc.columbia.edu (Alan Crosswell) (10/20/88)
This sounds like a similar problem that we saw several months ago; I posted about it to this group and received several replies.

The problem appears to be that a sick host sends a packet whose SOURCE address is the ethernet broadcast address. This of course violates the ethernet spec. The lanbridge doesn't bother checking for this case and merrily enters this source address into its forwarding database. So now what you have is a forwarding entry that says the broadcast address is on one side of the bridge. The result is that broadcasts go thru in one direction but not the other, so ARP, etc. work in one direction but not the other.

If you have RBMS, you can confirm this by issuing a command to the bridge along the lines of:

    SHOW FORW PHYS ADDR FF-FF-FF-FF-FF-FF

To clear the forwarding entry, issue:

    REMOVE FORW PHYS ADDR FF-FF-FF-FF-FF-FF

If you don't have RBMS, simply power-cycle the bridge until the next time you get hosed.

Here's a VMS batch job that I run hourly to keep our bridges squeaky clean:

    $ SET VERIFY
    $ PURGE/KEEP=10 RBMS_CLEANUP.LOG
    $ RBMS
    USE KNOWN BR
    SHOW FORW PHYS ADDR FF-FF-FF-FF-FF-FF
    REMOVE FORW PHYS ADDR FF-FF-FF-FF-FF-FF
    $ SUBMIT/AFTER="+1:00:00"/RESTART RBMS_CLEANUP
    $ EXIT

Pretty gross, huh?

/a
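The failure mode described above can be reproduced with a toy model. This is a minimal Python sketch of a two-port learning bridge, not DEC's firmware: the class, its methods, and the port names "A" and "B" are all invented for illustration, and the model assumes (as the posts in this thread suggest) that the bridge looks up the broadcast address in its forwarding table like any other destination.

```python
BROADCAST = "FF-FF-FF-FF-FF-FF"


class LearningBridge:
    """Toy two-port bridge that learns source addresses without validating them."""

    def __init__(self):
        # address -> port on which it was last seen as a source
        self.table = {}

    def receive(self, port, src, dst):
        """Learn src, then return True if the frame is forwarded to the other port."""
        self.table[src] = port            # the flaw: no check that src is unicast
        known_port = self.table.get(dst)
        if known_port is None:
            return True                   # unknown destination: flood
        return known_port != port         # filter if dst is believed to be local

    def remove(self, address):
        """Delete a forwarding entry, in the spirit of the REMOVE command above."""
        self.table.pop(address, None)


bridge = LearningBridge()
before = bridge.receive("A", "08-00-2B-00-00-01", BROADCAST)      # broadcast floods
bridge.receive("A", BROADCAST, BROADCAST)                         # sick host on side A
blocked = bridge.receive("A", "08-00-2B-00-00-01", BROADCAST)     # A -> B now filtered
other_way = bridge.receive("B", "08-00-2B-00-00-02", BROADCAST)   # B -> A still passes
bridge.remove(BROADCAST)
after = bridge.receive("A", "08-00-2B-00-00-01", BROADCAST)       # restored
```

In this model the poisoned entry makes the bridge treat broadcasts from side A as local traffic, which matches the one-directional ARP failure; removing the entry restores flooding until the next bogus frame arrives.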
ron@ron.rutgers.edu (Ron Natalie) (10/20/88)
Your lan bridges are too new, but not new enough. Some bogus host sourced a packet from the broadcast address. The LANBRIDGE now thinks that the all ones address is local to a segment and doesn't forward it everywhere. Call DEC. The major problem is that ARP stops working.

-ROn
cyrus@pprg.unm.edu (Tait Cyrus) (10/20/88)
Dave White asks:
>I would also question what the real cause of the problem is, in other
>words, what device on your network is sending out messages with
>ff-ff-ff-ff-ff-ff as the source address? Find the culprit(s) and
>fix them also.

We, like everyone else, are being bitten by this DEC LANBridge bug. We have even been able to capture the bogus packet. Unfortunately, the packet contained ALL 1's (i.e. ff ff ff ff ff ....) so there was no way to tell which "box" sent it.

I have, though, some idea as to which "boxes" are sending these packets. In 5 days, 3 of these packets have been seen on our net coming from different places on the net (i.e. different "boxes"). Looking at ruptimes and such, I noticed that a machine had just been rebooted (15 minutes after the bogus packet). A possibility is that this machine sent the bogus packet upon reboot/shutdown/crash. I chalked this up to coincidence. A few days later I saw two more bogus packets (100 seconds apart) from a DIFFERENT part of the net, a net where a "box" was having new software installed. Again I chalked this up to coincidence.

Ok, so far nothing "really" out of the ordinary. Well, it turns out that both "boxes" are IBM PC/RT's, one running AIX and the other having BSD 4.3 installed on it. I am NOT saying that the IBM PC/RT (with UB ethernet boards) is causing the problems; it is just that I find such a coincidence fairly interesting.

While watching the net, I have walked up to an IBM PC/RT and powered it off, allowing it to reboot; no bogus packets seen. I have shut the IBM down (gracefully); again no bogus packets.

Has anyone seeing these bogus packets noticed similar circumstances? Could it be the IBM PC/RT? Could it be the UB ethernet board? Could I be looking at events that are pure coincidence? Am I full of it? :-)

Comments/ideas/suggestions/flames/etc ?????

---
W. Tait Cyrus (505) 277-0806
University of New Mexico
Dept of ECE - Parallel Processing Research Group
Albuquerque, New Mexico 87131
e-mail: cyrus@pprg.unm.edu
goeran@ae.chalmers.se (Goran Bengtson) (10/27/88)
In article <23652@pprg.unm.edu>, cyrus@pprg.unm.edu (Tait Cyrus) writes:
> Ok, so far nothing "really" out of the ordinary. Well, it turns out
> that both "boxes" are IBM PC/RT's, one running AIX and the other having
> BSD 4.3 installed on it. I am NOT saying that the IBM PC/RT (with UB
> ethernet boards) is causing the problems, it is that I find it fairly
> interesting that there is such a coincidence.
>
> While watching the net, I have walked up to an IBM PC/RT and powered
> it off allowing it to reboot; no bogus packets seen. I have shut the
> IBM down (gracefully); again no bogus packets.
>
> Has anyone seeing these bogus packets noticed similar circumstances?
> Could it be the IBM PC/RT? Could it be the UB ethernet board? Could I be
> looking at events that are pure coincidence? Am I full of it? :-)

PC/RT with UB boards (at least some models) IS a source. We have confirmed that. The UB board sends a packet from its internal memory when initialized and/or started. A warm restart may give a packet with ANY content (usually part of the last packet seen before shutdown), so you can get ANY source address in that packet. A cold restart usually gives 1's or 0's (dynamic ram...).

We think that it IS possible to initialize the board without causing this problem. We have not seen it from PC/RT's running AIX, only from PC/RT's running BSD 4.3 (with or without Andrew File System). ifconfig down/up causes a random packet to be sent!

Our temporary fix was to make sure that a KNOWN packet (short, but with legal source and destination addresses) is transmitted when the board is initialized.

--
Goran Bengtson
Email: goeran@ae.chalmers.se
Dept. of Applied Electronics
Chalmers Univ. of Technology
S-412 96 Gothenburg
Sweden
Phone: +46 31 721825 (int)
       031 721825 (nat)
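For illustration, here is what a short KNOWN packet of the kind described might look like when built by hand. This is a Python sketch under stated assumptions, not the actual fix used at Chalmers: the function name is hypothetical, the frame is addressed to the sender itself, and the type field defaults to 0x9000 (the Ethernet loopback/configuration test type), all of which are choices made for this example only. The frame is padded to 60 bytes, the minimum Ethernet frame length excluding the 4-byte FCS.

```python
import struct

MIN_FRAME_LEN = 60  # minimum Ethernet frame length, excluding the 4-byte FCS


def build_known_frame(src, dst, ethertype=0x9000, payload=b""):
    """Build a minimal, well-formed frame to occupy the transmit buffer.

    src and dst are 6-byte addresses. The source must be unicast (group bit
    clear), which is exactly the property the stray frames violate.
    """
    if len(src) != 6 or len(dst) != 6:
        raise ValueError("addresses must be 6 bytes each")
    if src[0] & 0x01:
        raise ValueError("source address must be unicast")
    frame = dst + src + struct.pack("!H", ethertype) + payload
    # Zero-pad short frames up to the Ethernet minimum.
    return frame.ljust(MIN_FRAME_LEN, b"\x00")


# Hypothetical station address, used as both source and destination.
station = bytes.fromhex("08002b010203")
frame = build_known_frame(station, station)
```

The point of the design is only that whatever the board transmits at initialization has a legal source address, so a bridge that learns from it records nothing harmful.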