david@ms.uky.edu (David Herron -- Resident E-mail Hack) (02/20/88)
Well, a little bit of sleuthing uncovered the fact that we were having a broadcast storm, and didn't even know it. I guess you wouldn't call it anything worse than a light rainshower -- but anyway, it was there.

If you remember our configuration, we do have 4.3 hosts along with hosts which have 4.2 derived software. Well, turning on tcpdump and looking for arp gave me an eyeful -- a constant stream of 3b2s/3b1s and a sequent all arping for 128.163.255.255.

Do you know how boring it is to add almost the same line to 20-30 /etc/rc files? Anyway, I told 'em all to use .0.0 as the broadcast address and rebooted 'em all. That, at least, cleared up the broadcast storm.

However we are still seeing ether errors, but not nearly as badly as before. Something I noticed today is that our most error-full machine is also the one that serves most of the home directories for the people with workstations. Strange coincidence there. Fortunately we have some more uVaxen arriving to use as servers... so this load'll be spread out some.

um, one last thing. I mention this in order to find out the truth of the matter and I certainly don't want to make anybody mad ... but in an earlier posting I related a memory from a DEC salesman that Sun ethernet equipment had some sort of problem ...

We now have better word on what this problem is. The claim is that Sun ethernet drivers will shove packets out with too small of a time-gap between them. Specifically 1 micro-second, and that the 802.3 spec wants a 10 microsecond gap. How true is this?

I vaguely recall reading something along those lines recently -- it seems it was a Sun person being proud that their hardware is able to keep up a sustained rate on input AND output at the max speed allowed by the spec.

--
<---- David Herron -- The E-Mail guy <david@ms.uky.edu>
<---- or: {rutgers,uunet,cbosgd}!ukma!david, david@UKMA.BITNET
<----
<---- It takes more than a good memory to have good memories.
eshop@saturn.ucsc.edu (Jim Warner) (02/21/88)
In article <8403@g.ms.uky.edu> david@ms.uky.edu (David Herron -- Resident E-mail Hack) writes:
>
>We now have better word on what this problem is. The claim is
>that Sun ethernet drivers will shove packets out with too small
>of a time-gap between them. Specifically 1 micro-second, and that
>the 802.3 spec wants a 10 microsecond gap. How true is this?
>

The heartbeat test takes place in the interpacket gap. The window for the test is between 4 and 8 microseconds after the Sun finishes each packet. If Suns were to violate the spec and shorten the interpacket gap, the net news would be full of complaints that Suns don't work with IEEE transceivers. This is not the case.
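For reference, the "10 microsecond" figure quoted above can be checked with a little arithmetic. The sketch below assumes the standard 10 Mb/s Ethernet minimum interframe gap of 96 bit times. That number comes from the 802.3 specification, not from either poster, and the function names are made up for illustration.

```python
# Interframe gap arithmetic for 10 Mb/s Ethernet.
# Assumption: minimum gap of 96 bit times, as given in IEEE 802.3.
BIT_RATE_BPS = 10_000_000     # 10 Mb/s
INTERFRAME_GAP_BITS = 96      # minimum gap between frames, in bit times


def gap_microseconds(bits: int = INTERFRAME_GAP_BITS,
                     bit_rate: int = BIT_RATE_BPS) -> float:
    """Duration of `bits` bit times, in microseconds."""
    return bits * 1_000_000 / bit_rate


def bits_in(microseconds: float, bit_rate: int = BIT_RATE_BPS) -> float:
    """Number of bit times that fit into `microseconds`."""
    return microseconds * bit_rate / 1_000_000


if __name__ == "__main__":
    print("minimum gap:", gap_microseconds(), "microseconds")
    print("a 1 microsecond gap would be", bits_in(1.0), "bit times")
```

Under that assumption the required gap works out to 9.6 microseconds, which is presumably what the "10 microsecond" claim rounds to. A 1 microsecond gap would be about a tenth of the minimum.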
casey@lll-crg.llnl.gov (Casey Leedom) (02/23/88)
In article <8403@g.ms.uky.edu> david@ms.uky.edu (David Herron) writes:
>Well, a little bit of sleuthing uncovered the fact that we were
>having a broadcast storm, and didn't even know it. ... a constant
>stream of 3b2s/3b1s and a sequent all ARPing for 128.163.255.255.

I'm curious; I've noticed this behavior with 4.2BSD based networking implementations also. It's my understanding that the gratuitous ARP responses for packets sent to the local-network/all-ones-host-part address are an attempt to negotiate trailer encapsulation on a global basis, instead of the current 4.3BSD method, which does the trailer encapsulation negotiation when an ARP request is received.

Am I talking out my hat? For my own and others' edification would someone explain exactly why 4.2BSD based networking responds with gratuitous ARPs for packets addressed to xxx.xxx.255.255, etc.? Thanks in advance.

Casey
romkey@kaos.UUCP (John Romkey) (02/24/88)
In article <4097@lll-winken.llnl.gov> casey@lll-crg.llnl.gov.UUCP (Casey Leedom) writes:
>For my own and others' edification would someone explain exactly why
>4.2BSD based networking responds with gratuitous ARPs for packets
>addressed to xxx.xxx.255.255, etc.? Thanks in advance.

When 4.2BSD was released there was no defined standard telling how to broadcast IP datagrams. Berkeley followed an informal standard that said to set the host part (the part that's not net and not subnet) of the IP address to all 0's. So you might see 128.127.0.0 as a broadcast IP address from 4.2, and the 4.2 drivers would know not to bother ARP'ing this packet but to just send it to the ethernet broadcast address instead.

Now...later on, in an RFC whose number escapes me at the moment, it was specified that a broadcast datagram should have the host part of its address set to all 1's (128.127.255.255). Since that RFC was made a part of the TCP/IP specification, this is the correct way to do things now. And most systems released since then support that properly.

But if you have an application that uses broadcast and follows that standard of using all 1's and you backport it to 4.2BSD, then it will tell the 4.2 kernel to send packets to 128.127.255.255, and 4.2 won't know that's the IP broadcast address and will ARP it. That's why 4.2 might ARP 128.127.255.255 on its own.

4.2BSD might ARP in response to correctly formatted broadcast packets because it tries very hard to be an IP router, even if it has only one network interface. When it receives a packet that's not for it (and it won't recognize 128.127.255.255 as an IP broadcast that it should process itself) it tries to forward it. The IP routing code says "Yes, this is for my local net", so the kernel then tries to ARP 128.127.255.255...

That should only happen if you have a mix of 4.2 machines and later systems which broadcast according to spec on the same ethernet. There's a variable in the kernel which controls IP forwarding and you can use adb to turn it off, but I don't remember the name of the variable.

--
- john romkey
...harvard!spdcc!kaos!romkey
romkey@kaos.uucp
romkey@xx.lcs.mit.edu
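The two broadcast conventions described above can be written out concretely. This is only a sketch, using Python's standard ipaddress module. It assumes an unsubnetted class B network (128.163.0.0/16, the net from the first posting), and the function names are invented for illustration rather than taken from any BSD source.

```python
import ipaddress


def broadcast_forms(network: str) -> dict:
    """Return both broadcast conventions for a network."""
    net = ipaddress.ip_network(network)
    return {
        "all_zeros": str(net.network_address),    # old 4.2BSD convention
        "all_ones": str(net.broadcast_address),   # later standard form
    }


def is_broadcast_for_42(dst: str, network: str) -> bool:
    """Hypothetical model of a 4.2BSD host: only the host-part-all-zeros
    form is recognised as an IP broadcast."""
    net = ipaddress.ip_network(network)
    return ipaddress.ip_address(dst) == net.network_address


if __name__ == "__main__":
    net = "128.163.0.0/16"   # assumed: class B, no subnetting
    print(broadcast_forms(net))
    for dst in ("128.163.0.0", "128.163.255.255"):
        print(dst, "recognised by 4.2 model:", is_broadcast_for_42(dst, net))
```

In this model the all-ones address fails the broadcast test. By the explanation above, that is why a 4.2 host ends up treating it as an ordinary host address on the local net.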
casey@lll-crg.llnl.gov (Casey Leedom) (02/24/88)
I got the following reply to my question about why packets addressed to xxx.xxx.255.255 cause broadcast storms from 4.2BSD based networking implementations.

Casey

-----
Date: Tue, 23 Feb 88 07:25:13 PST
From: Jim Warner <eshop%saturn.UCSC.EDU@ucscc.UCSC.EDU>

In article <4097@lll-winken.llnl.gov> you write:
> For my own and others' edification would someone explain exactly why
> 4.2BSD based networking responds with gratuitous ARPs for packets
> addressed to xxx.xxx.255.255, etc.? Thanks in advance.

These packets were sent as ethernet broadcasts. They were received at the 4.2BSD hosts. When the IP layer opens the packet and looks at the destination address, it sees that the packet is not addressed to this host. It also does not recognize the 255.255 as being the IP broadcast address. It concludes (falsely) that there is a real host at address 255.255 which should have received this misdelivered packet.

The host would like to deliver this packet to its proper destination. To do that, the host will need the ethernet address of the destination. An Address Resolution (ARP) request is issued. But there is no host at this IP address and there is no response. The ARP request is therefore repeated by each 4.2BSD machine once for each misunderstood ethernet broadcast.

Hope that answers your question.

Jim Warner
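The receive-path reasoning in that reply can be sketched as a small model that shows how one broadcast turns into one ARP request per old host. It is a simplified illustration, not the 4.2BSD code. The function names, the host addresses and the /16 network are all assumptions.

```python
import ipaddress


def action_on_42_host(dst: str, my_addr: str, network: str) -> str:
    """Model what a 4.2BSD-style host does with a packet that arrived
    as an Ethernet broadcast (simplified, hypothetical)."""
    net = ipaddress.ip_network(network)
    d = ipaddress.ip_address(dst)
    if d == ipaddress.ip_address(my_addr):
        return "deliver"          # addressed to this host
    if d == net.network_address:
        return "deliver"          # old host-part-zero broadcast form
    if d in net:
        return "arp"              # looks like another host on my net
    return "route"                # some other network


def arp_requests_triggered(dst: str, hosts: list, network: str) -> int:
    """Count the ARP requests caused by a single broadcast packet."""
    return sum(1 for h in hosts
               if action_on_42_host(dst, h, network) == "arp")


if __name__ == "__main__":
    net = "128.163.0.0/16"                             # assumed network
    old_hosts = [f"128.163.1.{i}" for i in range(1, 31)]  # 30 made-up hosts
    for dst in ("128.163.0.0", "128.163.255.255"):
        print(dst, "->", arp_requests_triggered(dst, old_hosts, net),
              "ARP requests")
```

With 30 modelled hosts, every all-ones broadcast produces 30 ARP requests, and each of those is itself an Ethernet broadcast.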
pdb@sei.cmu.edu (Patrick Barron) (02/24/88)
In article <678@kaos.UUCP> romkey@kaos.UUCP (John Romkey) writes:
>That should only happen if you have a mix of 4.2 machines and later systems
>which broadcast according to spec on the same ethernet. There's a variable
>in the kernel which controls IP forwarding and you can use adb to turn
>it off, but I don't remember the name of the variable.

It's called, oddly enough, "ipforwarding". There's another variable called "ipprintfs" which, if set to 1 while ipforwarding is set to 1, will print a message on the console every time the machine attempts to forward a packet.

--Pat.
dudek@ubglue.ksr.com (Glen Dudek) (02/25/88)
In article <678@kaos.UUCP> romkey@kaos.UUCP (John Romkey) writes:
>
>There's a variable
>in the kernel which controls IP forwarding and you can use adb to turn
>it off, but I don't remember the name of the variable.
>

Unfortunately, if I remember my 4.2BSD ip code correctly, turning off "ipforwarding" will cause the host to send an ICMP error to the broadcasting host for each broadcast packet. A complete fix requires patching ip_forward() to free the ip packet and return without sending the ICMP error.

I did this on my pre-3.4 SunOS Suns when we brought up subnetting at Harvard - you need to patch in a jump at the beginning of ip_forward() to the location in ip_forward() which calls m_freem() and returns.

--
Glen Dudek
Kendall Square Research
Disclaimer: #include <canonical_disclaimer.h>
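To put the three kernel configurations from this part of the thread side by side, here is a rough sketch of what an old host does with a broadcast it misreads as "not for me". It follows the posts above as recalled by their authors. The enum, the function names and the frame counts are illustrative assumptions, not kernel behaviour verified against source.

```python
from enum import Enum


class Kernel(Enum):
    FORWARDING_ON = "ipforwarding = 1 (stock)"
    FORWARDING_OFF = "ipforwarding = 0"
    PATCHED = "ip_forward() patched to free the packet and return"


def reaction(kernel: Kernel) -> str:
    """What one old host emits for one misunderstood broadcast."""
    if kernel is Kernel.FORWARDING_ON:
        return "arp_request"     # tries to forward, so it ARPs
    if kernel is Kernel.FORWARDING_OFF:
        return "icmp_error"      # refuses to forward, complains to sender
    return "silent_drop"         # patched: packet freed, nothing sent


def frames_per_broadcast(kernel: Kernel, old_hosts: int) -> dict:
    """Extra frames on the wire caused by one broadcast packet."""
    r = reaction(kernel)
    return {
        "broadcast": old_hosts if r == "arp_request" else 0,
        "unicast_to_sender": old_hosts if r == "icmp_error" else 0,
    }


if __name__ == "__main__":
    for k in Kernel:
        print(k.value, "->", frames_per_broadcast(k, old_hosts=100))
```

In this model, turning forwarding off only swaps broadcast ARPs for unicast ICMP errors aimed at the sender. Only the patched case is quiet.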
ron@topaz.rutgers.edu (Ron Natalie) (02/26/88)
THESE AND OTHER PROBLEMS CAN BE SOLVED EASILY! TURN OFF IP FORWARDING ON THINGS THAT ARE NOT GATEWAYS.

If machines didn't try to forward apparently misaddressed packets (or truly broken ones misdirected to them), these cycles wouldn't occur. A machine that has one interface and is not performing some gateway function should just consider these packets an error and discard them.

Below is a sample ADB session which shows how to turn off ip forwarding on machines that you don't have source for (provided they are 4.2 like). If you have source, set the ipforwarding variable to zero.

    $ su                            <-- You need to be root
    Password:                       <--- Can't help you here :-)
    # adb -w -k /vmunix /dev/mem    <-- ADB the kernel
    sbr f0711fc slr 649             <-- Crud output by ADB
    physmem 1fe
    _ipforwarding/X                 <-- Find the current state
    _ipforwarding: 1                <-- was turned on
    _ipforwarding/W 0               <-- Not any more!
    _ipforwarding: 0x0 = 0x0
    _ipforwarding?W 0               <-- Fix it for the next reboot.
    _ipforwarding: 0x0 = 0x0
hans@umd5.umd.edu (Hans Breitenlohner) (02/26/88)
In article <678@kaos.UUCP> romkey@kaos.UUCP (John Romkey) writes:
[ he explains why some machines will ARP for addresses x.x.x.255. Then he states: ]
> ... There's a variable
>in the kernel which controls IP forwarding and you can use adb to turn
>it off, but I don't remember the name of the variable.
>--

I have no first-hand experience with this, but I have been told that you lose either way. If you turn forwarding off, then you will get ICMP unreachable messages instead of the ARPs.
kre@munnari.oz (Robert Elz) (02/28/88)
> >There's a variable in the kernel which controls IP forwarding
>
> If you turn forwarding off, then you will get ICMP
> unreachable messages instead of the ARPs.

If you don't want to, or can't, hack your IP code, then ..

There's one hack that you can do if you have a host that can publish proxy ARP's .. arrange to have the "bad" IP address (the thing with the trailing 255's) published by some ARP server, with a totally bogus ethernet address for it (anything that doesn't exist on your cable).

Hosts that know 255 is broadcast will never arp for it; others will learn the bogus address and forward future packets to that. This doesn't save any ethernet traffic, but keeps it out of the way of all the hosts that want to receive neither a hundred ARP requests nor a hundred ICMP's when 100 old 4.2 hosts receive a new broadcast.

kre
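The effect of the published bogus entry can be sketched with a toy ARP cache. This is a deliberately simplified model. It ignores cache timeouts and ARP retries, the MAC address is a made-up placeholder, and the class and method names are hypothetical. It counts only the frames sent by the old host itself.

```python
BOGUS_MAC = "02:00:00:00:00:01"   # hypothetical address nobody on the cable owns


class OldHost:
    """Toy model of a 4.2-style host that forwards misread broadcasts."""

    def __init__(self, published: dict):
        self.published = published    # IP -> MAC answered by the ARP server
        self.arp_cache = {}
        self.broadcast_frames = 0     # ARP requests, seen by every host
        self.unicast_frames = 0       # forwarded copies sent to a single MAC

    def handle_misread_broadcast(self, dst: str) -> None:
        if dst not in self.arp_cache:
            self.broadcast_frames += 1        # ARP request goes to everyone
            mac = self.published.get(dst)
            if mac is None:
                return                        # no reply, packet goes nowhere
            self.arp_cache[dst] = mac         # learn the bogus address
        self.unicast_frames += 1              # forward to the bogus MAC


if __name__ == "__main__":
    dst = "128.163.255.255"                   # assumed "bad" broadcast address
    plain = OldHost(published={})
    hacked = OldHost(published={dst: BOGUS_MAC})
    for _ in range(5):
        plain.handle_misread_broadcast(dst)
        hacked.handle_misread_broadcast(dst)
    print("no proxy entry:", plain.broadcast_frames, "broadcasts,",
          plain.unicast_frames, "unicasts")
    print("bogus entry   :", hacked.broadcast_frames, "broadcasts,",
          hacked.unicast_frames, "unicasts")
```

In this model the traffic does not go away, which matches the remark above. After the first ARP exchange it becomes unicast frames to an address no host listens on, so other machines no longer have to process it.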