[comp.unix.wizards] arp strangeness

ted@nmsu.edu (02/04/89)

This may or may not belong on this group, but it does strongly affect
most unix systems (esp those on the internet).

The basic question is, on a class B network (128.123.x.x), what SHOULD
happen when some host pings 128.123.255.255 (local broadcast)?
Obviously, ping may or may not realize that this address is a
broadcast address.  If it does not, then it will originate an arp
request for the broadcast address, which is the source of the real rub.
What should host x do when it receives an arp request for broadcast?

As a hint of what perhaps should not happen, in our situation, the
final result is that several Ungermann Bass terminal servers decide
that they are the broadcast address in question and they return an arp
response with their own ethernet address equated with the ip
broadcast address.  It is very hard to follow exactly what is cause
and what is effect in the amazing storm that succeeds the anomalous
request and so it is hard to determine exactly what is happening and
why. 

hedrick@geneva.rutgers.edu (Charles Hedrick) (02/05/89)

My first reaction to your broadcast storm problem is that all of
your implementations are at fault.  

1) all of your hosts should recognize 128.123.255.255 as a 
broadcast address.  Even hosts that know about a subnet
mask should still recognize the whole-net broadcast address.

2) the problem happens because machines are trying to 
forward the packet.  Only gateways should forward packets.
You should turn off ip_forwarding in all other machines, and
make darn sure your gateways are careful about what they do.

3) machines should never ARP for a broadcast address.  I'd
be inclined to make this a separate test in the arp code,
so that no matter what configuration errors occurred elsewhere,
this can't happen.  The test should check for all known
forms of broadcast, including the old ones with 0's, and
should check both subnet and whole net forms.  Also all 1's
and all 0's.

4) machines should never respond to an ARP request for a
broadcast address. See (3).  
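
The test in (3) and (4) might look roughly like the following Python
sketch.  The function names (classful_mask, is_broadcast) and the
example host address and subnet mask are made up for illustration;
real ARP code lives in the kernel and would be written in C.  The
sketch only knows the host's own subnet, so broadcast forms belonging
to other subnets of the same network are not caught.

```python
import ipaddress


def classful_mask(ip: int) -> int:
    """Return the natural class A/B/C mask for a 32-bit address."""
    top = ip >> 24
    if top < 128:
        return 0xFF000000  # class A
    if top < 192:
        return 0xFFFF0000  # class B
    return 0xFFFFFF00      # class C (class D/E are ignored in this sketch)


def is_broadcast(target: str, my_addr: str, subnet_mask: str) -> bool:
    """True if target is any broadcast form this host can recognize.

    Covers all 1's, all 0's, and the 1's and 0's forms of both the
    whole-net and the local subnet broadcast address.
    """
    t = int(ipaddress.IPv4Address(target))
    me = int(ipaddress.IPv4Address(my_addr))
    sub = int(ipaddress.IPv4Address(subnet_mask))

    if t in (0x00000000, 0xFFFFFFFF):
        return True
    for mask in (classful_mask(me), sub):
        net = me & mask
        host_bits = ~mask & 0xFFFFFFFF
        # 0's form is the bare network number, 1's form has all host bits set
        if t == net or t == (net | host_bits):
            return True
    return False


# Hypothetical host 128.123.7.9 on a subnetted class B network.
for addr in ("128.123.255.255", "128.123.0.0", "128.123.7.255", "128.123.7.10"):
    print(addr, is_broadcast(addr, "128.123.7.9", "255.255.255.0"))
```

The same check would be applied in both directions: before sending an
ARP request for an address, and before answering one.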

It sounds like you have a real zoo of IP implementations.  What
we do on networks like that is

(1) turn off forwarding on every machine where we can.

(2) pick a broadcast address that everybody knows.  Often this
turns out to be the whole-net broadcast address using 0's
instead of 1's.  It turns out that implementations that
understand subnets normally also recognize the whole-net
addresses, and implementations that understand 1's normally
also understand 0's.  However we have networks where there
is no broadcast address that every host treats correctly.

(3) Sometimes it helps to set up hosts specifically to respond to ARP
requests for specific broadcast addresses.  On many systems you can
manually add entries to the ARP table and flag them so the machine
will respond for them, e.g. the "pub" option on the "arp" command in
SunOS 4.0 (presumably from 4.3 BSD).  The idea is that you respond to
ARP requests for x.y.255.255 and give them a bogus Ethernet address
that points nowhere.  This will quiet machines that insist on ARPing
for the broadcast address.  However if your U-B junk already responds
with its own address, this probably won't do you any good.
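
To make the idea concrete, here is a small Python model of how a
host with a published entry decides what to answer.  It is only a
sketch: the table, the function name arp_reply_for, and both Ethernet
addresses are invented for illustration, and it does not show the
actual "arp" command syntax on any system.

```python
from typing import Dict, Optional

# Hypothetical published-entry table: IP address -> Ethernet address to
# answer with.  The address below is a made-up placeholder that is
# assumed not to belong to any machine on the wire.
PUBLISHED: Dict[str, str] = {"128.123.255.255": "02:00:00:00:00:01"}


def arp_reply_for(target_ip: str, my_ip: str, my_mac: str,
                  published: Dict[str, str]) -> Optional[str]:
    """Return the Ethernet address to put in an ARP reply, or None.

    A host answers for its own IP address, and also for any address
    it has been told to publish.  Anything else gets no reply.
    """
    if target_ip == my_ip:
        return my_mac
    return published.get(target_ip)


# Hypothetical host 128.123.7.9 with a made-up Ethernet address.
print(arp_reply_for("128.123.255.255", "128.123.7.9",
                    "08:00:20:01:02:03", PUBLISHED))
print(arp_reply_for("128.123.7.10", "128.123.7.9",
                    "08:00:20:01:02:03", PUBLISHED))
```

The first request gets the bogus address, so the asking machine sends
its packet to nobody; the second gets no reply at all.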

There have been systems with some pretty amazing bugs, like those that
send ICMP messages with source and destination addresses inverted.
There have been combinations of bugs that resulted in ARP entries
which have the Ethernet broadcast address in them.  In one case we've
heard of, the victim site had to send all their employees home and
shut down every machine on the network at once.