[mod.protocols.tcp-ip] Mysterious ARP behavior on a tcp-ip ethernet

mogul@NAVAJO.STANFORD.EDU.UUCP (08/01/86)

[To summarize bjp's problem: he's got hosts sending out ARPs asking for the
 hardware address of IP host 192.12.120.255, and he wants to know why a host
 would ARP for what is supposed to be a broadcast address.]

Your problem is that you have two groups of hosts with different ideas about
what a broadcast address is.  One group follows RFC919 and uses "all ones"
for broadcast; these are the "good hosts".  The other group uses the "all
zeros" address that Berkeley put into 4.2BSD; these are the "bad hosts".
[Note that 4.3BSD is "good".]  "Good hosts" know that there are bad hosts
out there and recognize, but never transmit, all-zeros broadcasts; that is
why they are not trying to ARP the 192.12.120.0 address that the bad hosts
are probably broadcasting to.
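
To make the two conventions concrete, here is a minimal sketch in Python. The
function names are made up for illustration, and writing the class C network
as 192.12.120.0/24 is an assumption about notation, not something from the
original report. It covers only the directed broadcast for one network, not
every form RFC919 describes.

```python
import ipaddress

def broadcast_forms(network):
    """Return (all_ones, all_zeros) broadcast addresses for a network."""
    net = ipaddress.ip_network(network)
    # all ones in the host field (RFC919) and all zeros (4.2BSD)
    return net.broadcast_address, net.network_address

def good_host_sees_broadcast(dest, network):
    """A "good" host recognizes both forms (but only transmits all-ones)."""
    return ipaddress.ip_address(dest) in broadcast_forms(network)

def bad_host_sees_broadcast(dest, network):
    """A "bad" host recognizes only the all-zeros form."""
    _, all_zeros = broadcast_forms(network)
    return ipaddress.ip_address(dest) == all_zeros
```

Under these assumptions a bad host does not treat 192.12.120.255 as a
broadcast, while a good host accepts both 192.12.120.255 and 192.12.120.0.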

So, what you should do is determine which of your hosts are "bad" (any host
that sends an ARP request for 192.12.120.255 is bad; you might have others.)
Then, call up the manufacturers and demand that they fix their code to
conform to RFC919.  If you are running 4.2BSD on a Vax, switch to 4.3BSD.

Whatever you do, NEVER EVER modify ARP code to respond to requests
for what turns out to be a broadcast address.  Ever.  Never, ever.
Not "hardly ever"; never.
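
As a rough illustration of that rule (not code from any real ARP
implementation), a responder's check might look like the sketch below. The
function name and the /24 notation are assumptions made for the example.

```python
import ipaddress

LIMITED_BROADCAST = ipaddress.ip_address("255.255.255.255")

def should_answer_arp(target_ip, my_ip, network):
    """Decide whether to reply to an ARP request for target_ip."""
    net = ipaddress.ip_network(network)
    target = ipaddress.ip_address(target_ip)
    # Refuse any form of broadcast address first, old or new convention,
    # even if this host has somehow been configured with that address.
    if target in (net.broadcast_address, net.network_address, LIMITED_BROADCAST):
        return False
    # Otherwise answer only for our own address.
    return target == ipaddress.ip_address(my_ip)
```

The broadcast test comes first on purpose, so that no configuration mistake
can ever make the host claim a broadcast address as its own.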

bjp@MITRE-BEDFORD.ARPA.UUCP (08/01/86)

	I am looking to see if anyone out there can give me some information
on what might be going on with our network.  We have a 500 meter ethernet
cable hooking together several sun workstations, a pc, a couple of Celerities,
random other machines, and an Applitek bridge that gets us to a broadband
cable with much else on it.  TCP/IP are the networking protocols used and arp
is used for address translation of IP internet addresses to 48 bit ethernet
addresses.  Some folks noticed bursts of ethernet broadcast messages
received by an IBM PC that occurred at intervals of sometimes 15 seconds,
sometimes 1 minute apart.

	I took a nutcracker and examined the traffic and took samples of the
traffic including bursts of broadcast packets.  I captured 128 octet slices
of each packet in the traffic sample.  I disassembled the hex codes to
identify MAC frame fields and their contents, including the data field where
I found either ip header info, or arp header info.
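
	The same kind of hand disassembly can be sketched in Python.  This is
only an illustration of the standard Ethernet, ARP, and IP header layouts,
not the tool used here; the function names are invented, and it assumes an
Ethernet II frame with no trailer tricks and an IP header starting right
after the 14-octet MAC header.

```python
import socket
import struct

ETHERTYPE_IP = 0x0800
ETHERTYPE_ARP = 0x0806

def mac_str(raw):
    """Format a 6-octet hardware address as colon-separated hex."""
    return ":".join("%02x" % b for b in raw)

def decode_frame(frame):
    """Decode the MAC header and the ARP or IP addresses of one captured frame."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    info = {"dst": mac_str(dst), "src": mac_str(src), "ethertype": ethertype}
    payload = frame[14:]
    if ethertype == ETHERTYPE_ARP:
        # hardware type, protocol type, address lengths, opcode, then
        # sender/target hardware and protocol addresses (28 octets total)
        (_htype, _ptype, _hlen, _plen, oper,
         _sha, spa, _tha, tpa) = struct.unpack("!HHBBH6s4s6s4s", payload[:28])
        info.update(kind="arp", oper=oper,  # oper 1 = request, 2 = reply
                    sender_ip=socket.inet_ntoa(spa),
                    target_ip=socket.inet_ntoa(tpa))
    elif ethertype == ETHERTYPE_IP:
        # source and destination addresses sit at octets 12-19 of the IP header
        info.update(kind="ip",
                    src_ip=socket.inet_ntoa(payload[12:16]),
                    dst_ip=socket.inet_ntoa(payload[16:20]))
    else:
        info.update(kind="other")
    return info
```

	A 128 octet slice is more than enough for this, since the MAC header
plus an ARP packet is only 42 octets.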

	Here is what I found.  There were about 30 packets in each burst.
Each was an arp request packet sent by a particular host looking for the
ethernet address for 192.12.120.255 (255 is a reserved value that, when used
in the host field, means all hosts on 192.12.120, which is our network,
mitre-b-net).  This looked absurd - arp broadcasting to seek the ethernet
address of what looked to me like an Internet style broadcast address
for our network.  Without fail this burst of arp mischief was preceded
with an ethernet broadcast packet with an ip packet in its data field whose
source address was either one of two guilty hosts and whose destination
address was 192.12.120.255.  One of the hosts is our gateway to the
arpanet, milnet and many other wonderful places in the world.

	The plot thickens.  I examined the translation tables on several hosts
and found the internet address 192.12.120.255 with a big ? where an ethernet
address would have been if arp had a sensible internet address for a specific
target host to work with.

	Does anyone know why IP would do such a thing?  Is this how IP
forwards? If this is legitimate forwarding then why do arps do silly
things with it?

					bj Pease

HEDRICK@RED.RUTGERS.EDU.UUCP (08/01/86)

It is fairly common to get large-scale ARP'ing when there are confusions
on your network about hosts.  In general, I agree with Jeff Mogul.
However this is common enough that I'd like to add a bit of detail.  I
also have some suggestions to allow you to improve things without
getting every vendor to fix their implementation immediately.

First, let's get clear about what is happening.  One of your hosts is
sending a broadcast.  Since your network number is apparently
192.12.120, it is using 192.12.120.255.  This is the new standard: To
make a broadcast address, stick all ones in the host field.  (255 is all
ones.) Unfortunately, you have hosts on your network that believe that a
broadcast address should be made by sticking a zero in the host field,
i.e. 192.12.120.0.  When they see the packet addressed to
192.12.120.255, they think you are trying to address some specific host
with that address.  Since they are not that host, they decide to be nice
and forward the packet to the host.  In order to forward the packet,
they need the Ethernet address of 192.12.120.255.  Thus they issue an
ARP request for it.  The entry in the ARP table marked with ? means that
they have issued an ARP request but not yet gotten a response. 
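
The chain of events can be modeled with a small Python sketch.  This is a
toy model of the behavior described above, not 4.2BSD code: the class name,
the host address 192.12.120.7, and the /24 notation are all assumptions made
for illustration.

```python
import ipaddress

NETWORK = ipaddress.ip_network("192.12.120.0/24")

class OldConventionHost:
    """Toy host that only recognizes the all-zeros broadcast and forwards strays."""

    def __init__(self, my_ip):
        self.my_ip = ipaddress.ip_address(my_ip)
        self.arp_table = {}      # IP -> Ethernet address, None while unresolved
        self.arp_requests = []   # targets of the ARP requests we broadcast

    def receive(self, dest_ip):
        dest = ipaddress.ip_address(dest_ip)
        # Addressed to us, or to what this host thinks is the broadcast address.
        if dest == self.my_ip or dest == NETWORK.network_address:
            return "deliver"
        # Anything else is assumed to be for some other host: try to forward it.
        if self.arp_table.get(dest) is None:
            self.arp_table[dest] = None      # incomplete entry
            self.arp_requests.append(dest)   # broadcast an ARP request
            return "arp"
        return "forward"

    def show_arp_table(self):
        return ["%s at %s" % (ip, mac or "?") for ip, mac in self.arp_table.items()]
```

Feeding this model a packet for 192.12.120.255 yields an ARP request and
leaves an entry with a ? in the table.  Since no host ever answers for that
address, the entry never completes, and every later broadcast triggers
another ARP request.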

In my opinion, the simplest solution is to continue using the old
convention on all machines until the new convention has been implemented
everywhere.  As Jeff points out, new implementations are being designed
to accept either.  Most new implementations allow you to set the
broadcast address that they will emit. So my suggestion is that you try
to get up to date implementations as soon as possible, but until you do,
set all the new implementations to use 192.12.120.0 as their broadcast
address.  Once everything is updated, then change over to using the new
convention on all your machines.

There is one more point.  It is really a mistake for normal machines to
forward packets that are intended for other machines.  This is happening
because almost every IP implementation has the potential for acting as a
gateway between two networks.  Such a gateway is expected to forward
packets from one network to the other.  Most of the code simply checks
to see whether the packet is for it, and if not, sends it on to the
destination.  This isn't a terribly clever approach, and real gateways
are generally more careful.  But when your machine isn't acting as a
gateway, forwarding packets at all is a bad idea.  It can gain nothing,
and when confusion occurs, things like you observed happen.  Most
implementations allow you to disable this forwarding.  In 4.2, you
simply set a variable ipforwarding to 0.  This can be done in adb even
if you don't have source.   Even this isn't ideal, because your machine
will send an error message back to the source of the packet telling it
that the packet couldn't be delivered.  We disable forwarding more
drastically. In routine ip_forward, we simply discard the packet.
(There is a test at the beginning of the routine checking whether the
packet is a broadcast.  If so, it is thrown away.  Just make that test
always succeed.)  But even if you don't do this, turning off
ipforwarding will be a big help.  Every time the system sees a stray
packet, it will send back an error to the sender, rather than issuing an
ARP.  An ARP is a broadcast, so it clogs up everybody. The error will
just clog up the guy who is causing the problem.
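
A back-of-the-envelope comparison of the three behaviors, as a Python
sketch.  The policy names and the host counts in the usage note are invented
for illustration, and the model counts only the frames that hosts have to
receive in reaction to one stray packet.

```python
def frames_received(n_hosts, n_misbehaving, policy):
    """Frames that hosts must receive in reaction to ONE stray packet.

    n_hosts       -- hosts on the (bridged) Ethernet
    n_misbehaving -- hosts that do not recognize the packet as a broadcast
    policy        -- what those hosts do with the stray packet
    """
    if policy == "forward":
        # Each one broadcasts an ARP request, seen by every other host.
        return n_misbehaving * (n_hosts - 1)
    if policy == "icmp_error":
        # ipforwarding set to 0: each one sends a single unicast error
        # back to the sender of the packet.
        return n_misbehaving
    if policy == "discard":
        # The modified ip_forward: the packet is silently dropped.
        return 0
    raise ValueError("unknown policy: %r" % (policy,))
```

With, say, 40 hosts of which 30 misbehave, forwarding costs 30 * 39 = 1170
received frames per stray packet, the error reply costs 30 (all landing on
the one sender), and discarding costs none.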

Note, by the way, that the Applitek bridge in effect turns the whole
set of Ethernets that it connects into a single logical Ethernet. 
Depending upon how you have allocated Internet addresses, you may have
to make sure that every host on every Ethernet connected via the
Applitek system is doing things consistently.
-------