[mod.protocols.tcp-ip] danger of bridges

ddp#@ANDREW.CMU.EDU.UUCP (03/25/87)

I've been hearing a lot about people creating large networks using level 2
bridges (e.g. the DEC LANBridge).  People are talking about connecting 1000's
of hosts to Ethernets joined by them.  In Monterey I even heard about a
three-university consortium planning on using them to connect all their nets
together!  This is extremely dangerous!  It really scares me.  DEC, IBM and
other companies promoting these boxes are being incredibly shortsighted and
are leading their customers down a dead-end road!

These boxes are just great for small networks and for connecting multiple
nets together where repeaters won't work, but for large nets (more than a few
hundred hosts) they are not efficient.  The reason is the broadcasts and
multicasts which are passed through the boxes, as they must be.  For example,
ARP request broadcasts are passed through all bridges on the network so that
they reach all hosts on all connected nets.  If you have 1000's of hosts on
your network that tend to talk to a large number of other hosts, you wind up
with an incredible amount of arp traffic.  For example, the CMU network is
composed of >2000 hosts and >50 networks.  Some of these nets are connected
using LANBridges, but most of them are connected via CMU routers (gateways)
which operate on a scheme similar to the extended arp black boxes proposed by
Jon Postel in RFC 925 (although we had it first :-)).  This scheme
effectively operates as a level 2 bridge system for ARP packets but as a
level 3 gateway for IP packets.  I.e. routing is done via arp, sort of like
as in "promiscuous arp" or the "arp hack".  I say similar because we've put a
lot of additional work into this scheme in order to suppress the number of
arps.  According to our statistics, we do confine a significant amount of arp
traffic to a single network rather than forwarding it through all connected
nets.
However, we still have an average rate of 20 arp's per second on all nets in
the system!  Yes, I typed that right, twenty.  And of course every time
someone's program goes crazy you wind up with even higher rates.  Once a
student hacking on a UNIX system wrote a program to send a UDP datagram to
every host in the host table (since only setuid programs can send broadcasts
in 4.2).  It was truly amazing seeing 100 arp's/sec...  That's the price paid
for not having subnets and level 3 routing with IP.  We are definitely not
going to reach our goal of 7000 hosts this way...
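
A rough sketch of why the 7000-host goal looks out of reach.  It assumes,
purely for illustration, that arp traffic grows linearly with the number of
hosts; the function name and the linear model are assumptions, and 2000 is
only the lower bound quoted above for the current host count.

```python
def projected_arp_rate(observed_rate, observed_hosts, target_hosts):
    """Scale an observed arp rate to a larger network.

    Assumes (hypothetically) that the rate is proportional to host count.
    Hosts that each talk to many other hosts could well do worse than this.
    """
    return observed_rate * target_hosts / observed_hosts


# 20 arp's/sec observed with roughly 2000 hosts; goal is 7000 hosts.
rate = projected_arp_rate(20.0, 2000, 7000)
print(f"about {rate:.0f} arp's/sec seen by every host")
```

Under that assumption every host on the bridged system would have to look at
several times today's broadcast load, before counting any runaway programs.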

And then there's DECnet.  I won't claim to be a DECnet expert, but from my
observations it appears to me that all Phase IV DECnet hosts connected to an
ethernet transmit HELLO multicast messages every 15 seconds.  These of course
all pass through the bridge or else intra-area routing wouldn't work.  We
have somewhere around 100 DECnet hosts connected to our backbone ethernet
system.  Dividing these two numbers I expect to see about 6 HELLO's a second
on the net.  Using PCIP NETWATCH I indeed measured 5 per second.  Of course,
this is with only 100 hosts.  Doing the same calculation with 1000 hosts one
would see 66 HELLO's/sec.  2000 hosts would yield 133/sec, 4000 hosts would
give 266/sec.  Can you imagine EVERY DECnet machine on a network processing
266 routing packets/sec?  I sure wouldn't want to try to get work done on
such a machine.
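
The arithmetic above can be reproduced with a short sketch.  It assumes every
host sends exactly one HELLO per timer interval and that all of them are seen
on the bridged net; the function and constant names are made up for the
example.

```python
HELLO_INTERVAL_SECONDS = 15  # hello timer value quoted in the text


def hello_rate(hosts, interval=HELLO_INTERVAL_SECONDS):
    """HELLOs per second on the net if each host sends one per interval."""
    return hosts / interval


for hosts in (100, 1000, 2000, 4000):
    print(f"{hosts:5d} hosts -> {hello_rate(hosts):6.1f} HELLOs/sec")
```

The figures in the text are these values rounded down.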

To summarize, level 2 bridges are very useful, but you have to realize that
they are not the perfect solution.  You have to keep their limitations in mind.
There are very good reasons for having level 3 routing.

Drew

ddp#@ANDREW.CMU.EDU.UUCP (03/31/87)

>I should hope that the situation is not this bad, especially since x3s33
>ES-IS (host-gateway) routing protocol uses a similar HELLO scheme.  Although
>I am also not a DECNET expert (or even novice, for that matter), I would
>be EXTREMELY surprised if the HELLO timer was not settable, meaning that
>for crowded networks, it could and I assume normally would be set
>to greater than 15 seconds.

I've asked a local DECnet wizard who tells me that the timer is indeed
settable (almost everything is in DECnet).  Of course the problem is that it
must be set on every host and we don't have direct control over them all...


>Second, host HELLO's typically go to gateways,
>not other hosts, so every host doesn't need to process every host
>HELLO, just gateway HELLO's.  There should be much fewer gateways than
>hosts.

In DECnet at least, the HELLO is just a multicast and therefore goes to all
hosts receiving on that multicast address.


Drew

GROSSMAN@SIERRA.STANFORD.EDU.UUCP (04/01/87)

Regarding DECnet HELLOs:

1) You are correct regarding the settability of the DECnet hello timers.
   These are 15 seconds by default, but can be set to any value, since
   each host sends its timer value out over the network.
2) The hellos are sent to a multicast address that is only enabled by
   DECnet 'routers'.  Non-routers (known as 'endnodes') do not receive
   these messages.

Yes, these messages do eat up some Ethernet bandwidth, but intelligent
controllers pitch them if the host isn't interested.  I think the real
issue here is "how often do I have to wait to acquire the
Ethernet?"  DEC controllers keep track of this information in a counter
known as "Deferred transmits".  My experience with a cable segment with
over 400 DECnet nodes, a bunch of LAT boxes and some TCP/IP traffic is
that this counter was quite low (well under 1% of all the transmits from
the node in question**).

It would be really interesting to find out the values of things like
deferred transmits, single collision transmits, and multi-collision
transmits vs. total transmits for various host and gateway interfaces.

			Stu Grossman

** Actual mileage may vary.
-------

ddp#@ANDREW.CMU.EDU.UUCP (04/01/87)

Well that is certainly good news, but I have a question...  If DECnet
endnodes don't enable listening to Hello's, then how do two endnodes
communicate on a network consisting only of themselves?  Seems like they'd
have to be listening on some sort of routing multicast address.

Drew

GROSSMAN@SIERRA.STANFORD.EDU.UUCP (04/01/87)

Good guesswork!  Endnodes do indeed listen to a special multicast address.

A brief explanation:  When a DECnet endnode wants to send a datagram, it
sends it to a DECnet node known as the 'Designated Router'.  The
designated router is known by the endnode because he periodically emits a
special hello message to say who he is.  Now, there's a little bit more glue
and filler to deal with getting the endnode to use a more direct route if
possible (such as if the dest node is on the same ethernet), and caching
of the next hop (ie: best gateway) to get to a specified node.

Just to throw another monkey into the wrenchworks, there is the added
feature that the designated router can move around!  There's a little bit
of protocol and stuff that arbitrates who gets the honor, but there's no
special setup needed to establish the DR.  Oh yeah, just one more thing:
in case there is NO DR at all, the endnode can still talk to other
systems on the same wire.  It just computes the Ethernet address from
the DECnet address, and sends the datagram!
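
The decision order described above might be sketched roughly as follows.
This is an illustration only, not DEC's implementation: the function name,
the plain dict used as the next-hop cache, and the string node names are all
hypothetical.

```python
def choose_next_hop(dest, next_hop_cache, designated_router):
    """Return (how, where) for a datagram addressed to DECnet node `dest`.

    next_hop_cache:    dict of dest -> cached next hop (hypothetical)
    designated_router: current DR, or None if no DR has been heard from
    """
    if dest in next_hop_cache:
        # A more direct route was learned earlier; use it.
        return ("cached", next_hop_cache[dest])
    if designated_router is not None:
        # Normal case: hand the datagram to the designated router.
        return ("via designated router", designated_router)
    # No DR at all: send straight to the Ethernet address computed
    # from the destination's DECnet address.
    return ("direct", dest)


cache = {"1.5": "1.5"}
print(choose_next_hop("1.5", cache, "1.1"))
print(choose_next_hop("2.7", cache, "1.1"))
print(choose_next_hop("1.9", {}, None))
```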

	Stu Grossman
-------

GROSSMAN@SIERRA.STANFORD.EDU.UUCP (04/02/87)

In the absence of a Designated Router, DECnet endnodes (on an Ethernet)
will attempt to communicate by sending to the destination node's Ethernet
address.  This Ethernet address is computed from the node's DECnet host
number.  This is why all DECnet hosts on an Ethernet have funny addresses.

A DECnet Ethernet address looks like AA-00-04-00-XX-YY, where XX-YY is the
DECnet host number with the bytes swapped (don't shoot me, I'm just the
piano player).
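
A minimal sketch of that mapping.  It assumes the usual Phase IV layout, in
which the 16-bit host number is a 6-bit area and a 10-bit node number packed
as area*1024 + node; that layout and the function name are not stated in the
text above and are assumptions of the example.

```python
def decnet_to_ethernet(area, node):
    """Build the AA-00-04-00-XX-YY address for DECnet node area.node."""
    if not (1 <= area <= 63 and 1 <= node <= 1023):
        raise ValueError("area must be 1..63 and node 1..1023")
    address = (area << 10) | node              # 16-bit DECnet host number
    low, high = address & 0xFF, address >> 8   # low byte goes first (swapped)
    return f"AA-00-04-00-{low:02X}-{high:02X}"


print(decnet_to_ethernet(1, 1))  # AA-00-04-00-01-04
```

Node 1.1 is host number 1025 (hex 0401), so the swapped bytes come out as
01-04.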
			Stu Grossman
-------

Mills@UDEL.EDU.UUCP (04/03/87)

Drew,

The fuzzies, as you know, use broadcast, rather than multicast. There is in
principle no problem using restricted multicast, but then this also applies
to RIP and others. Do you have a concrete suggestion on how to manage the
multicast group?

Dave