[mod.protocols.tcp-ip] Response to anti-bridge comments

karn@FLASH.BELLCORE.COM.UUCP (03/26/87)

As one who has just helped construct a "large" bridged network, I think a
few comments based on actual experience might be useful.

First, a description. Bellcore has five major locations in north central New
Jersey.  We lease T1 circuits organized as a star with the hub at
Piscataway, the geographically central location.  These circuits are divided
down with synchronous multiplexors into various sized channels for things
like Micom terminal switch trunks and IBM RJE links.  At the moment, 256
kbps of this capacity connects a set of five Vitalink Translan IIIs as a
star with the hub at Piscataway.  Each of these boxes also connects to the
building-wide backbone Ethernet at its location, thus bridging the locations
together at the link level. Within each location almost all of our Ethernets
are bridged with DEC LanBridge 100s, with the fiber version interconnecting
multiple buildings at a location.  At last count, the routing tables on the
Translans showed something like 600 active hosts. Virtually all of these
hosts speak DoD IP; most are 4.2BSD derivatives. A few Symbolics machines
speak Chaosnet.  As far as I'm aware we have no DECNET or XNS traffic, other
than that spoken by the Translans and LanBridges themselves.

And it all works, and works well!  I hardly ever look at the boxes anymore.
We had one infant mortality after we installed them in late December: a
power supply died after 24 hours in operation. Vitalink immediately shipped
out a replacement which arrived a day later, and the boxes have all been
solid since.  Two other outages were due to people kicking out Ethernet
cables, but this is a generic Ethernet problem and isn't Vitalink's fault (I
do hope they'll come out with screw-on connectors, though).  With our switch
to bridging, the reliability and availability of intra-company networking
has improved enormously over what it was when we used general purpose UNIX
machines (VAXes and Sun fileservers) as IP gateways.  True, it's a bit
unfair to compare standalone boxes with general purpose systems with disks
that must also do other things.  But there were enough RIP/routed screwups
that I once seriously considered running everything with static routing.
Even now our remaining IP routers get screwed up occasionally and they have
to be restarted.  But at least when this happens it doesn't affect our
intra-company communications, which are most important.  And nobody has to
renumber their hosts when they move from one physical cable to another,
which is an ENORMOUS practical advantage in a place as big as this one.

All this is not to say we haven't had our problems.  I do monitor the ARP
broadcast traffic from time to time.  We generally see 1-2 per second, which
is expected and entirely acceptable. If you see 20 per second, then you've
got something wrong somewhere.  I've found that bursts of ARP requests are
usually caused by hosts that respond inappropriately to UDP broadcasts to
bogus IP addresses. The trigger is generally a MicroVAX, since Ultrix 1.2
allows you to set the net mask and the broadcast address, thereby allowing
you to get it wrong.  (I just can't wait until Sun also supports subnetting
and broadcast address selection).  Although the problem clearly gets worse
as you build larger bridged networks, YOU CAN'T BLAME IT ON BRIDGING!!!  If
there weren't so many broken TCP/IP implementations out there the problem
wouldn't exist in the first place.  Nevertheless, my usual tactic has been
to place an entry in the appropriate Translan to isolate the offending host
until the user can fix it; this "twit listing" feature is very helpful.
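The filter-or-forward decision with a "twit list" can be sketched as follows (a hypothetical illustration, not Vitalink's actual code; the MAC addresses and helper names are invented):

```python
# Toy bridge forwarding check with a "twit list": frames from listed
# source addresses are dropped, isolating a misbehaving host until it
# can be fixed.  Addresses below are invented examples.
TWIT_LIST = {"08:00:2b:aa:bb:cc"}

def bridge_should_forward(src_mac, dst_mac, known_local):
    """Decide whether to forward a frame across the bridge.

    known_local: set of addresses the bridge has learned on this side.
    """
    if src_mac in TWIT_LIST:
        return False        # isolate the offending host
    if dst_mac in known_local:
        return False        # destination is local; filter, don't forward
    return True             # unknown or remote destination: forward

# The twit-listed host's broadcasts stay on its own cable.
print(bridge_should_forward("08:00:2b:aa:bb:cc", "ff:ff:ff:ff:ff:ff", set()))  # False
# A well-behaved host's traffic crosses the bridge normally.
print(bridge_should_forward("08:00:2b:00:00:01", "ff:ff:ff:ff:ff:ff", set()))  # True
```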

You discover other interesting things when you build a large bridged
network.  For example:

1. It seems that every CCI Power 6 machine as shipped comes with the same
Ethernet address. We didn't notice until we started bridging networks
together, but you can't exactly blame it on the use of bridging.

2. Some older 4.2 machines seem to respond inappropriately with wrong
answers to RARPs from Sun workstations, keeping them from booting.

3. We made an aggressive effort to turn off rwho daemons, bringing UDP
broadcasting to an acceptable level. (Many people find this necessary even
when bridges aren't used). With fewer IP gateways, the amount of RIP traffic
has stayed fairly modest.

4. Pyramids seem to respond to every ARP request they hear, regardless of
whether they were the target or not. Fortunately they respond with correct
(but irrelevant) information, so this is just a minor annoyance.

You can just as easily have antisocial machines with these problems on the
same physical cable; the solution is to FIX them, not throw up your hands
and say "bridges are terrible" because they force you to confront the
software vendors.

Overall, our experience with bridging has been quite positive.  There are
some valid arguments against large-scale bridging, but they have to do more
with vulnerability to spoofing than with any inherent technical weaknesses
in a "friendly" environment such as ours.  Even in a heterogeneous
environment, though, Vitalink boxes are useful as simple, fast packet
switches because they can be configured to filter out broadcasts and to use
static routing tables. I understand that NASA Ames uses them in this way.

I'm a big believer in TCP/IP.  IP does the job of interconnecting dissimilar
networks so well that some people forget that there are easier ways to
connect networks of the same type.  The Internet has grown so large that the
job needs to be broken down hierarchically into more manageable pieces; you
can't (and shouldn't try to) do EVERYTHING with IP gateways.

Phil

kincl%hplnmk@HPLABS.HP.COM.UUCP (03/30/87)

   At the moment, 256
   kbps of this capacity connects a set of five Vitalink Translan IIIs as a
   star with the hub at Piscataway.  Each of these boxes also connects to the
   building-wide backbone Ethernet at its location, thus bridging the locations
   together at the link level. Within each location almost all of our Ethernets
   are bridged with DEC LanBridge 100s, with the fiber version interconnecting
   multiple buildings at a location.  At last count, the routing tables on the
   Translans showed something like 600 active hosts.

Question:  How well would all this work if you decided that you need
redundant routes or if you decided that you do not want a star topology
but have an arbitrary topology?  Can you have a configuration like the
following:

        ---------------Ethernet-----------------
               |                       |
               |                       |
             Bridge                  Bridge
               |                       |
               |                       |
        ---------------Ethernet-----------------
               |                       |
               |                       |
             Bridge                  Bridge
               |                       |
               |                       |
        ---------------Ethernet-----------------


We currently have about 40-50 Ethernets connected with about 20 IP
gateway boxes.  The gateways are connected via Ethernet, Broadband, 56Kb
land lines, 9.6Kb land lines, and soon T1 and 64Kb satellite links.
The gateways we use (cisco Systems) are both fast (we have measured
them at 1000 large packets per second between Ethernets) as well as
have reasonable routing algorithms (though they will speak RIP they
talk amongst themselves with IGRP).  We have not seen any problems with
the routing (or anything else related to the gateways).

You are right---it is unfair to compare standalone level 2 bridges with
general purpose systems (especially when they are running RIP).

-Norm Kincl
 HP Labs

karn@FLASH.BELLCORE.COM.UUCP (03/30/87)

DEC LanBridge 100's have a loop detection algorithm.  When a bridge is
turned on, it sends out test packets on each interface and checks if they
come back in on the other side. If so, the bridge will remain offline and
not forward traffic.  An offline bridge will continue to test the path and
will go online within a few seconds should the existing path fail.
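The behavior Phil describes can be sketched as a tiny state machine (hypothetical pseudologic, not DEC's actual algorithm; all names are invented):

```python
# Toy model of the loop detection described above: a bridge stays
# offline while its own test packets loop back via a parallel path,
# and comes online when the periodic re-test stops seeing them.
class LoopDetectingBridge:
    def __init__(self):
        self.online = False

    def probe(self, network):
        """Send a test packet out one interface; if the network delivers
        it back in on the other side, a parallel path exists."""
        looped_back = network.delivers_between_sides(self)
        self.online = not looped_back   # forward only if no loop
        return self.online

class Network:
    """Stand-in for the two LANs; alternate_path_up models whether
    some other bridge is already forwarding between them."""
    def __init__(self, alternate_path_up):
        self.alternate_path_up = alternate_path_up

    def delivers_between_sides(self, bridge):
        return self.alternate_path_up

net = Network(alternate_path_up=True)
standby = LoopDetectingBridge()
standby.probe(net)              # test packet loops back: stay offline
print(standby.online)           # False
net.alternate_path_up = False   # the existing path fails
standby.probe(net)              # periodic re-test now sees no loop
print(standby.online)           # True
```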

It is my understanding that while the Vitalink Translans have a similar loop
detection algorithm, not every combination of DEC and Vitalink bridges will
do what you want. If there are parallel paths via LanBridges and Translans,
it's possible to have a LanBridge go offline in favor of a Translan, which
is decidedly suboptimal.  I understand Vitalink is working on a new software
release that will "do the right things" in combination with LanBridge 100s.
Fortunately the situation is rather rare (it doesn't occur in our network).
For further info you should call Vitalink.

Phil

ddp#@ANDREW.CMU.EDU.UUCP (03/31/87)

>At last count, the routing tables on the Translans showed something like
>600 active hosts.

I'm not sure this is quite a large enough net for my statements to apply.
It's probably the case that yours is a network where level 2 bridges are
appropriate for your current needs.  I wouldn't agree that they will be
appropriate for future needs, though.


>We generally see 1-2 per second, which is expected and entirely acceptable.

The ARP rate a particular net sees is probably going to be quite dependent on
the type of applications being used on it.  Over 60% of the IP traffic on our
network is from the Andrew distributed file system (something like NFS in
functionality).  Each workstation on the network regularly communicates with
quite a few different servers and other hosts.  I would suspect that the ARP
rate from distributed applications like this is much higher than on networks
with your basic telnets, ftps, and smtps.


>If you see 20 per second, then you've got something wrong somewhere.
>I've found that bursts of ARP requests are usually caused by hosts that
>respond inappropriately to UDP broadcasts to bogus IP addresses.

Luckily I think we have the bad UDP broadcast problem under control.  That
definitely is not what causes the majority of our ARPs.  There's nothing
really wrong (that we've been able to find) besides the problems with 4.2
UNIX itself.  Actually most of our ARPs seem to be from hosts that
relentlessly try to open connections to machines which are down.  It
gets especially bad when an important server goes down.  When 500 hosts are
trying to talk to a file server that has been down for a while...  Luckily
the file system knows about exponential backoff.


One of the problems we found a while ago is that the default size of an ARP
cache in 4.2 is not at all appropriate for a server machine which
communicates with LOTS of machines.  The default is a cache of 5x19
entries, i.e. there are 19 possible hash values and only 5 hosts can hash to
each one.  In the worst case, if you are trying to communicate sequentially
with 6 hosts which all hash to the same value, you can end up sending one ARP
request packet per IP packet/transaction.  For our file servers, which
regularly communicate with 300 hosts in a 10-minute period, this is
definitely not appropriate...  I think we made the cache 10x99 instead.
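That worst case is easy to demonstrate with a toy model (illustrative only; the real 4.2BSD arptab is a C array, but the eviction behavior is the same idea): with a bucket of 5 entries and 6 hosts hashing to the same value, cycling through the hosts evicts on every insert, so every lookup misses.

```python
# Toy bounded-bucket ARP cache: 19 hash values, 5 entries each, as in
# 4.2BSD.  Counts the ARP requests forced by cache misses.
NBUCKETS = 19
BUCKET_SIZE = 5

class ArpCache:
    def __init__(self, nbuckets=NBUCKETS, bucket_size=BUCKET_SIZE):
        self.buckets = [[] for _ in range(nbuckets)]
        self.bucket_size = bucket_size
        self.arp_requests = 0   # ARPs sent due to misses

    def lookup(self, ip):
        bucket = self.buckets[hash(ip) % len(self.buckets)]
        if ip not in bucket:
            self.arp_requests += 1      # miss: must send an ARP request
            if len(bucket) >= self.bucket_size:
                bucket.pop(0)           # evict the oldest entry
            bucket.append(ip)

# Six hosts that all hash to the same bucket (simulated with a
# single-bucket cache) thrash: every single lookup is a miss.
cache = ArpCache(nbuckets=1, bucket_size=5)
hosts = ["10.0.0.%d" % i for i in range(1, 7)]
for _ in range(10):                 # 10 rounds of talking to all 6 hosts
    for h in hosts:
        cache.lookup(h)
print(cache.arp_requests)           # 60: one ARP per transaction
```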


The crux of the problem, though, is that with level 2 bridges, your ARP rate
(or any kind of multicast/broadcast) rises LINEARLY with the number of hosts
connected on ANY subnet of the network.  However, using level 3 routers, the
rate rises only linearly with the number of hosts connected to your
particular subnet.
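A back-of-the-envelope illustration of that scaling (the rates and sizes below are made-up numbers, purely to show the arithmetic):

```python
# Hypothetical figures: each host broadcasts one ARP per minute on
# average, on a network of 10 subnets with 60 hosts each.
arp_rate_per_host = 1    # broadcasts per minute per host (assumed)
hosts_per_subnet = 60
subnets = 10

# Level 2 bridges: one big broadcast domain -- every host sees every ARP.
bridged = arp_rate_per_host * hosts_per_subnet * subnets

# Level 3 routers: broadcasts stay on the local subnet.
routed = arp_rate_per_host * hosts_per_subnet

print(bridged)   # 600 broadcasts/minute seen by each host
print(routed)    # 60 broadcasts/minute seen by each host
```

Adding an eleventh subnet raises the bridged figure for every host on the network, but leaves the routed figure unchanged for everyone outside the new subnet.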


Drew

jas@MONK.PROTEON.COM.UUCP (03/31/87)

Phil, your message does not exactly answer Norman's question.  Most
(not all) bridge vendors have loop detection algorithms that allow you
to have hot-standby bridges.  However, they are only hot-standby
bridges.  The very nature of bridges prevents you from doing load
sharing (unless you manually program the filtering/forwarding
databases, giving up the advantages of adaptive bridges).  Gateways
with multipath internal routing algorithms can do load sharing; look
at BBN's PSN software using SPF on the ARPANET and MILNET.  Sooner or
later, a network gets big enough and complicated enough that one
migrates from bridges to routers.

What the advocates are arguing about is where the line between bridges
and routers is.

bill@NRL-LCP.ARPA.UUCP (04/02/87)

> Date: Tue, 31 Mar 87 15:33:29 EST
> From: jas@monk.proteon.com (John A. Shriver)
> Subject: Response to anti-bridge comments

>           The very nature of bridges prevents you from doing load
> sharing (unless you manually program the filtering/forwarding
> databases, giving up the advantages of adaptive bridges).

I don't think it's anything intrinsic about a Bridge that would prevent
you from doing load sharing.  It's only a function of the current
Bridge software that this isn't possible.  One simple scheme
that I just happened to think of off the top of my head would involve
having 2**N Bridges numbered from 0 to 2**N-1, and each Bridge would
handle those Ethernet addresses which modulo 2**N matched its own
assigned Bridge number.  A hot spare could detect the failure of any
particular live Bridge, and take over for it.  One Bridge could be
designated to handle broadcast packets.  Does anyone know if such a
scheme or anything similar has been considered for doing load sharing
across Bridges?  It doesn't seem that it would be that difficult to
implement.
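Bill's off-the-cuff scheme can be sketched as follows (a hypothetical illustration of the idea he describes; the function names and addresses are invented):

```python
# Modulo load sharing: 2**N bridges numbered 0 .. 2**N - 1, each
# forwarding only those Ethernet addresses whose value modulo 2**N
# matches its own number.  One designated bridge handles broadcasts.
N = 2                        # four parallel bridges
NUM_BRIDGES = 2 ** N

def responsible_bridge(mac):
    """Which bridge forwards frames for this 48-bit Ethernet address."""
    return mac % NUM_BRIDGES

def should_forward(bridge_number, mac, is_broadcast=False):
    if is_broadcast:
        return bridge_number == 0   # designated broadcast handler
    return responsible_bridge(mac) == bridge_number

# Unicast traffic spreads across the four bridges by address, and
# exactly one bridge forwards any given frame (no duplicates).
macs = [0x02608C0000 + i for i in range(8)]
for mac in macs:
    assert sum(should_forward(b, mac) for b in range(NUM_BRIDGES)) == 1

# A hot spare taking over for failed bridge 2 simply adopts its number.
spare_number = 2
print(should_forward(spare_number, 0x02608C0002))   # True: 0x...02 % 4 == 2
```

Note this shares load statically by address rather than by measured traffic, and it gives up the plug-in simplicity of adaptive bridges only slightly: each box still learns addresses normally, it just declines to forward frames outside its residue class.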

By the way, in practice, at least at our site, the level of traffic
through our Ethernet Bridges is nowhere near significant enough to
even start to worry about load sharing, and we have a fairly extensive
Ethernet based network in active use here at NRL.

>                                                            Gateways
> with multipath internal routing algorithms can do load sharing, look
> at BBN's PSN software using SPF on the ARPANET and MILNET.  Sooner or
> later, a network gets big enough and complicated enough that one
> migrates from bridges to routers.

Although Gateways/Routers COULD also do load sharing with appropriate
routing algorithms, it is my understanding that all of the current
Gateways, even those with SPF routing, DO NOT currently do any kind
of load sharing, and only understand and compute a single path to any
given destination.  I specifically asked this question at the TCP/IP
Interoperability Conference held just recently in Monterey, and that
was the answer I was given by someone involved with the SPF routing
Gateways.

> What the advocates are arguing about is where the line between bridges
> and routers is.

I think there are appropriate circumstances to use both Bridges and
Gateways/Routers.  The primary benefits that I see for using Bridges,
given the state of current technology, are ease of installation,
monitoring, and management, and the fact that they are protocol
transparent to the higher layer protocols.  In the future, they
will probably even be used to connect dissimilar technologies, such
as Ethernet and FDDI Fiber Optic networks.  But Gateways certainly
have their appropriate niche also, such as providing a higher degree
of isolation between the connected networks when this is desirable.

						-My first plunge

						-Bill Fink
						 bill@NRL-LCP
------

kik@CERNVAX.BITNET.UUCP (04/06/87)

   I get the impression that, in the discussion on Bridge functionality,
people are not taking enough trouble to distinguish between the *service*
and the *communications*.

   The service is to filter and forward, based on destination address.
There is no reason for the filtering not to be shared, in a disjoint way,
by any number of devices.  In fact, we intend to do this.  The load-sharing
and backup algorithms are designed and we plan to implement them on our own
equipment once we get the time.

   The communications can be based on point-to-point links, satellites,
etc.  In fact, the choice of communications is limited only by your
networking ability.  For example, DEC and Vitalink have "invented" a
spanning-tree approach to provide redundant communication.  This is really
primitive.  We, at CERN, use a genuine communications subnet with all of
the built-in advantages (originally designed for inter-computer traffic,
and all the better for that!).  The ideal backbone for such purposes would
be a high-speed bus-type network (we're hoping for FDDI).

    I am trying to rewrite the IEEE 802.1 document on MAC-level bridges to
reflect the clear separation between
   a) address resolution and
   b) routing
so that the current confusion can be avoided.

          Cheers

               Crispin PINEY