[comp.protocols.tcp-ip] intro to tcp admin, part 2 of 3

hedrick@aramis.rutgers.edu (Charles Hedrick) (07/25/88)
down, and go back to the default gateway.  A similar approach can also
be used to handle failures in the default gateway.  If you have marked
two gateways as default, then the software should be capable of
switching when connections using one of them start failing.
Unfortunately,  some  common TCP/IP implementations do not mark routes
as down and change to new ones.  (In particular Berkeley 4.2 Unix does
not.)    However  Berkeley 4.3 Unix does do this, and as other vendors
begin to base products  on  4.3  rather  than  4.2,  this  ability  is
expected to be more common.



4.4 Other ways for hosts to find routes


As  long  as  your  TCP/IP  implementations handle failing connections
properly, establishing one or more default routes in the configuration
file  is  likely  to  be  the simplest way to handle routing.  However
there are two other routing approaches that are worth considering  for
special situations:

   - spying on the routing protocol

   - using proxy ARP



4.4.1 Spying on Routing


Gateways  generally  have  a  special  protocol  that  they  use among
themselves.    Note  that  redirects  cannot  be  used  by   gateways.
Redirects  are  simply ways for gateways to tell "dumb" hosts to use a
different gateway.  The  gateways  themselves  must  have  a  complete
picture of the network, and a way to compute the optimal route to each
subnet.    Generally  they  maintain  this   picture   by   exchanging
information  among  themselves.    There are several different routing
protocols in use for this purpose.  One way for  a  computer  to  keep
track  of  gateways  is  for  it  to listen to the gateways' messages.
There is software available for this purpose for most  of  the  common
routing  protocols.    When  you  run  this  software,  it maintains a
complete picture of the  network,  just  as  the  gateways  do.    The
software  is  generally  designed  to maintain your computer's routing
tables dynamically, so that datagrams are always sent  to  the  proper
gateway.  In effect, the routing software issues the equivalent of the
Unix "route add" and "route delete" commands as the  network  topology
changes.    Generally this results in a complete routing table, rather
than one that depends upon default routes.   (This  assumes  that  the
gateways  themselves  maintain  a  complete table.  Sometimes gateways
keep track of your campus network completely, but use a default  route
for all off-campus networks, etc.)
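
As a rough illustration of what such listening software does, the
following Python sketch merges one gateway's broadcast into a host
routing table.  The message format (a mapping from destination network
to hop count), the name apply_update, and the "unreachable" metric of
16 are assumptions modeled loosely on distance-vector protocols; they
are not taken from any particular routing daemon, which would also
have to handle timers and many other cases.

```python
INFINITY = 16  # assumed "unreachable" metric, as in RIP-style protocols


def apply_update(table, gateway, advertised):
    """Merge one gateway's broadcast into the host routing table.

    table      -- dict: destination network -> (gateway, metric)
    gateway    -- address of the gateway that sent the broadcast
    advertised -- dict: destination network -> metric seen by that gateway
    Returns the list of changes, phrased like "route add/delete" commands.
    """
    changes = []
    for dest, metric in advertised.items():
        cost = min(metric + 1, INFINITY)  # one extra hop to reach the gateway
        current = table.get(dest)
        if cost >= INFINITY:
            # The gateway reports the destination unreachable; drop the
            # route only if we were routing through that same gateway.
            if current is not None and current[0] == gateway:
                del table[dest]
                changes.append(("delete", dest, gateway))
        elif current is None or cost < current[1] or current[0] == gateway:
            # New destination, cheaper route, or news from the gateway
            # we already use for this destination.
            if current != (gateway, cost):
                table[dest] = (gateway, cost)
                changes.append(("add", dest, gateway))
    return changes
```

Each tuple in the returned list corresponds to one "route add" or
"route delete" that an administrator would otherwise have typed by
hand.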

Running  routing  software on each host does in some sense "solve" the
routing problem.  However there are several reasons why  this  is  not
normally  recommended  except  as  a  last  resort.   The most serious
problem is that this reintroduces configuration options that  must  be
kept  up to date on each host.  Any computer that wants to participate
in the protocol among the gateways will need to configure its software
compatibly   with   the   gateways.      Modern  gateways  often  have
configuration options that are  complex  compared  with  those  of  an
individual host.  It is undesirable to spread these to every host.

There  is  a  somewhat  more  specialized problem that applies only to
diskless computers.  By its very nature, a diskless  computer  depends
upon the network and file servers to load programs and to do swapping.
It is dangerous for  diskless  computers  to  run  any  software  that
listens  to  network  broadcasts.   Routing software generally depends
upon broadcasts.  For example,  each  gateway  on  the  network  might
broadcast  its  routing  tables  every  30  seconds.  The problem with
diskless nodes is that the software to listen to these broadcasts must
be loaded over the network.  On a busy computer, programs that are not
used for a few seconds will be swapped or paged out.   When  they  are
activated  again,  they  must  be  swapped  or  paged  in.  Whenever a
broadcast is sent, every computer on the network needs to activate the
routing  software  in order to process the broadcast.  This means that
many diskless computers will be doing swapping or paging at  the  same
time.    This  is likely to cause a temporary overload of the network.
Thus it is very unwise for diskless machines to run any software  that
requires them to listen to broadcasts.



4.4.2 Proxy ARP


Proxy  ARP  is  an alternative technique for letting gateways make all
the routing decisions.  It is applicable to any broadcast network that
uses  ARP  or  a similar technique for mapping Internet addresses into
network-specific  addresses  such  as  Ethernet   addresses.      This
presentation  will  assume  Ethernet.    Other  network  types  can be
accommodated if you replace "Ethernet address" with the appropriate
network-specific  address,  and ARP with the protocol used for address
mapping by that network type.

In many ways proxy ARP is similar to using a default route and
redirects.  However, it uses a different mechanism to communicate routes
to the host.  With redirects, a full routing table is used.    At  any
given moment, the host knows what gateways it is routing datagrams to.
With proxy ARP, you dispense with  explicit  routing  tables,  and  do
everything  at the level of Ethernet addresses.  Proxy ARP can be used
for all destinations, only for destinations within your network, or in
various  combinations.   It will be simplest to explain it as used for
all addresses.  To do this, you instruct  the  host  to  pretend  that
every  computer  in  the  world  is  attached  directly  to your local
Ethernet.  On Unix, this would be done using a command

      route add default 128.6.4.2 0

where 128.6.4.2 is assumed to be the Internet address  of  your  host.
As  explained  above,  the  metric of 0 causes everything that matches
this route to be sent directly on the local Ethernet.

When a datagram is to be sent to a local  Ethernet  destination,  your
computer  needs  to  know the Ethernet address of the destination.  In
order to find that, it uses something generally called the ARP  table.
This  is  simply  a mapping from Internet address to Ethernet address.
Here's a typical ARP table.  (On our system, it is displayed using the
command "arp -a".)

    FOKKER.RUTGERS.EDU (128.6.5.16) at 8:0:20:0:8:22 temporary
    CROSBY.RUTGERS.EDU (128.6.5.48) at 2:60:8c:49:50:63 temporary
    CAIP.RUTGERS.EDU (128.6.4.16) at 8:0:8b:0:1:6f temporary
    DUDE.RUTGERS.EDU (128.6.20.16) at 2:7:1:0:eb:cd temporary
    W20NS.MIT.EDU (18.70.0.160) at 2:7:1:0:eb:cd temporary
    OBERON.USC.EDU (128.125.1.1) at 2:7:1:2:18:ee temporary
    gatech.edu (128.61.1.1) at 2:7:1:0:eb:cd temporary
    DARTAGNAN.RUTGERS.EDU (128.6.5.65) at 8:0:20:0:15:a9 temporary

Note  that  it  is  simply  a  list  of  Internet  addresses  and  the
corresponding Ethernet address.  The "temporary"  indicates  that  the
entry  was added dynamically using ARP, rather than being put into the
table manually.

If there is an entry for the address in the ARP table, the datagram is
simply  put  on  the Ethernet with the corresponding Ethernet address.
If not, an "ARP request" is broadcast, asking for the destination host
to  identify  itself.   This request is in effect a question "will the
host with Internet  address  128.6.4.194  please  tell  me  what  your
Ethernet address is?".  When a response comes back, it is added to the
ARP table, and future datagrams  for  that  destination  can  be  sent
without delay.
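
The lookup just described can be sketched in a few lines of Python.
This is only an illustration: the class name ArpTable and the
broadcast_request callback (which stands in for actually putting an ARP
request on the Ethernet) are hypothetical, not part of any real
implementation.

```python
class ArpTable:
    """Mapping from Internet address to Ethernet address, filled on demand."""

    def __init__(self, broadcast_request):
        # broadcast_request(ip) stands in for sending an ARP request on the
        # wire; it returns the Ethernet address from the reply, or None.
        self._broadcast_request = broadcast_request
        self.entries = {}

    def resolve(self, ip):
        """Return the Ethernet address for ip, asking the network on a miss."""
        if ip in self.entries:
            return self.entries[ip]          # hit: no traffic needed
        ether = self._broadcast_request(ip)  # miss: "who has ip?"
        if ether is not None:
            self.entries[ip] = ether         # remember the answer
        return ether
```

Only the first call to resolve() for a given address causes a
broadcast.  Later calls are answered from the table.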

This  mechanism  was  originally  designed  only  for  use  with hosts
attached directly to a single Ethernet.  If you need to talk to a host
on  a different Ethernet, it was assumed that your routing table would
direct you to a gateway.    The  gateway  would  of  course  have  one
interface  on  your Ethernet.  Your computer would then end up looking
up the address of that gateway using  ARP.    It  would  generally  be
useless  to  expect  ARP to work directly with a computer on a distant
network.  Since it isn't on the same  Ethernet,  there's  no  Ethernet
address you can use to send datagrams to it.  And when you send an ARP
request for it, there's nobody to answer the request.

Proxy ARP is based on the  concept  that  the  gateways  will  act  as
proxies for distant hosts.  Suppose you have a host on network
128.6.5, with address 128.6.5.2 (computer A in the diagram below).  It
wants to send a datagram to host 128.6.4.194, which is on a different
Ethernet, subnet 128.6.4 (computer C in the diagram below).  There is a
gateway connecting the two subnets, whose address on network 128.6.5
is 128.6.5.1 (gateway R):



              network 1               network 2
               128.6.5                 128.6.4
        ============================  ==================
          |              |        |    |      |    |
       ___|______   _____|____  __|____|__  __|____|____
       128.6.5.2    128.6.5.3   128.6.5.1   128.6.4.194
                                128.6.4.1
       __________   __________  __________  ____________
       computer A   computer B   gateway R   computer C


Now suppose computer A sends an ARP request for computer  C.  C  isn't
able  to  answer  for  itself.  It's on a different network, and never
even sees the ARP request.  However gateway R can act on  its  behalf.
In  effect,  your  computer  asks "will the host with Internet address
128.6.4.194 please tell me what your Ethernet address  is?",  and  the
gateway   says  "here  I  am,  128.6.4.194  is  2:7:1:0:eb:cd",  where
2:7:1:0:eb:cd is actually the Ethernet address of the gateway.    This
bit  of  illusion  works  just  fine.    Your  host  now  thinks  that
128.6.4.194  is  attached  to  the   local   Ethernet   with   address
2:7:1:0:eb:cd.    Of  course it isn't.  But it works anyway.  Whenever
there's a datagram to be sent to 128.6.4.194, your host  sends  it  to
the specified Ethernet address.  Since that's the address of gateway
R, the gateway gets the packet.  It then forwards it to the
destination.
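
The gateway's side of this exchange might look roughly like the Python
sketch below.  It is a simplification under stated assumptions: the
class name ProxyArpGateway is hypothetical, subnets are written with
24-bit prefixes to match the 128.6.4 and 128.6.5 example, and the
gateway proxies only for networks it is directly attached to.  A real
gateway would consult its full routing table.

```python
import ipaddress


class ProxyArpGateway:
    def __init__(self, interfaces):
        # interfaces: list of (subnet, ethernet_address) pairs, one per
        # attached network, e.g. ("128.6.5.0/24", "2:7:1:0:eb:cd").
        self.interfaces = [(ipaddress.ip_network(net), ether)
                           for net, ether in interfaces]

    def answer_arp(self, arriving_subnet, target_ip):
        """Return the Ethernet address to put in a proxy reply, or None."""
        arriving = ipaddress.ip_network(arriving_subnet)
        target = ipaddress.ip_address(target_ip)
        if target in arriving:
            return None  # target is local to the asker; let it answer itself
        reachable = any(target in net for net, _ in self.interfaces)
        if not reachable:
            return None  # this sketch only proxies for directly attached nets
        # Reply with the address of our own interface on the asker's network.
        for net, ether in self.interfaces:
            if net == arriving:
                return ether
        return None
```

With gateway R configured for 128.6.5.0/24 (Ethernet address
2:7:1:0:eb:cd) and 128.6.4.0/24, a request arriving from 128.6.5 for
128.6.4.194 is answered with 2:7:1:0:eb:cd, while a request for
128.6.5.3 gets no proxy reply because computer B can answer for
itself.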

Note that the net effect is exactly the same as having an entry in the
routing table saying  to  route  destination  128.6.4.194  to  gateway
128.6.5.1:

    128.6.4.194          128.6.5.1           UGH          pe0

except  that  instead  of  having the routing done at the level of the
routing table, it is done at the level of the ARP table.

Generally it's better to use the routing  table.    That's  what  it's
there for.  However here are some cases where proxy ARP makes sense:

   - when you have a host that does not implement subnets

   - when you have a host that does not respond properly to redirects

   - when you do not want to have to choose a specific default gateway

   - when your software is unable to recover from a failed route

The  technique  was first designed to handle hosts that do not support
subnets.  Suppose that you have a subnetted network.  For example, you
have  chosen  to break network 128.6 into subnets, so that 128.6.4 and
128.6.5 are separate.  Suppose you  have  a  computer  that  does  not
understand  subnets.    It  will  assume that all of 128.6 is a single
network.  Thus it will be difficult to establish routing table entries
to  handle  the  configuration  above.    You  can't tell it about the
gateway explicitly using "route add 128.6.4.0 128.6.5.1 1".  Since it
thinks  all of 128.6 is a single network, it can't understand that you
are trying to tell it where to send  one  subnet.    It  will  instead
interpret  this command as an attempt to set up a host route to a host
whose address is 128.6.4.0.  The only thing that would work would be to
establish  explicit  host  routes  for  every individual host on every
other subnet.  You can't depend upon default gateways and redirects in
this  situation either.  Suppose you said "route add default 128.6.5.1
1".  This would establish the gateway 128.6.5.1 as a default.  However
the  system wouldn't use it to send packets to other subnets.  Suppose
the host is 128.6.5.2, and wants to send a  datagram  to  128.6.4.194.
Since  the destination is part of 128.6, your computer considers it to
be on the same network as itself, and doesn't bother  to  look  for  a
gateway.
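
The difference between the two views of the network can be made
concrete with a short Python sketch.  The function names and the
24-bit subnet prefix are assumptions chosen to match the 128.6 example.
The class A/B/C boundaries are the standard ones for classful
addressing.

```python
import ipaddress


def classful_network(addr):
    """Network number as seen by a host that knows nothing about subnets."""
    ip = ipaddress.IPv4Address(addr)
    first = int(ip) >> 24
    if first < 128:
        prefix = 8    # class A
    elif first < 192:
        prefix = 16   # class B
    else:
        prefix = 24   # class C (classes D and E are ignored in this sketch)
    return ipaddress.ip_network(f"{addr}/{prefix}", strict=False)


def looks_local(host, dest, subnet_prefix=None):
    """True if the host would ARP for dest instead of using a gateway."""
    if subnet_prefix is None:                      # subnet-unaware host
        return ipaddress.ip_address(dest) in classful_network(host)
    net = ipaddress.ip_network(f"{host}/{subnet_prefix}", strict=False)
    return ipaddress.ip_address(dest) in net
```

For host 128.6.5.2 and destination 128.6.4.194, looks_local() returns
True when no subnet prefix is given, so the subnet-unaware host never
consults a gateway.  With subnet_prefix=24 it returns False, and the
routing table is used.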

Proxy  ARP  solves  this  problem by making the world look the way the
defective implementation expects it to look.  Since  the  host  thinks
all  other  subnets  are part of its own network, it will simply issue
ARP requests for them.  It expects to get  back  an  Ethernet  address
that  can  be used to establish direct communications.  If the gateway
is practicing proxy ARP, it will respond with the  gateway's  Ethernet
address.    Thus  datagrams  are  sent  to the gateway, and everything
works.

As you can see, no specific configuration is needed to use proxy ARP
with a host that doesn't understand subnets.  All you need is for your
gateways to implement proxy ARP.    In  order  to  use  it  for  other
purposes, you must explicitly set up the routing table to cause ARP to
be used.  By default, TCP/IP implementations will  expect  to  find  a
gateway  for any destination that is on a different network.  In order
to make them issue ARP's, you must explicitly  install  a  route  with
metric 0, as in the example "route add default 128.6.5.2 0".

It  is  obvious  that  proxy ARP is reasonable in situations where you
have hosts that don't understand subnets.  Some comments may be needed
on  the  other situations.  Generally TCP/IP implementations do handle
ICMP redirects properly.  Thus it is normally practical to  set  up  a
default  route  to  some gateway, and depend upon the gateway to issue
redirects for  destinations  that  should  use  a  different  gateway.
However in case you ever run into an implementation that does not obey
redirects, or cannot be configured to have a default gateway, you  may
be  able  to  make things work by depending upon proxy ARP.  Of course
this requires that you be able to configure the host  to  issue  ARP's
for  all  destinations.    You  will  need  to  read the documentation
carefully to see exactly what  routing  features  your  implementation
has.

Sometimes  you  may  choose  to depend upon proxy ARP for convenience.
The problem with routing tables is that you have  to  configure  them.
The simplest configuration is simply to establish a default route, but
even there you have to supply some  equivalent  to  the  Unix  command
"route  add  default  ...".    Should you change the addresses of your
gateways, you have to modify this command on all  of  your  hosts,  so
that  they  point to the new default gateway.  If you set up a default
route that depends upon proxy ARP (i.e. has metric 0), you won't  have
to  change  your configuration files when gateways change.  With proxy
ARP, no gateway addresses are  given  explicitly.    Any  gateway  can
respond to the ARP request, no matter what its address.

In  order  to  save  you  from having to do configuration, some TCP/IP
implementations default to using ARP when they have  no  other  route.
The  most  flexible implementations allow you to mix strategies.  That
is, if you have specified a route  for  a  particular  network,  or  a
default route, they will use that route.  But if there is no route for
a destination, they will treat it as local, and issue an ARP  request.
As  long as your gateways support proxy ARP, this allows such hosts to
reach any destination without any need for routing tables.
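
A minimal Python sketch of this mixed strategy follows.  The function
name choose_next_hop and the shape of the route table are hypothetical;
the point is only the order of the three decisions.

```python
import ipaddress


def choose_next_hop(dest, routes, default_gateway=None):
    """Decide where to send a datagram under the mixed strategy.

    routes -- dict: network in "a.b.c.d/len" form -> gateway address
    Returns ("gateway", addr) or ("arp", dest).
    """
    target = ipaddress.ip_address(dest)
    best = None
    for net_text, gateway in routes.items():
        net = ipaddress.ip_network(net_text)
        # Prefer the most specific matching route.
        if target in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    if best is not None:
        return ("gateway", best[1])
    if default_gateway is not None:
        return ("gateway", default_gateway)
    # No route at all: treat the destination as local and ARP for it,
    # relying on a gateway to answer by proxy.
    return ("arp", dest)
```

With a single route for 128.6.4.0/24 via 128.6.5.1 and no default, a
datagram for 128.6.4.194 goes to the gateway, while one for 18.70.0.160
results in an ARP request.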

Finally, you may choose to use proxy ARP because  it  provides  better
recovery  from  failure.  This choice is very much dependent upon your
implementation.  The next section will discuss the tradeoffs  in  more
detail.

In  situations  where  there  are  several  gateways  attached to your
network, you may wonder how proxy ARP allows you to  choose  the  best
one.    As  described  above,  your  computer simply sends a broadcast
asking for the Ethernet address for a destination.   We  assumed  that
the  gateways  would be set up to respond to this broadcast.  If there
is more than one  gateway,  this  requires  coordination  among  them.
Ideally,  the  gateways  will  have  a complete picture of the network
topology.  Thus they are able to determine the best  route  from  your
host to any destination.  If the gateways coordinate among themselves,
it should be possible for the best gateway  to  respond  to  your  ARP
request.    In  practice,  it  may  not always be possible for this to
happen.  It is fairly easy to design algorithms to  prevent  very  bad
routes.  For example, consider the following situation:

          1             2            3
        -------  A  ----------  B ----------

1,  2, and 3 are networks.  A and B are gateways, connecting network 2
to 1 or 3.  If a host on network 2 wants to talk to a host on  network
1,  it  is  fairly  easy  for  gateway  A to decide to answer, and for
gateway B to decide not to.  Here's  how:  if  gateway  B  accepted  a
datagram  for  network 1, it would have to forward it to gateway A for
delivery.  This would mean that it would take a packet from network  2
and  send it right back out on network 2.  It is very easy to test for
routes that involve this sort of circularity.  It is  much  harder  to
deal with a situation such as the following:

                         1
                  ---------------
                    A        B
                    |        | 4
                    |        |
                  3 |        C
                    |        |
                    |        | 5
                    D        E
                  ---------------
                         2

Suppose  a  computer  on  network 1 wants to send a datagram to one on
network 2.  The route via A and D is probably better, because it  goes
through  only one intermediate network (3).  It is also possible to go
via B, C, and E, but that path  is  probably  slightly  slower.    Now
suppose  the  computer  on  network  1  sends  an  ARP  request  for a
destination on 2.  It is likely that A and B will both respond to that
request.    B  is not quite as good a route as A. However it is not so
bad as the case above.  B won't have to send the datagram  right  back
out onto network 1.  B is unable to determine that there is a better
alternative  route  without  doing  a  significant  amount  of  global
analysis  on  the network.  This may not be practical in the amount of
time available to process an ARP request.
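
The circularity test is simple enough to write down.  The Python
sketch below is an illustration only: the function name should_answer
and the per-gateway route dictionaries are hypothetical encodings of
the two diagrams above.  It shows that the test silences gateway B in
the first topology but cannot separate A from B in the second.

```python
def should_answer(gateway_routes, arriving_net, dest_net):
    """Circularity test a gateway might apply before sending a proxy reply.

    gateway_routes -- dict: destination network -> network the gateway
                      would forward onto to reach it
    Returns False if there is no route, or if forwarding would send the
    datagram straight back out on the network the request arrived from.
    """
    out_net = gateway_routes.get(dest_net)
    return out_net is not None and out_net != arriving_net


# First diagram: networks 1 - A - 2 - B - 3, request from network 2 for 1.
routes_a = {1: 1, 3: 2}   # A is attached to 1; reaches 3 via network 2
routes_b = {3: 3, 1: 2}   # B is attached to 3; reaches 1 via network 2

# Second diagram: request from network 1 for network 2.
loop_a = {2: 3}           # A forwards onto network 3 (toward D)
loop_b = {2: 4}           # B forwards onto network 4 (toward C, then E)
```

Here should_answer(routes_b, 2, 1) is False, so only A replies in the
first case.  In the second case both should_answer(loop_a, 1, 2) and
should_answer(loop_b, 1, 2) are True, so both gateways reply even
though the route through A is shorter.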



4.4.3 Moving to New Routes After Failures


In principle, TCP/IP routing is capable of handling line failures  and
gateway  crashes.    There  are  various  mechanisms to adjust routing
tables and ARP tables to keep them up to date.    Unfortunately,  many
major  implementations  of  TCP/IP  have  not implemented all of these
mechanisms.  The net result is that you have to look carefully at  the
documentation  for  your  implementation,  and  consider what kinds of
failures are most likely.  You then have to  choose  a  strategy  that
will  work  best  for your site.  The basic choices for finding routes
have all been listed above:  spying on the gateways' routing protocol,
setting  up  a  default  route and depending upon redirects, and using
proxy ARP.  These methods all have their own  limitations  in  dealing
with a changing network.

Spying on the gateways' routing protocol is theoretically the cleanest
solution.  Assuming that the gateways use good routing technology, the
tables  that  they  broadcast  contain  enough information to maintain
optimal routes to all destinations.  Should something in  the  network
change  (a  line  or  a  gateway  goes down), this information will be
reflected in the tables, and the routing  software  will  be  able  to
update the hosts' routing tables appropriately.  The disadvantages are
entirely practical.  However in some situations the robustness of this
approach may outweigh the disadvantages.  To summarize the discussion
above, the disadvantages are:

   - If  the  gateways  are  using  sophisticated  routing  protocols,
     configuration may be fairly complex.  Thus you will be faced with
     setting up and maintaining configuration files on every host.

   - Some gateways use proprietary routing protocols.  In  this  case,
     you  may  not  be  able  to  find  software  for  your hosts that
     understands them.

   - If your hosts are diskless, there can be very serious performance
     problems associated with listening to routing broadcasts.

Some  gateways  may  be  able  to  convert from their internal routing
protocol to a simpler one for use by your hosts.  This  could  largely
bypass  the  first two disadvantages.  Currently there is no known way
to get around the third one.

The problems with default routes/redirects  and  with  proxy  ARP  are
similar:  they  both  have trouble dealing with situations where their
table entries no longer apply.   The  only  real  difference  is  that
different  tables  are involved.  Suppose a gateway goes down.  If any
of your current routes are using that gateway, you may be in  trouble.
If  you  are depending upon the routing table, the major mechanism for
adjusting routes is the redirect.  This works fine in two  situations:

   - where  the  default  gateway  is not the best route.  The default
     gateway can direct you to a better gateway

   - where a distant line or gateway fails.  If this changes the  best
     route,  the  current gateway can redirect you to the gateway that
     is now best

The case it does not protect you against is where the gateway that you
are currently sending your datagrams to crashes.  Since it is down, it
is unable to redirect you to another gateway.  In many cases, you  are
also unprotected if your default gateway goes down, since routing
starts by sending to the default gateway.

The situation with proxy ARP is similar.  If the  gateways  coordinate
themselves  properly,  the  right  one  will  respond  initially.   If
something elsewhere in  the  network  changes,  the  gateway  you  are
currently using can issue a redirect to a new gateway that is
better.  (It is usually possible to use redirects to  override  routes
established  by  proxy  ARP.)    Again, the case you are not protected
against is where the gateway you are currently using crashes.    There
is  no  equivalent  to failure of a default gateway, since any gateway
can respond to the ARP request.

So the big problem is that failure of a gateway you are using is  hard
to  recover  from.   It's hard because the main mechanism for changing
routes is the redirect,  and  a  gateway  that  is  down  can't  issue
redirects.    Ideally,  this  problem should be handled by your TCP/IP
implementation, using timeouts.  If a computer stops getting responses,
it  should  cancel the existing route, and try to establish a new one.
Where you are using a  default  route,  this  means  that  the  TCP/IP
implementation  must  be  able  to  declare a route as down based on a
timeout.  If you have been redirected to a  non-default  gateway,  and
that  route is declared down, traffic will return to the default.  The
default gateway can then begin handling the traffic, or redirect it to
a  different  gateway.    To  handle  failure of a default gateway, it
should be possible to have more than one default.  If one is  declared
down,  another  will  be used.  Together, these mechanisms should take
care of any failure.
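
The combination of timeouts, redirects, and multiple defaults can be
sketched as follows in Python.  The class HostRoutes and its method
names are hypothetical, and the sketch never brings a gateway back
into service once it has been declared down, which a real
implementation would need to do.

```python
class HostRoutes:
    """Redirect-learned routes plus an ordered list of default gateways."""

    def __init__(self, defaults):
        self.defaults = list(defaults)   # candidate default gateways
        self.down = set()                # gateways declared down by timeout
        self.redirects = {}              # destination -> gateway (from ICMP)

    def gateway_for(self, dest):
        gw = self.redirects.get(dest)
        if gw is not None and gw not in self.down:
            return gw
        for gw in self.defaults:         # fall back to a working default
            if gw not in self.down:
                return gw
        return None                      # nothing left to try

    def redirect(self, dest, gateway):
        """Record an ICMP redirect telling us to use gateway for dest."""
        self.redirects[dest] = gateway

    def timed_out(self, dest):
        """Called when connections to dest stop getting responses."""
        gw = self.gateway_for(dest)
        if gw is not None:
            self.down.add(gw)            # declare the current route down
            self.redirects.pop(dest, None)
```

After a redirect, traffic for that destination uses the non-default
gateway.  If it times out, traffic returns to the first default, and
if that times out as well, the second default is tried.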

Similar mechanisms can be used by systems that depend upon proxy  ARP.
If a connection is timing out, the ARP table entry that it uses should
be cleared.  This will cause a new ARP request, which can  be  handled
by a gateway that is still up.  A simpler mechanism would simply be to
time out all ARP entries after some period.  Since making  a  new  ARP
request  has  a very low overhead, there's no problem with removing an
ARP entry even if it is still good.  The next time a datagram is to be
sent,  a  new  request  will  be  made.  The response is normally fast
enough that users will not even notice the delay.
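
Both ARP recovery mechanisms, clearing an entry when a connection is
failing and timing out every entry after some period, fit in one small
Python sketch.  The class name ExpiringArpTable, the 60-second
lifetime, and the injectable clock are assumptions made for
illustration.

```python
import time


class ExpiringArpTable:
    def __init__(self, lifetime=60.0, clock=time.monotonic):
        self.lifetime = lifetime   # seconds an entry stays valid (assumed)
        self.clock = clock         # injectable so the sketch can be tested
        self._entries = {}         # ip -> (ethernet address, time learned)

    def learn(self, ip, ether):
        self._entries[ip] = (ether, self.clock())

    def lookup(self, ip):
        """Return the cached address, or None if absent or too old."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        ether, learned = entry
        if self.clock() - learned > self.lifetime:
            del self._entries[ip]  # expired: the caller must ARP again
            return None
        return ether

    def connection_timed_out(self, ip):
        """Drop the entry at once when a connection through it is failing."""
        self._entries.pop(ip, None)
```

A lookup that returns None simply triggers a fresh ARP request, which
a gateway that is still up can then answer.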

Unfortunately,  many  common  implementations   do   not   use   these
strategies.  In Berkeley 4.2, there is no automatic way of getting rid
of any kind of entry, either routing or ARP.  Neither routes nor ARP
entries are invalidated on timeout.  ARP entries last forever.  If
gateway crashes are a significant problem, there may be no choice  but
to  run  software  that  listens to the routing protocol.  In Berkeley
4.3, routing entries are removed when  TCP  connections  are  failing.
ARP  entries  are  still  not  removed.   This makes the default route
strategy more attractive for 4.3 than proxy ARP.  Having more than one
default  route  may  also allow for recovery from failure of a default
gateway.  Note however that 4.3 only handles timeout  for  connections
using TCP.  If a route is being used only by services based on UDP, it
will not recover from gateway failure.  While the "traditional" TCP/IP
services  use  TCP,  network  file  systems  generally  do  not.  Thus
4.3-based systems still  may  not  always  be  able  to  recover  from
failure.

In  general,  you  should  examine  your  implementation  in detail to
determine what sort of error recovery strategy it uses.  We hope  that
the  discussion in this section will then help you choose the best way
of dealing with routing.

There is one more strategy that some older implementations use.  It is
strongly  discouraged,  but we mention it here so you can recognize it
if you see it.  Some implementations detect gateway failure by  taking
active measures to see which gateways are up.  The best version of this
is based on a list of all gateways that are currently in use.    (This
can  be  determined  from  the routing table.)  Every minute or so, an
echo request datagram is sent to each such  gateway.    If  a  gateway
stops responding to echo requests, it is declared down, and all routes
using it revert to the default.   With  such  an  implementation,  you
normally supply more than one default gateway.  If the current default
stops responding, an alternate is chosen.  In some cases,  it  is  not
even  necessary  to  choose an explicit default gateway.  The software
will  randomly  choose  any  gateway  that  is   responding.      This
implementation  is  very  flexible  and  recovers  well from failures.
However a large network full of such implementations will waste a  lot
of  bandwidth  on  the  echo  datagrams  that are used to test whether
gateways  are  up.    This  is  the  reason  that  this  strategy   is
discouraged.
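
To get a feel for the cost, the probe traffic can be estimated with a
one-line calculation, written here in Python.  The function name and
the sample figures are purely illustrative assumptions, not
measurements from any network.

```python
def echo_datagrams_per_minute(hosts, gateways_in_use, probes_per_minute=1):
    """Echo requests plus replies generated by gateway probing each minute."""
    requests = hosts * gateways_in_use * probes_per_minute
    replies = requests  # assumes every gateway is up and answers each probe
    return requests + replies
```

For instance, 1000 hosts that each probe 3 gateways once a minute
would generate 6000 datagrams a minute that carry no user data.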



5. Bridges and Gateways


This  section  will  deal  in  more detail with the technology used to
construct larger networks.  It  will  focus  particularly  on  how  to
connect  together  multiple  Ethernets,  token rings, etc.  These days
most networks are hierarchical.  Individual hosts attach to local-area
networks  such  as  Ethernet or token ring.  Then those local networks
are connected via some combination of backbone networks and  point  to
point  links.    A  university might have a network that looks in part
like this:

     ________________________________
     |   net 1      net 2    net 3  |        net 4            net 5
     | ---------X---------X-------- |      --------         --------
     |                         |    |         |                 |
     |  Building A             |    |         |                 |
     |               ----------X--------------X-----------------X
     |                              |  campus backbone network  :
     |______________________________|                           :
                                                         serial :
                                                           line :
                                                         -------X-----
                                                             net  6

Nets 1, 2 and 3 are in one building.  Nets 4 and 5  are  in  different
buildings  on  the  same  campus.  Net 6 is in a somewhat more distant
location.  The diagram above shows nets 1, 2, and  3  being  connected
directly,  with switches that handle the connections being labelled as
"X".  Building A is connected to  the  other  buildings  on  the  same
campus  by  a backbone network.  Note that traffic from net 1 to net 5
takes the following path:

   - from 1 to 2 via the direct connection between those networks

   - from 2 to 3 via another direct connection

   - from 3 to the backbone network

   - across the backbone network from building A to  the  building  in
     which net 5 is housed

   - from the backbone network to net 5

Traffic  for  net  6 would additionally pass over a serial line.  With
the setup as shown, the same switch  is  being  used  to  connect  the
backbone  network  to net 5 and to the serial line.  Thus traffic from
net 5 to net 6 would not need to go through the backbone, since  there
is a direct connection from net 5 to the serial line.

This section is largely about what goes in those "X"'s.



5.1 Alternative Designs


Note  that  there  are alternatives to the sort of design shown above.
One is to use point to point lines or switched lines directly to  each
host.  Another is to use a single level of network technology that is
capable of handling both local and long-haul networking.



5.1.1 A mesh of point to point lines


Rather than connecting hosts to a local network such as Ethernet,  and
then   interconnecting  the  Ethernets,  it  is  possible  to  connect
long-haul serial lines directly to the individual computers.  If  your
network   consists   primarily  of  individual  computers  at  distant
locations, this might make sense.  Here is a small design of that
type:

          computer 1                computer 2             computer 3
              |                         |                      |
              |                         |                      |
              |                         |                      |
          computer 4 -------------- computer 5 ----------- computer 6

In  the design shown earlier, the task of routing datagrams around the
network is handled by special-purpose switching units shown as  "X"'s.
If  you  run lines directly between pairs of hosts, your hosts will be
doing this sort of routing and switching,  as  well  as  their  normal
computing.    Unless  you  run  lines  directly  between every pair of
computers, some systems will end up handling traffic for  others.  For
example,  in this design, traffic from 1 to 3 will go through 4, 5 and
6.  This is certainly possible, since most TCP/IP implementations  are
capable of forwarding datagrams.  If your network is of this type, you
should think of your hosts as also acting as gateways.   Much  of  the
discussion  below  on  configuring  gateways will apply to the routing
software that you run on your hosts.  This sort  of  configuration  is
not as common as it used to be, for two reasons:

   - Most large networks have more than one computer per location.  In
     this case it is less expensive to set up a local network at  each
     location than to run point to point lines to each computer.

   - Special-purpose  switching  units have become less expensive.  It
     often makes sense to offload the routing and communications tasks
     to a switch rather than handling it on the hosts.
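
The observation that traffic from computer 1 to computer 3 passes
through 4, 5 and 6 in the mesh shown earlier can be checked with a
short breadth-first search in Python.  The LINES list is a
transcription of that diagram, and the function name shortest_path is
hypothetical.  Real hosts would rely on a routing protocol rather
than a global search like this.

```python
from collections import deque

# Point to point lines from the mesh diagram, one tuple per line.
LINES = [(1, 4), (2, 5), (3, 6), (4, 5), (5, 6)]


def shortest_path(lines, source, target):
    """Breadth-first search; returns the list of computers on the path."""
    neighbors = {}
    for a, b in lines:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:      # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None                          # no path exists
```

Calling shortest_path(LINES, 1, 3) returns [1, 4, 5, 6, 3], so
computers 4, 5 and 6 all forward datagrams on behalf of 1 and 3.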

It is of course possible to have a network that mixes the two kinds of
technology.  In this case, locations with more equipment would be
handled  by  a hierarchical system, with local-area networks connected
by switches.  Remote locations with a single computer would be handled
by  point  to  point lines going directly to those computers.  In this
case the routing software used on the remote computers would  have  to
be  compatible  with that used by the switches, or there would need to
be a gateway between the two parts of the network.

Design decisions of this type are typically made after  an  assessment
of  the  level  of network traffic, the complexity of the network, the
quality of routing software available for the hosts, and  the  ability
of the hosts to handle extra network traffic.



5.1.2 Circuit switching technology


Another  alternative  to  the hierarchical LAN/backbone approach is to
use circuit switches connected to each individual computer.   This  is
really  a  variant  of  the  point  to point line technique, where the
circuit switch allows each system to have what  amounts  to  a  direct
line to every other system.  This technology is not widely used within
the TCP/IP community, largely because the TCP/IP protocols assume that
the  lowest  level  handles  isolated  datagrams.    When a continuous
connection  is  needed,  higher  network  layers  maintain  it   using
datagrams.    This  datagram-oriented  technology  does  not  match  a
circuit-oriented environment very closely.  In order  to  use  circuit
switching  technology,  the IP software must be modified to be able to
build and tear down virtual circuits as appropriate.  When there is  a
datagram  for a given destination, a virtual circuit must be opened to
it.  The virtual circuit would  be  closed  when  there  has  been  no
traffic  to  that  destination  for  some time.  The major use of this
technology is for  the  DDN  (Defense  Data  Network).    The  primary
interface  to  the  DDN is based on X.25.  This network appears to the
outside as a distributed X.25 network.  TCP/IP software  intended  for
use with the DDN must do precisely the virtual circuit management just
described.     Similar   techniques   could   be   used   with   other
circuit-switching  technologies, e.g. ATT's DataKit, although there is
almost no software currently available to support this.
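
The virtual circuit management just described can be sketched roughly
as follows.  This is only an illustration of the bookkeeping: the
class name, the 60-second idle timeout, and the destination name are
all hypothetical, time is passed in explicitly to keep the example
self-contained, and no real X.25 call setup is attempted.

```python
class CircuitManager:
    """Open a virtual circuit on demand; close it after an idle period."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout   # seconds of silence before closing
        self.last_used = {}                # destination -> time of last datagram

    def send(self, destination, now):
        """Note a datagram for destination; return True if a circuit was opened."""
        opened = destination not in self.last_used
        self.last_used[destination] = now
        return opened

    def expire(self, now):
        """Close circuits idle for too long; return their destinations."""
        idle = [d for d, t in self.last_used.items()
                if now - t >= self.idle_timeout]
        for d in idle:
            del self.last_used[d]
        return idle

vc = CircuitManager(idle_timeout=60)       # hypothetical timeout
print(vc.send("host-b", now=0))    # prints True: a circuit had to be opened
print(vc.send("host-b", now=10))   # prints False: the open circuit is reused
print(vc.expire(now=30))           # prints []: nothing has been idle long enough
print(vc.expire(now=70))           # prints ['host-b']: idle for 60 seconds
```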



5.1.3 Single-level networks


In some cases new developments in wide-area networks can eliminate the
need  for hierarchical networks.  Early hierarchical networks were set
up because the only convenient  network  technology  was  Ethernet  or
other  LAN's, and those could not span distances large enough to cover
an entire campus.  Thus it  was  necessary  to  use  serial  lines  to
connect  LAN's  in  various  locations.    It  is now possible to find
network technology whose characteristics are similar to Ethernet,  but
where  a  single  network  can  span a campus.  Thus it is possible to
think of using a single large network, with no hierarchical structure.

The  primary  limitations  of  a  large   single-level   network   are
performance  and  reliability  considerations.  If a single network is
used  for  the  entire  campus,  it  is  very  easy  to  overload  it.
Hierarchical   networks  can  handle  a  larger  traffic  volume  than
single-level networks if traffic patterns have a reasonable amount  of
locality.  That is, in many applications, traffic within an individual
department tends to be greater than traffic among departments.

Let's look at a concrete example.  Suppose there are  10  departments,
each of which generates 1 Mbit/sec of traffic.  Suppose further that 90%
of that traffic is to other systems within the  department,  and  only
10%  is to other departments.  If each department has its own network,
that network only needs to handle 1 Mbit/sec.   The  backbone  network
connecting the departments also only needs 1 Mbit/sec capacity, since
it is handling 10% of 1 Mbit from each department.  In order to handle
this  situation  with  a  single wide-area network, that network would
have  to  be  able  to  handle  the  simultaneous  load  from  all  10
departments, which would be 10 Mbit/sec.
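
The arithmetic in this example can be checked with a few lines of
Python.  The variable names are chosen here for illustration; the
numbers are the ones used in the text.

```python
departments = 10
per_department = 1.0      # Mbit/sec generated by each department
local_fraction = 0.9      # share of traffic that stays within the department

# Hierarchical design: each departmental network carries its own traffic.
department_load = per_department
# The backbone carries only the traffic that leaves each department.
backbone_load = departments * per_department * (1 - local_fraction)
# Single-level design: one network carries everything.
single_level_load = departments * per_department

print(department_load, round(backbone_load, 6), single_level_load)
# prints 1.0 1.0 10.0
```

With less locality the advantage shrinks: if only half of the traffic
stayed within each department, the backbone in this example would
need 5 Mbit/sec.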

The second set of limitations on single-level networks involves
reliability, maintainability and security.  Wide-area networks are
more difficult
to  diagnose  and  maintain than local-area networks, because problems
can be introduced from any building to which the network is connected.
They  also  make traffic visible in all locations.  For these reasons,
it is often sensible to handle local  traffic  locally,  and  use  the
wide-area  network  only  for  traffic  that  actually must go between
buildings.  However if you have a situation where  each  location  has
only  one  or  two  computers, it may not make sense to set up a local
network at each location, and a single-level network may make sense.



5.1.4 Mixed designs


In practice,  few  large  networks  have  the  luxury  of  adopting  a
theoretically pure design.

It is very unlikely that any large network will be able to avoid using
a hierarchical design.  Suppose we  set  out  to  use  a  single-level
network.  Even if most buildings have only one or two computers, there
will be some location where there are enough that a local-area network
is justified.  The result is a mixture of a single-level network and a
hierarchical network.  Most buildings have their  computers  connected
directly  to  the  wide-area  network, as with a single-level network.
However in one building there is a local-area network which  uses  the
wide-area  network  as  a  backbone,  connecting to it via a switching
unit.

On the other side of the story, even network designers with  a  strong
commitment  to  hierarchical networks are likely to find some parts of
the network where it simply doesn't make economic sense to  install  a
local-area  network.    So  a  host  is put directly onto the backbone
network, or tied directly to a serial line.

However you should think carefully before  making  ad  hoc  departures
from  your  design  philosophy in order to save a few dollars.  In the
long run, network maintainability is going to depend upon your ability
to make sense of what is going on in the network.  The more consistent
your technology is, the more likely you are to be able to maintain the
network.



5.2 An introduction to alternative switching technologies


This  section will discuss the characteristics of various technologies
used to switch datagrams between networks.  In effect, we  are  trying
to  fill  in  some  details  about the black boxes assumed in previous
sections.  There are three basic types of switches, generally referred
to as repeaters, bridges, and gateways, or alternatively as level 1, 2
and 3 switches (based on the level of the  ISO  model  at  which  they
operate).    Note however that there are systems that combine features
of more than one of these, particularly bridges and gateways.

The most important dimensions on which switches  vary  are  isolation,
performance, routing and network management facilities.  These will be
discussed below.

The most serious difference is between repeaters  and  the  other  two
types  of  switch.    Until recently, gateways provided very different
services from bridges.  However these two technologies are now  coming
closer  together.  Gateways are beginning to adopt the special-purpose
hardware that has characterized bridges in  the  past.    Bridges  are
beginning to adopt more sophisticated routing, isolation features, and
network management, which have characterized  gateways  in  the  past.
There  are  also systems that can function as both bridge and gateway.
This means that at the moment, the crucial decision may not be
whether to use a bridge or a gateway, but what features you want in a
switch and how it fits into your overall network design.



5.2.1 Repeaters


A repeater is a piece of equipment that connects two networks that use
the same technology.  It receives every data packet on  each  network,
and retransmits it onto the other network.  The net result is that the
two networks have exactly the same  set  of  packets  on  them.    For
Ethernet or IEEE 802.3 networks there are actually two different kinds
of repeater.  (Other network technologies may not need  to  make  this
distinction.)

A  simple  repeater  operates at a very low level indeed.  Its primary
purpose is to get around limitations in cable length caused by  signal
loss or timing dispersion.  It allows you to construct somewhat larger
networks than you would otherwise be able to construct.    It  can  be
thought  of  as  simply  a two-way amplifier.  It passes on individual
bits in the signal, without doing any processing at the packet  level.
It even passes on collisions.  That is, if a collision is generated on
one of  the  networks  connected  to  it,  the  repeater  generates  a
collision  on  the  other  network.  There is a limit to the number of
repeaters that you can use in a network.  The  basic  Ethernet  design
requires that signals be able to get from one end  of  the  network
to the other within a specified amount of time.    This  determines  a
maximum  allowable length.  Putting repeaters in the path does not get
around this limit.  (Indeed each repeater adds some delay, so in  some
ways  a repeater makes things worse.)  Thus the Ethernet configuration
rules limit the number of repeaters that can be in any path.

A "buffered repeater" operates at the level  of  whole  data  packets.
Rather  than passing on signals a bit at a time, it receives an entire
packet from one network into an internal buffer and  then  retransmits
it  onto  the other network.  It does not pass on collisions.  Because
such low-level features  as  collisions  are  not  repeated,  the  two
networks continue to be separate as far as the Ethernet specifications
are concerned.  Thus there  are  no  restrictions  on  the  number  of
buffered  repeaters  that can be used.  Indeed there is no requirement
that both of the networks be of  the  same  type.    However  the  two
networks  must  be sufficiently similar that they have the same packet
format.  Generally this means that  buffered  repeaters  can  be  used
between two networks of the IEEE 802.x family (assuming that they have
chosen the same address length), or two networks of some other related
family.    A  pair  of  buffered  repeaters can be used to connect two
networks via a serial line.

Buffered repeaters share with simple repeaters the most basic feature:
they  repeat every data packet that they receive from one network onto
the other.  Thus the two networks end up with exactly the same set  of
packets on them.



5.2.2 Bridges and gateways


A  bridge  differs from a buffered repeater primarily in the fact that
it exercises some selectivity as to what packets it  forwards  between
networks.    Generally  the  goal  is  to increase the capacity of the
system by keeping local traffic confined to the network  on  which  it
originates.    Only  traffic  intended  for the other network (or some
other network accessed through it) goes through the bridge.    So  far
this  description would also apply to a gateway.  Bridges and gateways
differ in the way they determine what packets to forward.    A  bridge
uses  only  the  ISO level 2 address.  In the case of Ethernet or IEEE
802.x networks, this is the 6-byte Ethernet or MAC-level address. (The
term  MAC-level  address  is  more  general.   However for the sake of
concreteness, examples in this section will assume  that  Ethernet  is
being  used.    You  may generally replace the term "Ethernet address"
with the equivalent MAC-level address for other similar technologies.)
A bridge does not look past the MAC-level header into the packet's
contents, so it does not use the IP address or its equivalent for
routing decisions.  In contrast, a
gateway  bases  its decisions on the IP address, or its equivalent for
other protocols.

There are several reasons why it matters which kind of address is used
for  decisions.    The  most basic is that it affects the relationship
between the  switch  and  the  upper  layers  of  the  protocol.    If
forwarding is done at the level of the MAC-level address (bridge), the
switch will be invisible to the protocols.  If it is done  at  the  IP
level,  the  switch will be visible.  Let's give an example.  Here are
two networks connected by a bridge:

              network 1          network 2
               128.6.5            128.6.4
        ==================  ================================
          |            |      |            |             |
       ___|______    __|______|__   _______|___   _______|___
       128.6.5.2        bridge       128.6.4.3     128.6.4.4
       __________    ____________   ___________   ___________
       computer A                   computer B    computer C


Note that the bridge does not have an IP address.  As far as computers
A,  B,  and  C  are  concerned,  there  is a single Ethernet (or other
network) to which they are all attached.  This means that the  routing
tables  must  be  set up so that computers on both networks treat both
networks as local.  When computer A opens a connection to computer  B,
it  first  broadcasts  an ARP request asking for computer B's Ethernet
address.  The bridge must  pass  this  broadcast  from  network  1  to
network  2.  (In general, bridges must pass all broadcasts.)  Once the
two computers know each other's Ethernet addresses, communications use
the  Ethernet  address  as the destination.  At that point, the bridge
can start exerting some selectivity.  It will only pass packets  whose
Ethernet  destination  address  is for a machine on the other network.
Thus a packet from B to A will be passed from network 2 to  1,  but  a
packet from B to C will be ignored.

In  order  to  make  this  selection,  the  bridge needs to know which
network each machine is on.  Most modern bridges build up a table  for
each  network,  listing the Ethernet addresses of machines known to be
on that network.  They do this by watching all of the packets on  both
networks.   When a packet first appears on network 1, it is reasonable
to conclude that the Ethernet source address corresponds to a  machine
on network 1.

Note  that a bridge must look at every packet on the Ethernet, for two
different reasons.  First, it may use  the  source  address  to  learn
which  machines  are  on  which  network.  Second, it must look at the
destination address in order to decide whether it needs to forward the
packet to the other network.
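
The learning and forwarding behavior described above can be sketched
as follows.  This is a minimal two-network model, not the algorithm
of any particular product.  The class name and the Ethernet addresses
are made up for the example, and passing packets whose destination
has not yet been learned is an assumption that the text above does
not spell out.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

class LearningBridge:
    """Two-port bridge that learns which network each machine is on."""

    def __init__(self):
        self.location = {}   # Ethernet address -> network number (1 or 2)

    def handle(self, arrived_on, source, destination):
        """Return True if the packet should be passed to the other network."""
        # Learning: the sender must be on the network the packet came from.
        self.location[source] = arrived_on
        if destination == BROADCAST:
            return True                  # broadcasts are always passed
        known = self.location.get(destination)
        if known is None:
            return True                  # assumption: pass if not yet learned
        return known != arrived_on       # pass only if it is on the other side

# Hypothetical Ethernet addresses for computers A, B and C.
A, B, C = "00:00:00:00:00:0a", "00:00:00:00:00:0b", "00:00:00:00:00:0c"
bridge = LearningBridge()
print(bridge.handle(1, A, BROADCAST))   # prints True: A's ARP request
print(bridge.handle(2, B, A))           # prints True: B to A crosses the bridge
print(bridge.handle(2, C, BROADCAST))   # prints True: C is learned on network 2
print(bridge.handle(2, B, C))           # prints False: B to C stays on network 2
```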

As  mentioned  above,  generally bridges must pass broadcasts from one
network to the other.  Broadcasts are often used to locate a resource.
The ARP request is a typical example of this.  Since the bridge has no
way of knowing what host is going to answer  the  broadcast,  it  must
pass   it   on  to  the  other  network.    Some  newer  bridges  have
user-selectable filters.  With them, it  is  possible  to  block  some
broadcasts  and  allow  others.  You might allow ARP broadcasts (which
are  essential  for  IP  to  function),  but  confine  less  essential
broadcasts  to one network.  For example, you might choose not to pass
rwhod broadcasts, which some systems use to keep track of  every  user
logged  into  every  other  system.    You  might  decide  that  it is
sufficient for rwhod to know about the systems on a single segment  of
the network.
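
A user-selectable broadcast filter of this kind might look roughly
like the following.  The function name and the policy table are
hypothetical, and the sketch assumes that the bridge can match on the
Ethernet type field and the UDP destination port.  The constants are
the standard values: Ethernet type 0x0806 for ARP, 0x0800 for IP, and
UDP port 513 for rwhod.

```python
ETHERTYPE_IP = 0x0800
ETHERTYPE_ARP = 0x0806
RWHO_UDP_PORT = 513                    # port used by rwhod broadcasts

BLOCKED_UDP_PORTS = {RWHO_UDP_PORT}    # hypothetical site policy

def pass_broadcast(ethertype, udp_dest_port=None):
    """Decide whether a broadcast packet is passed to the other network."""
    if ethertype == ETHERTYPE_ARP:
        return True                    # ARP is essential for IP to function
    if ethertype == ETHERTYPE_IP and udp_dest_port in BLOCKED_UDP_PORTS:
        return False                   # keep rwhod traffic on one segment
    return True                        # assumption: pass anything not listed

print(pass_broadcast(ETHERTYPE_ARP))                  # prints True
print(pass_broadcast(ETHERTYPE_IP, RWHO_UDP_PORT))    # prints False
```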

Now let's take a look at two networks connected by a gateway:

              network 1                   network 2
               128.6.5                     128.6.4
        ====================      ==================================
          |              |          |              |             |
       ___|______    ____|__________|____   _______|___   _______|___
       128.6.5.2     128.6.5.1  128.6.4.1    128.6.4.3     128.6.4.4
       __________    ____________________   ___________   ___________
       computer A           gateway           computer B    computer C


Note  that  the  gateway  has IP addresses assigned to each interface.
The computers' routing tables are set up to forward through the
appropriate address.  For example, computer A has a routing entry
saying that it should use the  gateway  128.6.5.1  to  get  to  subnet
128.6.4.

Because  the  computers  know  about the gateway, the gateway does not
need to scan all the packets on the Ethernet.  The computers will send
packets to it when appropriate.  For example, suppose computer A needs
to send a message to computer B. Its routing table will tell it to use
gateway  128.6.5.1.    It  will issue an ARP request for that address.
The gateway will respond to the ARP request, just as any  host  would.
From then on, packets destined for B will be sent with the gateway's
Ethernet address as the destination.
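
The decision computer A makes can be sketched with Python's standard
ipaddress module.  The table layout and function name are
illustrative only, the sketch assumes a subnet mask of 255.255.255.0,
and 128.6.5.9 is a hypothetical neighbor that does not appear in the
diagram.

```python
import ipaddress

# Computer A's routing table, assuming a 255.255.255.0 subnet mask.
# Each entry: (destination subnet, gateway, or None if directly attached)
ROUTES_A = [
    (ipaddress.ip_network("128.6.5.0/24"), None),
    (ipaddress.ip_network("128.6.4.0/24"), ipaddress.ip_address("128.6.5.1")),
]

def next_hop(destination, routes):
    """Return the IP address that must be ARPed for to reach destination."""
    dest = ipaddress.ip_address(destination)
    for subnet, gateway in routes:
        if dest in subnet:
            return dest if gateway is None else gateway
    raise LookupError("no route to " + destination)

print(next_hop("128.6.4.3", ROUTES_A))   # prints 128.6.5.1 (computer B, via the gateway)
print(next_hop("128.6.5.9", ROUTES_A))   # prints 128.6.5.9 (hypothetical local host)
```

In both cases the address returned is on A's own network, so A can
find the matching Ethernet address with an ordinary ARP request.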



5.2.3 More about bridges


There are several advantages to using  the  MAC-level  address,  as  a
bridge  does.   First, every packet on an Ethernet or IEEE network has
such an address.  The address is in the same place for  every  packet,
whether  it  is  IP,  DECnet,  or  some  other  protocol.   Thus it is
relatively fast to get the address from the packet.   A  gateway  must
decode  the  entire IP header, and if it is to support protocols other
than IP, it must have software for each such  protocol.    This  means
that  a bridge automatically supports every possible protocol, whereas
a gateway requires specific provisions for  each  protocol  it  is  to
support.
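
The difference in effort can be illustrated with a short sketch.  It
assumes Ethernet framing with a 2-byte type field after the two
addresses (IEEE 802.3 framing with a length field is not handled).
The function names and the sample frame are made up for the example,
and a real gateway does a good deal more than is shown here.

```python
import struct

def mac_addresses(frame):
    """Destination and source addresses: the first 12 bytes of every frame."""
    return frame[0:6], frame[6:12]

def ip_destination(frame):
    """Return the IP destination address, or None if the frame is not IP."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != 0x0800:
        return None                   # another protocol needs its own code
    header_words = frame[14] & 0x0F   # IP header length in 32-bit words
    if header_words < 5:
        return None                   # malformed header
    return frame[30:34]               # bytes 16-19 of the IP header

# A made-up frame: two placeholder Ethernet addresses, type 0x0800, and a
# minimal 20-byte IP header whose destination is 128.6.4.3.
ip_header = bytes([0x45]) + bytes(15) + bytes([128, 6, 4, 3])
frame = bytes([0, 0, 0, 0, 0, 0x0b]) + bytes([0, 0, 0, 0, 0, 0x0a]) \
        + b"\x08\x00" + ip_header

print(mac_addresses(frame)[0].hex())   # prints 00000000000b
print(list(ip_destination(frame)))     # prints [128, 6, 4, 3]
```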

However  there  are  also disadvantages.  The one that is intrinsic to
the design of a bridge is:

   - A bridge must look at every packet on the network, not just those
     addressed  to  it.    Thus it is possible to overload a bridge by
     putting it on a very busy network, even if very little traffic is
     actually going through the bridge.

There are further disadvantages that are based on the
way bridges are usually built.  It is possible in principle to  design
bridges  that do not have these disadvantages, but I don't know of any
plans to do so.  They all stem from the fact that bridges do not  have
a complete routing table that describes the entire system of networks.