[comp.sys.apollo] Network layout

matt@bacchus.esa.oz (Matt Atterbury) (12/12/90)

    Hello all,

    Our Apollos seem to be running pretty sluggishly at the moment,
    and I was wondering if it could be because of network overload. We
    do A LOT of s/w development using X (some people use DM, but ALL
    applications are in X).

    We have about 4 disked 3550's and 6 diskless 3550/3500's, with
    about 6 more diskless to be added. We have 1 machine which stores
    almost all the binaries (even /$SYSTYPE/...); the other disked
    nodes store home directories, project stuff, news spool, etc (no
    real cohesion except that home directories are all on 1 node). An
    obvious bottleneck here is the machine which stores all the
    binaries - is this a bad idea? (BTW, this node is also the gateway
    to an Ethernet of SONY workstations).

    Q.  How do you lay everything out?

    I was wondering if a better idea might be to partition the network
    so that we have (say) 1 disked machine with binaries, home
    directories, etc and about 4 diskless nodes booting off it, all on
    one sub-ring, a similar set-up in another sub-ring, and the two
    disked nodes connected with a second ATR card. Hopefully this
    would minimise the cross-ring traffic while allowing everyone to
    do what they gotta do.

    Of course, this is all moot if the network load is actually low or
    irrelevant - any ideas on how to measure and gauge it?

    many thanks and regards ...
--
-------------------------------------------------------------------------------
Matt Atterbury [matt@bacchus.esa.oz]      Expert Solutions Australia, Melbourne
UUCP: ...!uunet!munnari!matt@bacchus.esa.oz               "klaatu barada nikto"
  or: ...!uunet!murtoa!bacchus.esa.oz!matt            "consider this a divorce"
ARPA: matt%bacchus.esa.oz.AU@uunet.UU.NET  "life? don't talk to me about life!"

scalera@bnr.ca (Eric Scalera) (12/13/90)

In article <MATT.90Dec12082829@percy.bacchus.esa.oz> matt@bacchus.esa.oz (Matt Atterbury) writes:
>
>    Hello all,
>
>    Our Apollos seem to be running pretty sluggishly at the moment,
>    and I was wondering if it could be because of network overload. We
>    do A LOT of s/w development using X (some people use DM but ALL
>    applications are in X).
>
>    We have about 4 disked 3550's and 6 diskless 3550/3500's, with
>    about 6 more diskless to be added. We have 1 machine which stores
>    almost all the binaries (even /$SYSTYPE/...); the other disked
>    nodes store home directories, project stuff, news spool, etc (no
>    real cohesion except that home directories are all on 1 node). An
>    obvious bottleneck here is the machine which stores all the
>    binaries - is this a bad idea? (BTW, this node is also the gateway
>    to an Ethernet of SONY workstations).
>
>    Of course, this is all moot if the network load is actually low or
>    irrelevant - any ideas on how to measure and gauge it?
>
>    many thanks and regards ...

First of all, an HP LANalyzer would be able to measure network load; also,
if you're running token ring, the netmain tools provide some useful info.
Chances are that with only 10 nodes the network is not the problem.  Most
of the slowness is probably due to the X interface in the old Domain
operating system releases.  I would recommend upgrading to 10.3 ASAP.


-- 
Rick Scalera

#include <disclaimer.h>
"BNR does not share my opinions. I wouldn't be an MSS if it did"

honeywel@wayback.unm.edu (Honeywell Field Service) (12/14/90)

Regarding a slow network: if you suspect hardware, there are a couple of tools
that might point you in the right direction.  Take a look at some of the
statistics "netstat" provides.  If you're seeing a lot of "tokens inserted",
you've got serious problems, and this is probably why you're running slow.

Invoke "netstat -l -a -save (some filename)"; this will build a history of
your nodes to date.  Then, after a day or two (or in some cases an hour or
two), invoke "netstat -l -a -since (previously named file)", and this will
compare the current statistics to the history you saved and list the results.
Keep a sharp eye out for "tokens inserted"; if you're seeing anything more
than 2 or 3 on a particular node, this may be the indication of a problem.
Sometimes nodes will cause tokens to be inserted when they've been powered
off and on.  If a node has added a lot of tokens, it's time to check the BNC
connectors and cabling.
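The save/since comparison above boils down to diffing per-node counters. A
minimal sketch of that check, assuming you've already parsed the two netstat
snapshots into {node: tokens-inserted} maps (the dict format and node IDs here
are hypothetical; real Domain/OS netstat output would need its own parsing):

```python
# Hypothetical helper: compare "tokens inserted" counts between a saved
# netstat snapshot and a current one, flagging nodes whose count grew by
# more than a threshold (the post suggests 2-3 is already suspicious).

def flag_token_inserters(before, after, threshold=3):
    """Return node IDs whose 'tokens inserted' count grew by more
    than `threshold` between the two snapshots."""
    flagged = []
    for node, new_count in after.items():
        old_count = before.get(node, 0)  # node may be absent from the old snapshot
        if new_count - old_count > threshold:
            flagged.append(node)
    return sorted(flagged)

# Example: node "1f3a" inserted 7 tokens since the saved snapshot.
saved = {"1f3a": 2, "2b4c": 1, "3d5e": 0}
current = {"1f3a": 9, "2b4c": 3, "3d5e": 1}
print(flag_token_inserters(saved, current))  # -> ['1f3a']
```

A flagged node is where you'd start checking BNC connectors and cabling, per
the advice above.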

Another tool for troubleshooting network problems is "probenet".  Just invoking
"probenet" doesn't pound on the ring hard enough to shake out problems.  Use
the "-s" option to specify a large number of packets to be transmitted.  I
usually start with "-s 100" just to test the ring.  To really pound the ring
set "-s" to 1000 and also tell probenet to repeat every few minutes with
the "-r" option.  
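If you script these stress runs, a tiny helper that builds the probenet
invocation keeps the packet count and repeat interval in one place. The flag
meanings here (-s packet count, -r repeat interval) are taken from this post
only; check probenet's own help on your node before relying on them:

```python
# Sketch: build an argv list for the probenet run described above.
# Flag semantics are as described in this post, not verified against
# any Domain/OS manual.

def probenet_cmd(packets=100, repeat_minutes=None):
    """Build a probenet command line: always -s, optionally -r."""
    cmd = ["probenet", "-s", str(packets)]
    if repeat_minutes is not None:
        cmd += ["-r", str(repeat_minutes)]
    return cmd

print(probenet_cmd())         # -> ['probenet', '-s', '100']
print(probenet_cmd(1000, 2))  # -> ['probenet', '-s', '1000', '-r', '2']
```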

To physically check out your cabling and connectors save a netstat snapshot
of your network statistics and then kick off probenet with the -s option set
to a minimum of 100, repeating every 2-3 minutes.  Then walk around your ring
wiggling the connectors and cables at each node, checking the results of
probenet occasionally.  If your vibration testing of the cables and connectors
causes probenet or netstat to complain, you've found the problem or a potential
problem.

Most of the sys-admins at the large sites have learned the hard way about the
importance of the cabling and the BNC connectors.  You *must* use the proper
and approved cables.  The connectors, whether they are the crimp or nut type,
must be installed *perfectly*.  The quick-disconnects must be installed
*perfectly*.  Find someone on your site who is obsessive about doing things
right and let them do all of your cabling/connectors.

When I was with Apollo, and later while with Mentor Graphics, we found many
"slow" rings or other network problems that were traced back to poorly
assembled BNC connectors or improper cables.

Regards, Mike Thomas, Honeywell Third Party Services.