[comp.os.vms] local area vaxcluster performance

SSMITH@STMARYS.BITNET (10/23/87)

We have a Local Area VAXcluster with a VAX 780 as the boot node and a
MicroVAX II, and we are seeing a performance problem across the cluster.
The 780 has 3 RA81s and the MicroVAX has one, for local paging and swapping.
All our application programs and data are on the 780 disks.
The problem surfaces as follows. When we run an I/O-type job on the
MicroVAX, interrupt stack time on the 780 goes to 60-70% of the CPU cycles.
If a CPU-intensive job (e.g. compile/link) is run on the MicroVAX, the effect
is not as great. Has anyone else experienced a similar problem? If so, are
there any suggestions to correct it?
Thanks in advance,
Steve Smith
BITNET: SSMITH@STMARYS.BITNET

rrk@byuvax.bitnet (10/29/87)

Check your Ethernet device:  DEC didn't tell us until AFTER we installed
our LAVC that a DEUNA is insufficient and inefficient.  But even after we
upgraded to a DEBNA, the performance still didn't get much better (partly
because of a dual CPU restriction), and we found out we have more
throughput on a stand-alone 8350 than with 3 MicroVAX IIs clustered in.

ward@cfa.harvard.EDU (Steve Ward) (11/06/87)

There are a limited number of configuration scenarios for "good"
performance on a multi-VAX LAVC.

The general, and loose, definition of "good" in this case is that
the LAVC performs at least comparably to a mythical composite VAX
constructed of the same parts and features.  Example: a 3-node LAVC
consisting of MicroVAXes with one 300MB disk on each machine and
4MB of memory on each machine has a "composite" equivalent of a
mythical 2.8MIPS VAX with 900MB of disk and 12MB of memory.
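
The disk and memory part of that "composite" is just a sum over the
members.  A minimal sketch, assuming a hypothetical Node record and
composite() helper (neither is a VMS facility), and leaving the CPU rating
out because only the composite MIPS figure is given above:

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str        # hypothetical node name, for illustration only
    disk_mb: int     # local disk capacity in MB
    memory_mb: int   # physical memory in MB


def composite(nodes):
    """Sum the disk and memory of every LAVC member."""
    return {
        "disk_mb": sum(n.disk_mb for n in nodes),
        "memory_mb": sum(n.memory_mb for n in nodes),
    }


# Three MicroVAXes, each with one 300MB disk and 4MB of memory.
nodes = [Node(f"mv{i}", disk_mb=300, memory_mb=4) for i in range(1, 4)]
totals = composite(nodes)
print(totals)  # {'disk_mb': 900, 'memory_mb': 12}
```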

The good news:  an LAVC can ALMOST achieve equivalent performance
to its mythical composite counterpart.

The bad news:  To achieve this near-equivalent performance the LAVC
must be configured with hardware and use patterns that fall within a
fairly narrow range of the possible configurations and use patterns.

Here is how you get the "good" performance out of an LAVC:

Configure each LAVC member with plenty of memory (16MB in MicroVAXes)
and plenty of local disk.  If the LAVC nodes are to be multiuser, then
the local disk capacity should be sufficient for all users on that node,
as if the node were to be a standalone system.  All multiuser nodes should
be configured in this fashion, thus limiting Ethernet traffic between
LAVC nodes to a small amount of LAVC system traffic plus any file
sharing between users on different LAVC nodes.

This results in "good" performance, but what you have is essentially
a collection of properly configured standalone computers that now have,
through the grace of the LAVC, an integrated file system and certain
other distributed features and functions.  For those who share tons of
data with colleagues, this is still a big deal.

The boot node(s) should not be burdened with multiuser accounts, doing
only various LAVC server (print, queue, etc.) functions.

Another winning scenario is for a file server to serve single-user VAX
workstations, such as VS2000's, GPX's, or VSII's.  In these cases the
workstations should have at least local page/swap disk space.  The
server should have lots of memory and of course the necessary disk
space.

If you share your Ethernet LAN, then isolate the LAVC via a gateway or
a Bridge.

Losing configurations:

Any configuration that has multiuser node(s) being file-served over
the Ethernet, meaning users logging in and executing on machines other
than where their disk space resides, and especially when there is one
concentrated file server machine for one or more other multiuser LAVC
nodes.  Note that the ability to send an executable image to a
designated server/boot node is the "good" scenario (there are no
interactive user accounts on such nodes, as described above).


All of this implies that paging/swapping and otherwise pushing lots of
fileserving traffic over the Ethernet in an LAVC (even without page/swap
traffic) with multiuser nodes is a guaranteed way to get low(er)
performance out of your VAXes.

Knowing these things lets you approach LAVC's with realistic
expectations.  Certainly $$ can be saved and LAVC's can be "evolved"
easily from "barebones" to "full flesh" as the need and the pocketbook
dictate.

In general on Ethernet and with LAVC (and probably with NFS and RFS),
fileserving works reasonably well from a properly configured file server
that is not loaded down with lots of users, fileserving to single-user
workstations that are preferably NOT paging/swapping over the Ethernet.
Otherwise, in support of multiuser nodes, fileserving can provide
the glue for file sharing access that enhances shared data, but the
nodes should essentially be standalone in disk/memory configuration so
that only actual user-user sharing of data causes Ethernet filesystem
activity within the LAVC.
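
For anyone who wants to run a planned configuration past these rules of
thumb, here is a rough sketch.  The Member record, its fields, the node
names, and warnings_for() are all made up for illustration; the checks
only restate the guidelines above and do not measure anything.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str               # hypothetical node name
    multiuser: bool         # interactive multiuser logins on this node
    local_user_disk: bool   # users' files live on this node's own disks
    local_page_swap: bool   # paging/swapping goes to a local disk
    boot_node: bool = False


def warnings_for(node):
    """Return rule-of-thumb warnings for one LAVC member."""
    found = []
    if not node.local_page_swap:
        found.append("pages/swaps over the Ethernet")
    if node.multiuser and not node.local_user_disk:
        found.append("multiuser node file-served over the Ethernet")
    if node.boot_node and node.multiuser:
        found.append("boot node carries multiuser accounts")
    return found


cluster = [
    Member("boot1", multiuser=True, local_user_disk=True,
           local_page_swap=True, boot_node=True),
    Member("multi1", multiuser=True, local_user_disk=False,
           local_page_swap=True),
    Member("ws1", multiuser=False, local_user_disk=False,
           local_page_swap=True),
]

for node in cluster:
    print(node.name, warnings_for(node) or "ok")
```

A single-user workstation with local page/swap passes even though its
files are served, which matches the "winning" file server scenario; the
multiuser node in the same position is flagged.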

Terminal servers can be used instead of hardwired serial ports, but
each multiuser node should have extra memory and be derated by a few
concurrent logins to compensate for the extra terminal server overhead.

This has been a rather rambling commentary not meant to be highly
technical in nature.  Hopefully this will address many of the questions
raised about LAVC performance from an empirical view and spur technical
discussions among those who want quantitative answers.

Steven Ward
Smithsonian Astrophysical Observatory
