[comp.sys.hp] Experience sought with large HP 9000 clusters

usenet@cps3xx.UUCP (Usenet file owner) (06/23/89)

We are beginning to set up a large network of HP 9000 machines running
HP-UX 6.5. Our target network, which we will be assembling over the course
of the next year, will have four cluster servers with approximately 30 cnodes
per cluster. I would like to hear from anyone who has done this sort of thing
with this number of machines.

The cluster servers will be 9000/360's with 12mb of memory and a fast
and a slow SCSI interface. We will put four 700mb disks on the fast SCSI
interface, an 8mm tape drive (Perfect Byte EXB-8200) on the slow SCSI
interface, and an HP QIC drive and a printer on the HPIB interface (the
console, a 700/92, is using the serial port).

The cnodes will mostly be 9000/340's with 8mb of memory and a 150mb HP7958B
disk on the HPIB interface. The cnodes will all be configured for local swap.
Applications include C and FORTRAN programming, Frame word processing, and
various engineering packages such as SDRC I-DEAS.

We'll have a sprinkling of other machines, including a 370/SRX and an 835,
which we'll probably run as standalone systems (on the network, but not
part of a cluster).

Questions: Is this a reasonable number of cnodes per cluster? Has anyone
experienced problems running out of process ids in a large cluster? Does
anyone have a workaround for the inability to put spooled devices, e.g.,
printers, on cnodes? Any advice? Horror stories?

Thanks for your help.
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  O
 John Lees                     A. H. Case Center for CAE/M     OoO
 UNIX Systems Manager          236 Engineering Building        /O
                               Michigan State University       |
 lees@frith.egr.msu.edu        East Lansing, MI 48824-1226    (|)   flower
 ...!uunet!frith!lees          Phone: (517) 355-7435/6453    __|__   power

rjn@hpfcdc.HP.COM (Bob Niland) (06/23/89)

re: "We are beginning to set up a large network of HP 9000 machines..."

> The cluster servers will be 9000/360's with 12mb of memory and a fast
> and a slow SCSI interface.

What is a "slow" SCSI interface.  I'm guessing you mean the 98658A plug-in
card vs the 98265A daughter board.  The plug in has a max speed of 2.5
Mbytes/sec., vs 4.0 for the daughter.

> ...and an HP QIC drive and a printer on the HPIB interface...

HP (HP-IB) cartridge tape drives use the 3M HCD format, not QIC.

> The cnodes will mostly be 9000/340's with 8mb of memory and a 150mb HP7958B
> disk on the HPIB interface. The cnodes will all be configured for local swap.

Is a high-speed HP-IB installed?  If not, you may discover that swapping
over the net is faster than via the standard speed built-in HP-IB.  Was
there any particular reason you are not using SCSI on the 340s?

Regards,                                              Hewlett-Packard
Bob Niland                                            3404 East Harmony Road
ARPA: rjn%hpfcrjn@hplabs.HP.COM                       Fort Collins
UUCP: [hplabs|hpu*!hpfcse]!hpfcla!rjn                 CO          80525-9599

wunder@hp-ses.SDE.HP.COM (Walter Underwood) (06/24/89)

   Questions: Is this a reasonable number of cnodes per cluster? Has anyone
   experienced problems running out of process ids in a large cluster? Does
   anyone have a workaround for the inability to put spooled devices, e.g.,
   printers, on cnodes? Any advice? Horror stories?

I don't know, but we run with about 10 nodes per cluster.  Also, I might
put more than 12 Meg in the rootserver.

It is fairly easy to put spooled devices on cnodes.  We use a named
pipe to get the bits over.  It lives in the filesystem, so it is
shared.  The spooler on the rootserver writes to the pipe, and a little
process on the cnode reads from it and dumps to the device.
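
In miniature, the mechanism is just this (the fifo path and device name
are only examples; the full script we use is in a followup):

  # On the rootserver, the fifo lives in the shared filesystem, and the
  # spooler's "device" for that printer is simply the fifo path:
  mknod /usr/spool/lp/devices/lj p

  # On the cnode that owns the printer, a trivial loop copies whatever
  # the spooler writes into the fifo out to the real device:
  while :
  do
      cat /usr/spool/lp/devices/lj > /dev/lp
  done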

I don't have any horror stories, but check out your ethernet carefully.
We had some loose connectors which did not cause trouble with Telnet
and FTP, but did show up with diskless.  Also, it is a very, very good
idea to have the entire cluster on the same copper.  If the segment is
unterminated, and the cnodes and rootserver can verify that by analog
means, they will wait for the cable to be reconnected.  If there is
a repeater in there, they cannot verify the problem and will suicide.

wunder

wunder@hp-ses.SDE.HP.COM (Walter Underwood) (06/24/89)

   wunder sez:
   It is fairly easy to put spooled devices on cnodes.

And here is the script we use.  We let the usual model script write to
the FIFO.  The FIFOs live in /usr/spool/lp/devices.
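
To wire it up, something along these lines works (the printer name "lj",
the device file, and the path to shuttle are just examples):

  # On the rootserver, install the printer with the fifo as its device;
  # "dumb" is just a placeholder for whatever model script you normally use:
  /usr/lib/lpshut
  /usr/lib/lpadmin -plj -v/usr/spool/lp/devices/lj -mdumb
  /usr/lib/lpsched
  /usr/lib/accept lj
  enable lj

  # On the cnode with the printer, keep the shuttle script running,
  # e.g. with a respawn entry in /etc/inittab (id and state arbitrary):
  #   lj1:2:respawn:/usr/local/bin/shuttle -b 9600 /usr/spool/lp/devices/lj /dev/lp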

wunder

# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by Walter Underwood <wunder@hp-ses> on Fri Jun 23 14:06:30 1989
#
# This archive contains:
#	shuttle	
#

unset LANG

echo x - shuttle
cat >shuttle <<'@EOF'
#! /bin/ksh
################################################################################
#
# File:         shuttle.sh
# RCS:          $Header: shuttle.sh,v 1.1 88/04/18 17:01:48 tw Exp $
# Description:  Allow printers on cnodes other than the one with lpsched
# Author:       Tw Cook, HP Software Development Environments
# Created:      Mon Apr 18 15:43:53 1988
# Modified:     Tue May  3 09:21:59 1988 (Scott Kaplan) scott@hpsdel
# Language:     ksh
# Package:      N/A
# Status:       Experimental (Do Not Distribute)
#
# (c) Copyright 1988, Hewlett-Packard Company, all rights reserved.
#
################################################################################
#  Usage:  shuttle fifo device
# 
#  "fifo" is the name of a fifo which must be visible (i.e. not hidden inside
#  a CDF directory like /dev) to both the machine running lpsched and the
#  machine with the printer.   The fifo will be created if it isn't already
#  present.   This script will block until a file appears in the fifo; then
#  it will cat the file onto the printer, then it will block again.  On the
#  system running lpsched, simply install the printer as normal with the
#  fifo name as the device name.   On the machine with the printer, run
#  this script always (perhaps with a "respawn" from init).   Set the baud
#  rate below if different from 9600.
################################################################################

baudRate=9600

if [ "${1}" = "-b" ]
then
    baudRate=${2}
    shift ; shift
fi

if [ $# -ne 2 ]
then
    print "Usage: ${0} [-b baud] fifo-name printer-device"
    exit 1
fi

myName=${0}
fifoName=${1}
printerName=${2}

{
    if tty -s <&1
    then
       thisIsTty=1
    else
       thisIsTty=0
    fi

    function ttySet {
	if (( $thisIsTty ))
	then
	    stty ignpar ixon cs8 ${baudRate} -opost <&1 2>/dev/null
	fi
    }

    function abortMsg {
	print "${myName}: printer fifo daemon on `hostname` killed.\n"
	date
	print "\f"
	ttySet
	exit 1
    }

    ttySet

    trap abortMsg 1 2 3 15

    if [ ! -d "`dirname ${fifoName}`" ]
    then
	mkdir `dirname ${fifoName}` || \
		{ print "Cannot make parent directory for fifo, ${fifoName}"; exit 2; }
    fi

    if [ ! -p ${fifoName} ]
    then
	rm -f ${fifoName}
	mknod ${fifoName} p
    #   chmod 600 ${fifoName}	# permission, ownership?
    fi

    while (( 1 ))
    do
	cat ${fifoName}
	ttySet			# does this really force output flush?
    done
} > ${printerName}
@EOF

chmod 755 shuttle

exit 0

paul@hpldola.HP.COM (Paul Bame) (06/24/89)

> Does
> anyone have a workaround for the inability to put spooled devices, e.g.,
> printers, on cnodes?

Note that named pipes work between cnodes on a cluster and a very small
amount of cleverness could possibly maybe perhaps make dandy local-spool
support.  I don't know how my company feels about this solution however....
(but it did what I needed).


	--Paul "Not representing HP" Bame

raveling@venera.isi.edu (Paul Raveling) (06/24/89)

In article <920029@hp-ses.SDE.HP.COM> wunder@hp-ses.SDE.HP.COM (Walter Underwood) writes:
>
>   Questions: Is this a reasonable number of cnodes per cluster? Has anyone
>   experienced problems running out of process ids in a large cluster? Does
>   anyone have a workaround for the inability to put spooled devices, e.g.,
>   printers, on cnodes? Any advice? Horror stories?
>
>I don't know, but we run with about 10 nodes per cluster.  Also, I might
>put more than 12 Meg in the rootserver.
>
...
>I don't have any horror stories, but check out your ethernet carefully.
...

	We're not exactly using a standard cluster configuration,
	but we have 8 9000/350's (make that 370's -- we're swapping
	boards this week) plus about 13-15 9000/320's still
	in active use.  All are swapping on local disks, but are
	using a single 350 file server with 8 MB of RAM for most
	file storage.

	They share ethernet with about 50-60 Suns, Symbolics's, TI
	Explorers, and a few other machines, including VAXes.  Most
	of the Suns ARE running diskless.  I believe they've been
	repartitioning the ethernet in the course of remodeling our
	building to keep the network from getting out of hand.

	Anyway, it works.  The HP file server doesn't seem excessively
	loaded yet, and the ethernet's handling the load well enough.


----------------
Paul Raveling
Raveling@isi.edu

daveg@hpfcdc.HP.COM (Dave Gutierrez) (06/27/89)

<We are beginning to set up a large network of HP 9000 machines running
<HP-UX 6.5. Our target network, which we will be assembling over the course
<of the next year, will have four cluster servers with approximately 30 cnodes
<per cluster. I would like to hear from anyone who has done this sort of thing
<with this number of machines.
<
<The cluster servers will be 9000/360's with 12mb of memory and a fast

	A 370 would be nice, but the 360 is a sweet box.

<and a slow SCSI interface. We will put four 700mb disks on the fast SCSI
<interface, an 8mm tape drive (Perfect Byte EXB-8200) on the slow SCSI
<interface, and an HP QIC drive and a printer on the HPIB interface (the
<console, a 700/92, is using the serial port).

	Sounds reasonable. Increasing to 16Mb may be nice but may not be
	necessary. Define root server configurable parameters as follows:

		o num_cnodes 12	(you will just waste ram if larger)
		o nbuf    768	(maximum = 3Mb file-system buffer cache)
		o configure out all unnecessary device drivers or other 
			capabilities to conserve RAM (i.e. NFS, etc)
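
	For what it's worth, on a Series 300 server those end up as tunable
	lines in the dfile fed to config(1M); a hypothetical fragment (driver
	lines omitted -- trimming NFS and the like is just a matter of leaving
	those drivers out):

		num_cnodes	12
		nbuf		768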
<
<The cnodes will mostly be 9000/340's with 8mb of memory and a 150mb HP7958B
<disk on the HPIB interface.

	I assume that you have a high-speed HPIB interface; if not, you will
	be able to swap faster over the network. Your application will
	certainly determine your mileage, but you may not need the local
	swap discs on all cnodes.

<The cnodes will all be configured for local swap.
<Applications include C and FORTRAN programming, Frame word processing, and
<various engineering packages such as SDRC I-DEAS.
<
<We'll have a sprinkling of other machines, including a 370/SRX and an 835,
<which we'll problably run as standalone systems (on the network, but not
<part of a cluster).

	no problem... I would however put two lan cards in the root-server
	and put all cnodes on a private thin-lan, using the root-server as
	a gateway to the facility backbone. Easier to troubleshoot, lan-break
	detection will still work, diskless traffic will be isolated, etc.
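
	Hypothetically, with lan0 on the backbone and lan1 on the private
	thin-lan, the root-server end of that is something like the
	following (the addresses are made up; put the equivalent in
	whatever startup file you use, e.g. /etc/netlinkrc):

		ifconfig lan0 15.8.1.1 netmask 255.255.255.0 up
		ifconfig lan1 192.9.200.1 netmask 255.255.255.0 up

	and each cnode simply routes everything through the server:

		route add default 192.9.200.1 1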
<
<Questions: Is this a reasonable number of cnodes per cluster?

	Given the described configuration, this should not be an issue.
	OF COURSE IT ALL DEPENDS on the applications.

< Has anyone experienced problems running out of process ids in a large cluster? Does

	The HP-UX implementation allocates and recycles PIDs cluster-wide,
	guaranteeing that they are unique across the cluster. This should
	not be an issue.

<anyone have a workaround for the inability to put spooled devices, e.g.,
<printers, on cnodes? Any advice? Horror stories?

	Unsupported scripts have already been supplied in other responses.
	Sorry, no horror stories to contribute...
<
<Thanks for your help.
< - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  O
< John Lees                     A. H. Case Center for CAE/M     OoO
< UNIX Systems Manager          236 Engineering Building        /O
<                               Michigan State University       |
< lees@frith.egr.msu.edu        East Lansing, MI 48824-1226    (|)   flower
< ...!uunet!frith!lees          Phone: (517) 355-7435/6453    __|__   power

bdale@col.hp.com (Bdale Garbee) (06/27/89)

>I would like to hear from anyone who has done this sort of thing
>with this number of machines.

We have several clusters with a lot of machines on them, where "a lot" is
defined as 16 or so.  I'll try to comment a bit.  Please recognize that I
am speaking from personal experience, not as a representative of HP... I write
instrument firmware for a living...

>The cluster servers will be 9000/360's with 12mb of memory and a fast
>and a slow SCSI interface. 

Not bad.  We tend to gravitate towards 350's and 370's as servers because of
a perception that the split bus architecture allows more DMA throughput to
I/O devices than on the 360.  Perhaps someone more authoritative will comment
on whether this is true or not.  We always configure servers with ECC RAM.
Even though it costs more, and parity errors are scarce, when one does happen
on the server, the whole cluster is toast until it reboots.  They are rare
enough that in your environment this may be a don't-care.  Here, it's a
nightmare... emulator setups and such can be costly to reload/restart in terms
of engineering time.  We typically run 8meg of parity ram in clients, 16meg
for ME's and chip designers where the applications are large and hairy.  

>The cnodes will mostly be 9000/340's with 8mb of memory and a 150mb HP7958B
>disk on the HPIB interface. The cnodes will all be configured for local swap.

Tasty!  We run a mix of 320/350/360 clients.  The 320's are slow, everything
else seems more than ok.

>Is this a reasonable number of cnodes per cluster? 

It'll work.  Your expectations for disk performance may be much different
from ours, depending largely on the relationship between time spent compiling
and time spent sitting in an editor, or sitting in frame, or something else
that isn't I/O intensive.  We tend to limit ourselves to 16 seats per cluster,
with *nothing* running on the server except Sendmail, etc.  As long as the
load stays below 1 on the server, all seems quite pleasant.  You for sure
should configure your lan with a bridge per cluster, the server and clients
on their own thin strand... you should be ok.  And if you're not, come back
later and add another server or two, and move clients around.  120 clients on
a single strand is a bad idea.

>Has anyone experienced problems running out of process ids in a large cluster?

Not the way you mean.  We typically up the nproc and maxuprc (I think) params
in the client kernels to allow more processes than the default, since we used
to bang heads running X11 and lots of windows.  The defaults may be more
rational now, I don't know.  The global process number space seems to be large
enough, at least for our clusters.  Never had a problem...

>Does anyone have a workaround for the inability to put spooled devices, e.g.,
>printers, on cnodes? 

Sure.  Use a named pipe.  On the client, set up an inittab entry to cat stuff
from the named pipe to the physical device, on the server tell the spooler to
use the named pipe.  This is explicitly not supported, but local experience is
that it works ok... I forget who suggested this to me originally...  It should
also be possible to un-CDF /usr/lib/lpsched.  Easiest would be to go to the
server and cd to /usr/lib/lpsched+, then move remoteroot out of the way and
link it to localroot, which would allow the scheduler to run on the clients as
well.  Naming all of the printers differently within a cluster should handle
all of the possible conflicts... but I like the named pipe solution better
because you aren't dorking with something an OS update will break, and there's
only one copy of the scheduler to lose sleep over.
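
For reference, the un-CDF experiment amounts to roughly this on the server
(entirely unsupported, of course):

  cd /usr/lib/lpsched+           # hidden directory behind the lpsched CDF
  mv remoteroot remoteroot.orig  # keep the original element around
  ln localroot remoteroot        # clients now get the real scheduler too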

Bdale

bruce@hp-lsd.HP.COM (Bruce Mayes) (06/30/89)

>  Questions: Is this a reasonable number of cnodes per cluster?
>  [ ... ]

        Having been heavily involved in the characterization of HP's
        diskless workstation performance (a previous life), I have
        something which might interest you.

        Over the course of the last several years HP has spent
        significant amounts of time and money evaluating the performance
        of diskless clusters.  Our experience testing one possible
        application, the HP 64000-UX Microprocessor Development System,
        has been documented and is available for distribution.  This
        'Performance Brief' takes a look at stand-alone and diskless
        cluster performance of HP's MC68020 and MC68030-based
        workstations.

		NOTE:  This Performance Brief reviews diskless
		       performance as it pertains to the use
		       of HP 64000-UX tools.  This may or may
		       not be an accurate reflection of what
		       you notice running other applications
		       or tools on HP workstations!

	If you would like a copy of this Performance Brief please mail
	to:

		UUCP:      hp-lsd!Performance_Brief
		Internet:  Performance_Brief%hp-lsd@hplabs.HP.COM

	with the following information:

		Name:
		Company:
		Address:

		City:
		State/Country:
		Zip:
		Mail Stop:

        Please make sure you include a FULL, SURFACE MAIL ADDRESS in
        your note (including zip code).  I would also appreciate your
        help in paring down the response text.  A simple return address
        would suffice.  (We're automating as much of this as possible.)

        Please do NOT send mail to me!  I also cannot guarantee everyone
        will receive a copy.  If we become inundated with requests we
        may have to turn off the spigot.  I will, however, try to fill
        every request possible (within reason).

	Happy reading!


			-- Bruce Mayes
			   Logic Systems Division
			   Hewlett-Packard Company
			   Colorado Springs, CO USA

algoss@hpubmaa.HP.COM (Al Gosselin) (06/30/89)

You might also want to talk to Bruce Mayes at Logic Systems Division.
I don't know if he is still taking calls on this subject, but he wrote
a great performance guide for 64000-UX clusters. He talks about LAN
usage, multiple discs, nbuf, and a lot of other issues. You can also
get the guide from the sales group.

Al (Diskless is faster than stand-alone, usually!) Gosselin

perry@hpfcdc.HP.COM (Perry Scott) (06/30/89)

Re: 30 cnodes/cluster.

You might have physical problems with the LAN cable.  Each connector
causes attenuation, so the signal-to-noise may be pretty low at the end.
Avoid using barrel connectors for splicing - just use a longer cable.

We used a repeater, but as previously mentioned, a cable break on the
server's segment will cause cnodes on the other segments to go away.

Perry Scott

bdale@col.hp.com (Bdale Garbee) (07/14/89)

>You might have physical problems with the LAN cable.  Each connector
>causes attenuation, so the signal-to-noise may be pretty low at the end.
>Avoid using barrel connectors for splicing - just use a longer cable.

This is usually only a problem in practice if you use prefabbed cables, which
frequently are junk.  We buy high-quality coax, and crimp-on BNC connectors
that fit, and we've got runs that are pushing the length limits for thin
cable with a *lot* of BNC connectors (at least one per cubicle down a long row,
for example)... it has a *lot* to do with how carefully you fab the
cables, etc.  We typically TDR new strands to look at the impedance down the
line... if you have the gear...

Bdale