usenet@cps3xx.UUCP (Usenet file owner) (06/23/89)
We are beginning to set up a large network of HP 9000 machines running
HP-UX 6.5.  Our target network, which we will be assembling over the course
of the next year, will have four cluster servers with approximately 30 cnodes
per cluster.  I would like to hear from anyone who has done this sort of thing
with this number of machines.

The cluster servers will be 9000/360's with 12mb of memory and a fast
and a slow SCSI interface.  We will put four 700mb disks on the fast SCSI
interface, an 8mm tape drive (Perfect Byte EXB-8200) on the slow SCSI
interface, and an HP QIC drive and a printer on the HPIB interface (the
console, a 700/92, is using the serial port).

The cnodes will mostly be 9000/340's with 8mb of memory and a 150mb HP7958B
disk on the HPIB interface.  The cnodes will all be configured for local swap.
Applications include C and FORTRAN programming, Frame word processing, and
various engineering packages such as SDRC I-DEAS.

We'll have a sprinkling of other machines, including a 370/SRX and an 835,
which we'll probably run as standalone systems (on the network, but not
part of a cluster).

Questions: Is this a reasonable number of cnodes per cluster?  Has anyone
experienced problems running out of process ids in a large cluster?  Does
anyone have a workaround for the inability to put spooled devices, e.g.,
printers, on cnodes?  Any advice?  Horror stories?

Thanks for your help.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -       O
John Lees                    A. H. Case Center for CAE/M           OoO
UNIX Systems Manager         236 Engineering Building               /O
Michigan State University                                            |
lees@frith.egr.msu.edu       East Lansing, MI 48824-1226            (|)   flower
...!uunet!frith!lees         Phone: (517) 355-7435/6453            __|__  power
rjn@hpfcdc.HP.COM (Bob Niland) (06/23/89)
re: "We are beginning to set up a large network of HP 9000 machines..." > The cluster servers will be 9000/360's with 12mb of memory and a fast > and a slow SCSI interface. What is a "slow" SCSI interface. I'm guessing you mean the 98658A plug-in card vs the 98265A daughter board. The plug in has a max speed of 2.5 Mbytes/sec., vs 4.0 for the daughter. > ...and an HP QIC drive and a printer on the HPIB interface... HP (HP-IB) cartridge tape drives use the 3M HCD format, not QIC. > The cnodes will mostly be 9000/340's with 8mb of memory and a 150mb HP7958B > disk on the HPIB interface. The cnodes will all be configured for local swap. Is a high-speed HP-IB installed? If not, you may discover that swapping over the net is faster than via the standard speed built-in HP-IB. Was there any particular reason you are not using SCSI on the 340s? Regards, Hewlett-Packard Bob Niland 3404 East Harmony Road ARPA: rjn%hpfcrjn@hplabs.HP.COM Fort Collins UUCP: [hplabs|hpu*!hpfcse]!hpfcla!rjn CO 80525-9599
wunder@hp-ses.SDE.HP.COM (Walter Underwood) (06/24/89)
> Questions: Is this a reasonable number of cnodes per cluster?  Has anyone
> experienced problems running out of process ids in a large cluster?  Does
> anyone have a workaround for the inability to put spooled devices, e.g.,
> printers, on cnodes?  Any advice?  Horror stories?

I don't know, but we run with about 10 nodes per cluster.  Also, I might
put more than 12 Meg in the rootserver.

It is fairly easy to put spooled devices on cnodes.  We use a named pipe to
get the bits over.  It lives in the filesystem, so it is shared.  The spooler
on the rootserver writes to the pipe, and a little process on the cnode reads
from it and dumps to the device.

I don't have any horror stories, but check out your ethernet carefully.  We
had some loose connectors which did not cause trouble with Telnet and FTP,
but did show up with diskless.

Also, it is a very, very good idea to have the entire cluster on the same
copper.  If the segment is unterminated, and the cnodes and rootserver can
verify that by analog means, they will wait for the cable to be reconnected.
If there is a repeater in there, they cannot verify the problem and will
suicide.

wunder
wunder@hp-ses.SDE.HP.COM (Walter Underwood) (06/24/89)
wunder sez:
> It is fairly easy to put spooled devices on cnodes.

And here is the script we use.  We let the usual model script write to the
FIFO.  The FIFOs live in /usr/spool/lp/devices.

wunder

# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by Walter Underwood <wunder@hp-ses> on Fri Jun 23 14:06:30 1989
#
# This archive contains:
#	shuttle
#

unset LANG

echo x - shuttle
cat >shuttle <<'@EOF'
#! /bin/ksh
################################################################################
#
# File:         shuttle.sh
# RCS:          $Header: shuttle.sh,v 1.1 88/04/18 17:01:48 tw Exp $
# Description:  Allow printers on cnodes other than the one with lpsched
# Author:       Tw Cook, HP Software Development Environments
# Created:      Mon Apr 18 15:43:53 1988
# Modified:     Tue May  3 09:21:59 1988 (Scott Kaplan) scott@hpsdel
# Language:     ksh
# Package:      N/A
# Status:       Experimental (Do Not Distribute)
#
# (c) Copyright 1988, Hewlett-Packard Company, all rights reserved.
#
################################################################################
# Usage: shuttle [-b baud] fifo device
#
# "fifo" is the name of a fifo which must be visible (i.e. not hidden inside
# a CDF directory like /dev) to both the machine running lpsched and the
# machine with the printer.  The fifo will be created if it isn't already
# present.  This script will block until a file appears in the fifo; then
# it will cat the file onto the printer, then it will block again.  On the
# system running lpsched, simply install the printer as normal with the
# fifo name as the device name.  On the machine with the printer, run
# this script always (perhaps with a "respawn" from init).  Set the baud
# rate below (or with -b) if different from the default of 9600.
################################################################################

baudRate=9600

# Optional -b flag overrides the default baud rate.
if [ "${1}" = "-b" ]
then
    baudRate=${2}
    shift ; shift
fi

if [ $# -ne 2 ]
then
    print Usage: ${0} fifo-name printer-name
    exit 1
fi

myName=${0}
fifoName=${1}
printerName=${2}

{

# Remember whether output is a tty, so ttySet can skip stty for non-serial devices.
if tty -s <&1
then
    thisIsTty=1
else
    thisIsTty=0
fi

function ttySet
{
    if (( $thisIsTty ))
    then
        stty ignpar ixon cs8 ${baudRate} -opost <&1 2>/dev/null
    fi
}

function abortMsg
{
    print "${myName}: printer fifo daemon on `hostname` killed.\n"
    date
    print "\f"
    ttySet
    exit 1
}

ttySet
trap abortMsg 1 2 3 15

# Create the fifo (and its parent directory) if it does not already exist.
if [ ! -d "`dirname ${fifoName}`" ]
then
    mkdir `dirname ${fifoName}` || \
        { print "Cannot make parent directory for fifo, ${fifoName}"; exit 2; }
fi

if [ ! -p ${fifoName} ]
then
    rm -f ${fifoName}
    mknod ${fifoName} p
    # chmod 600 ${fifoName}             # permission, ownership?
fi

# Block on the fifo, copy each job to the printer, then block again.
while (( 1 ))
do
    cat ${fifoName}
    ttySet                              # does this really force output flush?
done

} > ${printerName}
@EOF
chmod 755 shuttle
exit 0
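For example (a sketch only -- the printer name "laser", the fifo path, the
/dev/printer special file, the "dumb" model and the inittab id are all
made-up placeholders, and the exact spooler commands may vary with your
release), the hookup described in the script's comments might look like:

    # On the root server: stop the scheduler, install the printer with the
    # fifo as its "device", then restart and enable it.
    /usr/lib/lpshut
    /usr/lib/lpadmin -plaser -v/usr/spool/lp/devices/laser.fifo -mdumb
    /usr/lib/accept laser
    /usr/lib/lpsched
    enable laser

    # On the cnode that owns the printer: keep shuttle running via a
    # "respawn" entry in /etc/inittab, e.g.
    lp1:2:respawn:/usr/spool/lp/shuttle /usr/spool/lp/devices/laser.fifo /dev/printer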
paul@hpldola.HP.COM (Paul Bame) (06/24/89)
> Does
> anyone have a workaround for the inability to put spooled devices, e.g.,
> printers, on cnodes?

Note that named pipes work between cnodes on a cluster and a very small
amount of cleverness could possibly maybe perhaps make dandy local-spool
support.  I don't know how my company feels about this solution however....
(but it did what I needed).

			--Paul "Not representing HP" Bame
raveling@venera.isi.edu (Paul Raveling) (06/24/89)
In article <920029@hp-ses.SDE.HP.COM> wunder@hp-ses.SDE.HP.COM (Walter
Underwood) writes:
> > Questions: Is this a reasonable number of cnodes per cluster?  Has anyone
> > experienced problems running out of process ids in a large cluster?  Does
> > anyone have a workaround for the inability to put spooled devices, e.g.,
> > printers, on cnodes?  Any advice?  Horror stories?
>
>I don't know, but we run with about 10 nodes per cluster.  Also, I might
>put more than 12 Meg in the rootserver.
> ...
>I don't have any horror stories, but check out your ethernet carefully.  ...

We're not exactly using a standard cluster configuration, but we have 8
9000/350's (make that 370's -- we're swapping boards this week) plus about
13-15 9000/320's still in active use.  All are swapping on local disks, but
are using a single 350 file server with 8 MB of RAM for most file storage.

They share ethernet with about 50-60 Suns, Symbolics's, TI Explorers, and a
few other machines, including VAXes.  Most of the Suns ARE running diskless.
I believe they've been repartitioning the ethernet in the course of
remodeling our building to keep the network from getting out of hand.

Anyway, it works.  The HP file server doesn't seem excessively loaded yet,
and the ethernet's handling the load well enough.

----------------
Paul Raveling
Raveling@isi.edu
daveg@hpfcdc.HP.COM (Dave Gutierrez) (06/27/89)
<We are beginning to set up a large network of HP 9000 machines running
<HP-UX 6.5.  Our target network, which we will be assembling over the course
<of the next year, will have four cluster servers with approximately 30 cnodes
<per cluster.  I would like to hear from anyone who has done this sort of thing
<with this number of machines.
<
<The cluster servers will be 9000/360's with 12mb of memory and a fast

A 370 would be nice, but the 360 is a sweet box.

<and a slow SCSI interface.  We will put four 700mb disks on the fast SCSI
<interface, an 8mm tape drive (Perfect Byte EXB-8200) on the slow SCSI
<interface, and an HP QIC drive and a printer on the HPIB interface (the
<console, a 700/92, is using the serial port).

Sounds reasonable.  Increasing to 16Mb may be nice but may not be necessary.
Define root server configurable parameters as follows:

  o num_cnodes 12   (you will just waste RAM if larger)
  o nbuf 768        (maximum = 3Mb file-system buffer cache)
  o configure out all unnecessary device drivers or other capabilities
    to conserve RAM (i.e. NFS, etc.)

(A sketch of how these might look in the kernel dfile follows this note.)

<The cnodes will mostly be 9000/340's with 8mb of memory and a 150mb HP7958B
<disk on the HPIB interface.

I assume that you have a high-speed HPIB interface; if not, you will be able
to swap faster over the network.  Your application will certainly determine
your mileage, but you may not need the local swap discs on all cnodes.

<The cnodes will all be configured for local swap.
<Applications include C and FORTRAN programming, Frame word processing, and
<various engineering packages such as SDRC I-DEAS.
<
<We'll have a sprinkling of other machines, including a 370/SRX and an 835,
<which we'll probably run as standalone systems (on the network, but not
<part of a cluster).

No problem... I would however put two lan cards in the root-server and put
all cnodes on a private thin-lan, using the root-server as a gateway to the
facility backbone.  Easier to troubleshoot, lan-break detection will still
work, diskless traffic will be isolated, etc.

<Questions: Is this a reasonable number of cnodes per cluster?

Given the described configuration, this should not be an issue.  OF COURSE
IT ALL DEPENDS on the applications.

<Has anyone experienced problems running out of process ids in a large
<cluster?

The HP-UX implementation recycles pids across the cluster, guaranteeing
unique PIDs across the cluster.  This should not be an issue.

<Does anyone have a workaround for the inability to put spooled devices,
<e.g., printers, on cnodes?  Any advice?  Horror stories?

Unsupported scripts have already been supplied in other responses.  Sorry,
no horror stories to contribute...
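As a concrete illustration of the tunables suggested above, the relevant
lines of the root server's kernel description file might look something
like the following.  Treat this strictly as a sketch: the exact dfile
layout, comment convention and parameter names depend on your HP-UX
release, so check config(1M) and your existing /etc/conf/dfile first.

    * Tunable parameters (sketch; verify names against your release)
    num_cnodes      12      * sized for ~12 active cnodes; larger just wastes RAM
    nbuf            768     * 3Mb maximum file-system buffer cache (768 * 4Kb)

    * ...and leave out any device drivers or subsystems (NFS, etc.) that
    * the root server does not actually need.

After editing the dfile, rebuild and install the kernel with config(1M) in
the usual way and reboot the server.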
bdale@col.hp.com (Bdale Garbee) (06/27/89)
>I would like to hear from anyone who has done this sort of thing
>with this number of machines.

We have several clusters with a lot of machines on them, where "a lot" is
defined as 16 or so.  I'll try to comment a bit.  Please recognize that I am
speaking from personal experience, not as a representative of HP... I write
instrument firmware for a living...

>The cluster servers will be 9000/360's with 12mb of memory and a fast
>and a slow SCSI interface.

Not bad.  We tend to gravitate towards 350's and 370's as servers because of
a perception that the split bus architecture allows more DMA throughput to
I/O devices than on the 360.  Perhaps someone more authoritative will comment
on whether this is true or not.

We always configure servers with ECC RAM.  Even though it costs more, and
parity errors are scarce, when one does happen on the server the whole
cluster is toast until it reboots.  They are rare enough that in your
environment this may be a don't-care.  Here, it's a nightmare... emulator
setups and such can be costly to reload/restart in terms of engineering time.
We typically run 8meg of parity RAM in clients, 16meg for ME's and chip
designers where the applications are large and hairy.

>The cnodes will mostly be 9000/340's with 8mb of memory and a 150mb HP7958B
>disk on the HPIB interface.  The cnodes will all be configured for local swap.

Tasty!  We run a mix of 320/350/360 clients.  The 320's are slow, everything
else seems more than ok.

>Is this a reasonable number of cnodes per cluster?

It'll work.  Your expectations for disk performance may be much different
from ours, depending largely on the relationship between time spent compiling
and time spent sitting in an editor, or sitting in frame, or something else
that isn't I/O intensive.  We tend to limit ourselves to 16 seats per cluster,
with *nothing* running on the server except Sendmail, etc.  As long as the
load stays below 1 on the server, all seems quite pleasant.

You for sure should configure your lan with a bridge per cluster, the server
and clients on their own thin strand... you should be ok.  And if you're not,
come back later and add another server or two, and move clients around.
120 clients on a single strand is a bad idea.

>Has anyone experienced problems running out of process ids in a large
>cluster?

Not the way you mean.  We typically up the nproc and maxuprc (I think) params
in the client kernels to allow more processes than the default, since we used
to bang heads running X11 and lots of windows.  The defaults may be more
rational now, I don't know.  The global process number space seems to be
large enough, at least for our clusters.  Never had a problem...

>Does anyone have a workaround for the inability to put spooled devices, e.g.,
>printers, on cnodes?

Sure.  Use a named pipe.  On the client, set up an inittab entry to cat
stuff from the named pipe to the physical device; on the server, tell the
spooler to use the named pipe.  This is explicitly not supported, but local
experience is that it works ok... I forget who suggested this to me
originally...

It should also be possible to un-CDF /usr/lib/lpsched.  Easiest would be to
go to the server and cd to /usr/lib/lpsched+, then move remoteroot out of
the way and link it to localroot, which would allow the scheduler to run on
the clients as well.  Naming all of the printers differently within a cluster
should handle all of the possible conflicts...
but I like the named pipe solution better because you aren't dorking with
something an OS update will break, and there's only one copy of the scheduler
to lose sleep over.

Bdale
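For reference, the un-CDF trick described above would look roughly like the
following on the root server.  This is a sketch only, based purely on the
description in the post: the element names localroot and remoteroot are as
given there, the approach is just as unsupported as the named-pipe one, and
keeping a copy of the original element lets an OS update be backed out.

    # On the root server, as root (unsupported; back up first):
    cd /usr/lib/lpsched+                # the hidden CDF directory behind lpsched
    mv remoteroot remoteroot.orig       # set the cnode element aside
    ln localroot remoteroot             # cnodes now see the real scheduler too

Then give every printer in the cluster a distinct name, as noted above, to
avoid conflicts between the schedulers.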
bruce@hp-lsd.HP.COM (Bruce Mayes) (06/30/89)
> Questions: Is this a reasonable number of cnodes per cluster?
> [ ... ]

Having been heavily involved in the characterization of HP's diskless
workstation performance (a previous life), I have something which might
interest you.

Over the course of the last several years HP has spent significant amounts
of time and money evaluating the performance of diskless clusters.  Our
experience testing one possible application, the HP 64000-UX Microprocessor
Development System, has been documented and is available for distribution.
This 'Performance Brief' takes a look at stand-alone and diskless cluster
performance of HP's MC68020 and MC68030-based workstations.

NOTE: This Performance Brief reviews diskless performance as it pertains to
the use of HP 64000-UX tools.  This may or may not be an accurate reflection
of what you notice running other applications or tools on HP workstations!

If you would like a copy of this Performance Brief please mail to:

    UUCP:      hp-lsd!Performance_Brief
    Internet:  Performance_Brief%hp-lsd@hplabs.HP.COM

with the following information:

    Name:
    Company:
    Address:
    City:
    State/Country:
    Zip:
    Mail Stop:

Please make sure you include a FULL, SURFACE MAIL ADDRESS in your note
(including zip code).  I would also appreciate your help in paring down the
response text.  A simple return address would suffice.  (We're automating as
much of this as possible.)  Please do NOT send mail to me!

I also cannot guarantee everyone will receive a copy.  If we become inundated
with requests we may have to turn off the spigot.  I will, however, try to
fill every request possible (within reason).

Happy reading!

--
Bruce Mayes
Logic Systems Division
Hewlett-Packard Company
Colorado Springs, CO  USA
algoss@hpubmaa.HP.COM (Al Gosselin) (06/30/89)
You might also want to talk to Bruce Mayes at Logic Systems Division.  I
don't know if he is still taking calls on this subject, but he wrote a great
performance guide for 64000-UX clusters.  He talks about LAN usage, multiple
discs, nbuf, and a lot of other issues.  You can also get the guide from the
sales group.

Al (Diskless is faster than stand-alone, usually!) Gosselin
perry@hpfcdc.HP.COM (Perry Scott) (06/30/89)
Re: 30 cnodes/cluster.

You might have physical problems with the LAN cable.  Each connector causes
attenuation, so the signal-to-noise may be pretty low at the end.  Avoid
using barrel connectors for splicing - just use a longer cable.

We used a repeater, but as previously mentioned, cable breaks on the server
cable will cause cnodes on the other cables to go away.

Perry Scott
bdale@col.hp.com (Bdale Garbee) (07/14/89)
>You might have physical problems with the LAN cable.  Each connector
>causes attenuation, so the signal-to-noise may be pretty low at the end.
>Avoid using barrel connectors for splicing - just use a longer cable.

This is usually only a problem in practice if you use prefabbed cables, which
frequently are junk.  We buy high-quality coax, and crimp-on BNC connectors
that fit, and we've got runs that are pushing the length limits for thin
cable with a *lot* of BNC connectors (at least one per cubicle down a long
row, for example)... it has a *lot* to do with how carefully you fab the
cables, etc.  We typically TDR new strands to look at the impedance down the
line... if you have the gear...

Bdale