[comp.unix.wizards] UNIX on Cray, COS, etc.

mike@BRL.ARPA (Mike Muuss) (03/17/88)

William -

In your recent message to UNIX-Wizards, you make some rather bold claims
that I would like to remark on.

>> Another thing I've heard is that UNICOS (Cray's UNIX) is HUGE and SLOW
>> (compared to COS); besides, are you really going to run a program that 
>> takes 5 hours (of CPU time) to run interactively?  We have a Cray X-MP/24
>> running COS at the UT System CHPC (Center for High Impedance Computing)
>> that is backed up for WEEKS on some of the larger job classes.  There are
>> some jobs which *couldn't* be run under UNICOS (on this machine) because
>> UNICOS would take up more memory than COS.

Cray's UNICOS for the XMP is neither "HUGE" nor "SLOW".  At BRL we operate
a Cray X-MP/48 with 3 CPUs running COS and 1 running UNICOS, and a Cray-2, in
addition to an assortment of other machines, and I believe I can offer
you a genuine datapoint. In terms of a performance difference between
COS and UNICOS, there isn't much.  Native UNICOS actually holds a small
but significant advantage in terms of I/O performance (typically around
10% faster file I/O, rising to 1000% faster for small-to-medium
transfers to the SSD).  This is a pretty neat result,
considering that lots of COS is hand-coded CAL (Cray Assembler), while
the UNICOS kernel remains almost entirely C.

Compute performance is not operating system-specific, but instead
compiler-specific, and Cray provides substantially the same compilers
under both systems.  Run times are typically very close. Differences in
runtimes, when they occur, are typically due to COS and UNICOS being at
slightly differing revision levels in the compilers.

The UNICOS kernel on our XMP is configured for a large load, and uses
176 Kwds total for its resident image plus disk, terminal, and network buffers.
The balance of the memory is available for user problems.  Considering
this figure in BYTES (a Cray word is 8 bytes, so 176 Kwds is about 1.4
Mbytes), and looking *today*, I find these numbers:

	VAX 4.3 kernel:		4.2 Mbytes resident (2.6 Mbytes of buffers)
	Gould UTX2.0:		2.3 Mbytes resident
	XMP UNICOS 2:		1.4 Mbytes resident, with buffers
	Sun-3/50 SUNOS3.4:	0.6 Mbytes resident (no disk drives)

These numbers are determined from kernel printf()s at boot time, or TOP,
and are not estimates.  I would say that the XMP UNICOS kernel stacks up
pretty well against its slower brethren.  I would also like to
mention that COS (both 1.15 and 1.16) uses more memory than UNICOS.  I
don't have the figures handy (and rebooting the Cray just to get them
wouldn't make me very popular), but I remember it as being several hundred
Kwds more.  Still not a "HUGE" difference.

Having said all that, I don't think that operating system size is enormously
important, as long as it isn't "too big".  One thing that I'm sure we can all
agree on is that XMPs don't have enough main memory, considering their
speed.  It's a lot like the old PDP-11/45 with 256Kbytes of bi-polar memory:
very fast (in its day, for the price), but only enough memory for 2 or 3
sizeable compute-bound programs.

I'd also like to note that if your workload is entirely batch, then
there may not be any strong reason to run UNIX on your Cray.  However,
let me tell you that using UNIX on a Cray is pretty heady stuff.  Being
able to open an "XMP" window or a "Cray-2" window on my Sun, and have
the same Shells, screen editors, compilers, source code tools, TCP
networking, etc.etc.etc. as I have on my Suns, SGIs, VAXen, Goulds, and
Alliants is worth a lot to me.  Being able to "RSH" an image processing
command over to a Cray without having to put the files over on the Cray
first, or having to submit some silly batch job, is a really big win.
Consider doing an operation like this in any other environment;  only in
an all VAX/VMS+DECNET software environment do you stand a good chance --
but that isn't multi-vendor (or nearly as fast):

	pixinterp2x -s512  < image.pix |  \
	rsh Cray.arpa "pixfilter -s512 -lo" |  \
	rsh Alliant.arpa "pixmerge -n 63/0/127 -- - background.pix" |  \
	rsh Vax.arpa "pixrot -r -i 1024 1024 | pix-fb -h"

Which roughly says:  grab an image on my local machine, perform
bi-linear interpolation locally, then send it to the Cray for
low-pass filtering, then send it to the Alliant for compositing
with my favorite background, then send it to a trusty VAX to
(a) rotate the image 180 degrees and (b) display it on the frame buffer.

This, by the way, is not an invented shell command.  These programs
really exist, and are commonly used in this way. Note how the image only
"touches" a disk drive once in the whole procedure. Perhaps less
important for this example, because the image at most stages is only 3
Mbytes, but this becomes more important when manipulating 400Mbyte image
data from NASA (which we have occasion to do, in exactly this way).
(By the way, this software is available at no cost, E-mail me for
details). Note that these procedures may take significant CPU time, but
you can be certain that I'll be paying careful attention to the screen
as my results arrive.  This >>could<< be done in batch, but then I
wouldn't have the opportunity to type ^C (SIGINT) in the middle if
something is going wrong.  Think of how much Cray time (and people
time!!) that might wind up saving you.
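
For the curious, here is a minimal sketch of the idea (it is not one of
the pix tools, just an illustration): a pipeline filter that notices the
SIGINT, reports how far it got, and quits cleanly instead of vanishing
without a word.

	/*
	 * Sketch only (not one of the pix tools): a filter that catches
	 * SIGINT, reports its progress, and exits cleanly.
	 */
	#include <stdio.h>
	#include <signal.h>

	static volatile sig_atomic_t interrupted = 0;

	static void
	on_sigint(int sig)
	{
	    (void)sig;
	    interrupted = 1;        /* just set a flag; act on it below */
	}

	int
	main(void)
	{
	    long bytes_done = 0;
	    int c;

	    signal(SIGINT, on_sigint);

	    /* Stand-in "processing": copy stdin to stdout, counting bytes. */
	    while (!interrupted && (c = getchar()) != EOF) {
	        putchar(c);
	        bytes_done++;
	    }

	    if (interrupted) {
	        fprintf(stderr, "interrupted after %ld bytes\n", bytes_done);
	        return 1;
	    }
	    return 0;
	}

Multiply that by a 400 Mbyte NASA image, and those few lines of
bookkeeping pay for themselves the first time something goes wrong.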

>> besides, are you really going to run a program that 
>> takes 5 hours (of CPU time) to run interactively?

For some programs, the answer is clearly "no".  But perhaps, if your
program were generating graphics describing its progress "on the fly",
you might gain new insight into the problem that you are studying.  If you think
hard enough, almost anything can benefit from graphics.  Even watching
100000x100000 matrices being inverted is likely to improve your
understanding. You might gain a new understanding of the convergence
properties of your algorithm if you could sample every Nth iteration as
a picture on your screen.  Think about it.
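
To make that concrete, here is a rough sketch (the grid size, the
relaxation sweep, and the raw 8-bit greyscale output are all placeholders;
your own code and display tools will differ) of a solver that emits a
picture of every Nth iteration on stdout:

	/*
	 * Rough sketch: an iterative solver that writes its grid as 8-bit
	 * greyscale on stdout every Nth sweep, so the convergence can be
	 * watched as a sequence of 512x512 frames.  The relaxation sweep
	 * is only a stand-in for the real computation.
	 */
	#include <stdio.h>

	#define SIZE   512
	#define EVERY  10               /* emit a frame every 10 iterations */

	static double grid[SIZE][SIZE];

	static void
	dump_frame(void)
	{
	    unsigned char row[SIZE];
	    int i, j;

	    for (i = 0; i < SIZE; i++) {
	        for (j = 0; j < SIZE; j++) {
	            double v = grid[i][j];
	            if (v < 0.0) v = 0.0;
	            if (v > 1.0) v = 1.0;
	            row[j] = (unsigned char)(v * 255.0);
	        }
	        fwrite(row, 1, SIZE, stdout);
	    }
	    fflush(stdout);
	}

	int
	main(void)
	{
	    int iter, i, j;

	    grid[SIZE/2][SIZE/2] = 1.0;     /* some initial condition */

	    for (iter = 1; iter <= 1000; iter++) {
	        /* one relaxation sweep (the "real work" would go here) */
	        for (i = 1; i < SIZE - 1; i++)
	            for (j = 1; j < SIZE - 1; j++)
	                grid[i][j] = 0.25 * (grid[i-1][j] + grid[i+1][j]
	                                   + grid[i][j-1] + grid[i][j+1]);

	        if (iter % EVERY == 0)
	            dump_frame();           /* a picture of this iteration */
	    }
	    return 0;
	}

Pipe its output through rsh to whatever machine drives your display, in
the same spirit as the pipeline above, and you get a running movie of
the convergence nearly for free.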

In closing, I'd like to summarize by observing that COS isn't a "bad"
system, it just lacks lots of things that have come to be important.
Good interactivity, network access, and portable software are not easy
to do without in the fast-track business of "high-tech".

	Best,
	 -Mike Muuss
	  Ballistic Research Lab

PostScripts:

1)  Yes, I know what UT-2D is.  The result of Texan ingenuity striving
desperately to avoid COS's forefathers.

>> Scholars who study dinosaurs say there were some smart dinosaurs and lots
>> of stupid dinosaurs.  Those smart dinosaurs came along early, but in the
>> survival wars, please note, the stupid dinosaurs won.

2)  I'm sorry, I hadn't noticed that any dinosaurs really "won". Abusing
one of my favorite quotes seems appropriate: "Using TSO is like kicking
a dead dinosaur up the beach". And, speaking of saurians and TSO, have
you taken careful notice of IBM's announcement about AIX (UNIX)?  It
looks like after many years, IBM may finally be offering their customers
some software that is as classy as their hardware.  We are flying out a
team next week to investigate.  (Proving that I too can ramble).

reeder@ut-emx.UUCP (William P. Reeder) (03/18/88)

In article <12452@brl-adm.ARPA>, mike@BRL.ARPA (Mike Muuss) writes:
> Compute performance is not operating system-specific, but instead
> compiler-specific, and Cray provides substantially the same compilers
> under both systems.
Well, operating systems have a definite effect on I/O (since programs
usually use system calls, perhaps indirectly, but at some level) and also
on scheduling and management of the job mix.  This can have a profound
effect on the throughput of your system.  You say that UNICOS turns out to
be faster than COS on I/O (about 10%); I'm very glad to hear that.
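
To make the point concrete, here is a sketch (nothing Cray-specific,
just an illustration): even a plain copy program stands or falls with
how the system services its read() and write() calls, and with how much
data it moves per call.

	/*
	 * Illustration only: whether you call read()/write() yourself or
	 * let stdio do it, the data moves through system calls, so the OS
	 * (and the size of each transfer) sets the pace.
	 */
	#include <stdio.h>
	#include <unistd.h>

	#define BUFSIZE (64 * 1024)     /* try 1 byte, then 64K, and time it */

	int
	main(void)
	{
	    static char buf[BUFSIZE];
	    long calls = 0;
	    ssize_t n;

	    while ((n = read(0, buf, sizeof buf)) > 0) {
	        if (write(1, buf, (size_t)n) != n) {
	            perror("write");
	            return 1;
	        }
	        calls++;
	    }
	    fprintf(stderr, "%ld read() calls of up to %d bytes each\n",
	            calls, BUFSIZE);
	    return 0;
	}

Time it with the tiny buffer and then the big one, and it is the
operating system, not the CPU, that sets the pace.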

> The UNICOS kernel on our XMP is configured for a large load, and uses
> 176 Kwds total for its resident image plus disk, terminal, and network buffers.
I guess Cray has done a lot of work lately, last time I heard (more than a
year ago, when we were considering running UNICOS part time on our X-MP/24)
it (UNICOS) took up more memory than COS.  I'm glad to know that it doesn't
anymore.

> Having said all that, I don't think that operating system size is enormously
> important, as long as it isn't "too big".  One thing that I'm sure we can all
> agree on is that XMPs don't have enough main memory, considering their
> speed.
I'll second that.  4 Megawords really isn't much when you start talking about
2D arrays in almost any problem.  I know you can get more memory, but it is
kind of like memory in the 8086/8088, difficult to cross the boundaries from
one chunk of 4 MW (on the Cray, not the Intel) to another.  We have a few
people who got upset when COS got bigger from 1.14 to 1.15, so they didn't
want UNICOS (back when we were told it was even bigger, no longer true).
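
To put rough numbers on that (this is just arithmetic, not anything
machine-specific), consider what a single square array of 64-bit reals
does to 4 Mwords:

	/*
	 * Just arithmetic (illustration only): how much of a 4 Mword
	 * machine a single square array of 64-bit reals occupies.
	 */
	#include <stdio.h>

	int
	main(void)
	{
	    long memory = 4L * 1024 * 1024;     /* 4 Mwords, one real per word */
	    long n;

	    for (n = 512; n <= 4096; n *= 2)
	        printf("%4ld x %-4ld array: %8ld words (%.0f%% of memory)\n",
	               n, n, n * n, 100.0 * (double)(n * n) / (double)memory);
	    return 0;
	}

A 2048 x 2048 array is the whole machine by itself, before you have
loaded a single line of code.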

> 	pixinterp2x -s512  < image.pix |  \
> 	rsh Cray.arpa "pixfilter -s512 -lo" |  \
> 	rsh Alliant.arpa "pixmerge -n 63/0/127 -- - background.pix" |  \
> 	rsh Vax.arpa "pixrot -r -i 1024 1024 | pix-fb -h"
This is the kind of thing we want to drag (probably kicking and screaming
the whole way :-) our users into doing.  There might be a problem with
response time if everyone were using the Cray in this way, since the
scheduler could no longer delay some jobs for hours or days (or even
weeks, as seems to happen around here), but we want to move in that
direction.

> Even watching
> 100000x100000 matrices being inverted is likely to improve your
> understanding. You might gain a new understanding of the convergence
> properties of your algorithm if you could sample every Nth iteration as
> a picture on your screen.  Think about it.
I agree, but alas, most users aren't interested in doing that sort of thing.
When I worked in our User Services division I spent the majority of my time
reading manuals to people over the phone or writing the short test programs
that they should have written.  Most of the programming (if you can call it
that :-) was done by overworked and underpaid graduate students who just
wanted to get something running.

> In closing, I'd like to summarize by observing that COS isn't a "bad"
> system, it just lacks lots of things that have come to be important.
> Good interactivity, network access, and portable software are not easy
> to do without in the fast-track business of "high-tech".
Yep.

> 	Best,
> 	 -Mike Muuss
> 	  Ballistic Research Lab
 
> And, speaking of saurians and TSO, have
> you taken careful notice of IBM's announcement about AIX (UNIX)?  It
> looks like after many years, IBM may finally be offering their customers
> some software that is as classy as their hardware.  We are flying out a
> team next week to investigate.  (Proving that I too can ramble).
I went to a "non-disclosure" meeting about AIX (that means they don't
disclose anything to us).  I am pleased that they are working on UNIX and
they have some interesting extensions in development, but getting an IBM
to talk to an ASCII terminal is going to be somewhat of a problem.  I hope
they overcome it.  We are interested in AIX and are talking to IBM.

You can learn a lot by listening to people ramble, if you have the patience.
That's why I read stuff on the net.

-- 
William {Wills,Card,Weekly,Virtual} Reeder	reeder@emx.utexas.edu

Scholars who study dinosaurs say there were some smart dinosaurs and lots
of stupid dinosaurs.  Those smart dinosaurs came along early, but in the
survival wars, please note, the stupid dinosaurs won.

DISCLAIMER:	I speak only for myself, and usually only to myself.