[comp.unix.wizards] Cray I/O

jfh@rpp386.Dallas.TX.US (John F. Haugh II) (06/02/89)

In article <4616@alvin.mcnc.org> spl@mcnc.org.UUCP (Steve Lamont) writes:
>A common criticism heard (and one that I made until I began working on
>interactive Crays some three years ago) is that you don't want to be running
>vi or emacs on a Cray.  The character level interrupts will kill the
>poor thing.  This is simply not true.  They simply do not happen
>frequently enough (from the machine's perspective) to worry about.

I have also heard that the overhead of sending single character I/O
requests over the hyperchannel was extremely costly.  The discussion
I've read claims that the amount of effort handling single character
at a time I/O [ like with Vi ] is too expensive in terms of CPU
cycles.

There is a posting in one of the sources groups which provides a Vi
frontend to ed running on a Cray.  I think it had a better description
of why Vi is a big loser on a supercomputer.

I was also going to disagree with Steve's remark about memory
scheduling by pointing out that the machine has several hundred
megawords of physical memory.  But I think that if you worked out the
ratio of compute [ in MIPS or MFLOPS ] to physical memory, you
might discover that compared to a 386 PC with 8MB, a Cray is short
on memory.  What do you think?
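
One way to make that comparison concrete is to work out bytes of
physical memory per MFLOPS for each machine.  The sketch below is only
an illustration: the MFLOPS figures and the 256 MW memory size are
made-up round numbers, not measurements, and the helper names are
hypothetical.  The only figures taken from this thread are the 8MB PC
and the 8-byte Cray word; a megaword is assumed to be 2**20 words.

```python
WORD_BYTES = 8  # a Cray word is 64 bits


def megawords_to_bytes(megawords):
    """Convert Cray megawords to bytes (assumes 1 MW = 2**20 words)."""
    return megawords * 2**20 * WORD_BYTES


def bytes_per_mflops(memory_bytes, mflops):
    """Physical memory available per MFLOPS of compute."""
    if mflops <= 0:
        raise ValueError("mflops must be positive")
    return memory_bytes / mflops


# Hypothetical round numbers, for illustration only (not measurements).
PC_MFLOPS = 0.5
CRAY_MFLOPS = 1000.0
CRAY_MEGAWORDS = 256

pc_ratio = bytes_per_mflops(8 * 2**20, PC_MFLOPS)
cray_ratio = bytes_per_mflops(megawords_to_bytes(CRAY_MEGAWORDS), CRAY_MFLOPS)

print("PC   bytes per MFLOPS:", pc_ratio)
print("Cray bytes per MFLOPS:", cray_ratio)
```

With real benchmark figures plugged in, the two ratios show directly
which machine has less memory for the compute it delivers.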
-- 
John F. Haugh II                        +-Button of the Week Club:-------------
VoiceNet: (512) 832-8832   Data: -8835  | "AIX is a three letter word,
InterNet: jfh@rpp386.Cactus.Org         |  and it's BLUE."
UucpNet : <backbone>!bigtex!rpp386!jfh  +--------------------------------------

shore@mtxinu.COM (Melinda Shore) (06/03/89)

[]
In article <16618@rpp386.Dallas.TX.US> jfh@rpp386.cactus.org (John F. Haugh II) writes:
>I have also heard that the overhead of sending single character I/O
>requests over the hyperchannel was extremely costly.  The discussion
>I've read claims that the amount of effort handling single character
>at a time I/O [ like with Vi ] is too expensive in terms of CPU
>cycles.

Yes, sending small packets over HyperChannel is very expensive, but
many sites are now using other media (notably Cray's FEI-3).  It's also
worth noting that HyperChannel is poorly suited to TCP/IP, but that's
another issue ...

Cray performance, at least on the Cray 1, the X-MP, and the Y-MP, is
far more hampered by limited main memory.  Until fairly recently you
couldn't get more than 8 megawords on an X and memory gets used up
pretty quickly.  Even the 32 MW available on the Y-MP is smallish when
you consider that you've probably got 8 processors and you've got
several hundred users running huge jobs.  Remember also that these are
word-oriented machines, and no instruction is smaller than 1 word (8
bytes).  Swapping performance used to be pretty awful too;  I hope
that's been fixed.

Back to I/O:  given the choice between editing the file on the Cray or
downloading it, doing the editing on a workstation, and uploading the
file back to the mainframe, it would appear that it's cheaper to edit on
the Cray.  After all, the other way you're transferring the entire file
twice, which, while probably not as expensive as typing the whole thing
in on the Cray in the first place, is almost certainly more expensive
than making a typically limited number of edits.  At the same time, you
probably want to avoid the use of an editor like GNU Emacs, which has a
huge executable and apparently makes some obscene number of system
calls per keystroke (Unicos isn't multithreaded yet).  The best
solutions are probably a distributed editor like rvi or NFS mounting
your Cray directories on workstations and doing the editing there.
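
The trade-off between copying the file twice and editing it in place
can be sketched as a count of bytes on the wire.  This is a rough model
under stated assumptions, not a measurement: the file size, keystroke
count, and per-packet overhead are hypothetical values, and the
function names are invented for the example.  It assumes every
keystroke travels in its own packet and is answered by one echo packet.

```python
def round_trip_bytes(file_bytes):
    """Bytes moved when the whole file is downloaded, then uploaded."""
    return 2 * file_bytes


def interactive_bytes(keystrokes, packet_overhead, echo_bytes=1):
    """Bytes moved when each keystroke is its own packet plus one echo.

    packet_overhead is the assumed per-packet header/framing cost in bytes.
    """
    per_keystroke = (1 + packet_overhead) + (echo_bytes + packet_overhead)
    return keystrokes * per_keystroke


# Hypothetical values, for illustration only.
FILE_BYTES = 200_000
KEYSTROKES = 500
PACKET_OVERHEAD = 64

copy_cost = round_trip_bytes(FILE_BYTES)
edit_cost = interactive_bytes(KEYSTROKES, PACKET_OVERHEAD)
print("copy twice:", copy_cost, "bytes; edit in place:", edit_cost, "bytes")
```

The model counts bytes only.  It says nothing about the CPU cost of
handling each small packet or system call, which is the other half of
the argument.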

The whole question of the desirability of running Unix on a Cray is 
generally brought up by batch OS people who don't know Unix and by
Unix people who don't know supercomputing or the specifics of Cray's
Unix implementation, which doesn't look much like Unix anymore.
-- 
Melinda Shore                                     shore@mtxinu.com
Mt Xinu                                  ..!uunet!mtxinu.com!shore

shore@mtxinu.COM (Melinda Shore) (06/03/89)

In article <873@mtxinu.UUCP> shore@mtxinu.com (that's me) writes:
>[]
>Remember also that these are
>word-oriented machines, and no instruction is smaller than 1 word (8
>bytes).  

That's right, I goofed.  Cray instructions come in 16-bit parcel units,
and an instruction is either one or two parcels.  Binaries, however,
are still unexpectedly large, and I don't think that my most egregious oops
invalidates the substance of what I was saying.
-- 
Melinda Shore                                     shore@mtxinu.com
Mt Xinu                                  ..!uunet!mtxinu.com!shore

spl@mcnc.org (Steve Lamont) (06/03/89)

In article <873@mtxinu.UUCP> shore@mtxinu.com (Melinda Shore) writes:
>several hundred users running huge jobs.  Remember also that these are
>word-oriented machines, and no instruction is smaller than 1 word (8
>bytes).  Swapping performance used to be pretty awful too;  I hope
>that's been fixed.

One minor correction and then we can probably either move this topic
elsewhere or give it a rest.  The instructions are variable length and
may be either one or two "parcels" in length.  A parcel is 16 bits long
and, obviously, there are 4 parcels per Cray word.  Parcels may span
word boundaries.
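
The packing can be sketched in a few lines.  This is a minimal model of
the layout described above, assuming instructions are laid down back to
back with no padding; the function name and the tuple layout are
hypothetical, and it models only the one- and two-parcel formats.

```python
PARCEL_BITS = 16
WORD_BITS = 64
PARCELS_PER_WORD = WORD_BITS // PARCEL_BITS  # 4 parcels per Cray word


def layout(lengths):
    """Place instructions back to back and report where each one lands.

    lengths: instruction lengths in parcels (1 or 2 each).
    Returns a list of (word, parcel_offset, length, spans_word_boundary).
    """
    placed = []
    pos = 0  # running parcel index from the start of the code
    for n in lengths:
        if n not in (1, 2):
            raise ValueError("this sketch models 1- and 2-parcel instructions")
        first_word, offset = divmod(pos, PARCELS_PER_WORD)
        last_word = (pos + n - 1) // PARCELS_PER_WORD
        placed.append((first_word, offset, n, last_word != first_word))
        pos += n
    return placed


for entry in layout([1, 2, 2, 1]):
    print(entry)
```

In this example the third instruction starts in the last parcel of word
0 and finishes in the first parcel of word 1, so it is the one that
spans a word boundary.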

As far as swap performance goes, I'll have to leave that to the
performance junkies.  I like my Cray time stand alone, so I don't have
to worry about all those pesky users.  I don't get it that way... but I
do like it that way... :-)

Your other comments about hyperchannel (or Ultrabus) are apt.  These
devices like large blocks of data.  Small packets can be murder,
particularly if they have to compete with, say, a mass storage subsystem
of any sort, also on the hyperchannel.  However, the productivity
enhancement of doing editing (touch up kind -- the serious text editing
belongs on a workstation) is worth it.

-- 
							spl
Steve Lamont, sciViGuy			EMail:	spl@ncsc.org
North Carolina Supercomputing Center	Phone: (919) 248-1120
Box 12732/RTP, NC 27709

kjm@ut-emx.UUCP (06/03/89)

Melinda Shore writes:

> Cray performance, at least on the Cray 1, the X-MP, and the Y-MP, is
> far more hampered by limited main memory.  Until fairly recently you
> couldn't get more than 8 megawords on an X and memory gets used up
> pretty quickly.  Even the 32 MW available on the Y-MP is smallish when
> you consider that you've probably got 8 processors and you've got
> several hundred users running huge jobs.

No argument so far...

> Remember also that these are
> word-oriented machines, and no instruction is smaller than 1 word (8
> bytes).

On the X-MP, at least, instructions are composed of one or two 16-bit
"parcels", thus there are two to four instructions per word.

> Swapping performance used to be pretty awful too;  I hope
> that's been fixed.

It depends on exactly how the system is configured.  It is possible to
get fairly substantial performance improvements.  There are also lots
of ways to hang oneself here.

> The whole question of the desirability of running Unix on a Cray is 
> generally brought up by batch OS people

Or by people who prefer another (interactive) operating system (I know of
one person who would like to run VMS on a Cray).

Or by people (like me) who just plain think it can be done better.
(Although not by the likes of VMS...)

> who don't know Unix and by
> Unix people who don't know supercomputing or the specifics of Cray's
> Unix implementation, which doesn't look much like Unix anymore.

Actually, I have seen lots of code sequences in UNICOS 4.0 that look a
*whole lot* like the code I read in V6...

--
The above viewpoints are mine.  They are unrelated to those of
anyone else, including my wife, our cats, and my employer.

Kenneth J. Montgomery

kjm@hermes.chpc.utexas.edu          University of Texas System
kjm@cerberus.chpc.utexas.edu        Center for High Performance Computing

steve@nuchat.UUCP (Steve Nuchia) (06/04/89)

In article <873@mtxinu.UUCP> shore@mtxinu.com (Melinda Shore) writes:
>calls per keystroke (Unicos isn't multithreaded yet).  The best
>solutions are probably a distributed editor like rvi or NFS mounting
>your Cray directories on workstations and doing the editing there.

Distributed editors, yes.  But mounting an NFS partition from
the Cray and firing up a conventional editor on the file will
result in its contents being copied across the wire in both
directions, which we were trying to avoid.  It is more convenient
for the user than using FTP (or a moral equivalent) twice, but
no less expensive for the super.
-- 
Steve Nuchia	      South Coast Computing Services
uunet!nuchat!steve    POB 890952  Houston, Texas  77289
(713) 964 2462	      Consultation & Systems, Support for PD Software.

mjt@super.ORG (Michael J. Tighe) (06/04/89)

In article <873@mtxinu.UUCP> shore@mtxinu.com (Melinda Shore) writes:
> Cray performance, at least on the Cray 1, the X-MP, and the Y-MP, is
> far more hampered by limited main memory.  Until fairly recently you
> couldn't get more than 8 megawords on an X and memory gets used up
> pretty quickly.
 
I am not so sure. I agree everybody wants more memory, but when each
of these machines came out (1/X/Y) the standard configuration came
with more memory than any other supercomputer (except the Cray-2)
available at that time. (There was a Cyber 205 that had 16 MW, but
that was a special case.) A 16 MW X was available over 3 years ago.
Also, they are constantly upgraded (X went from 4 MW to 64 MW, Y from
32 MW to 128 MW).
 
I think the real culprit is software. If you look at how much the
kernel (and utilities) have grown over the years, you will see where
all your memory has gone. By the time you get done with NFS, NQS, X11,
etc., there is no room left. I was reading the Standard C manual and it
said it needed 2 MW just to be built. Just a few years ago I booted
UNICOS on a 1 MW system. Now I can't even build a compiler on a 2 MW
system.
 
> At the same time, you probably want to avoid the use of an editor like
> GNU Emacs, which has a huge executable and apparently makes some
> obscene number of system calls per keystroke (Unicos isn't
> multithreaded yet).
 
Actually, the performance of GNU Emacs is not as bad as one might
think. You can lock it into a single CPU, and it can be compiled with
shared text. But this does not mean I believe you should turn your $25
million machine into a word processor.
 
> The best solutions are probably a distributed editor like rvi or NFS
> mounting your Cray directories on workstations and doing the editing
> there.
 
NFS seems to be a good choice, although it does have some security
problems. Also, by using Emacs and the function "compile-it" you can
execute your code on the Cray without ever leaving your editor or
logging on to the Cray.
-- 
-------------
Michael Tighe
internet: mjt@super.org
   uunet: ...!uunet!super!mjt

mjt@super.ORG (Michael J. Tighe) (06/04/89)

In article <16618@rpp386.Dallas.TX.US> jfh@rpp386.cactus.org (John F. Haugh II) writes:
> I have also heard that the overhead of sending single character I/O
> requests over the hyperchannel was extremely costly.
 
Yes, it is inefficient. Hyperchannel was designed for bulk transfers,
not character-oriented I/O. Therefore you run into a lot of problems,
such as sporadic output and delays between keystrokes.
-- 
-------------
Michael Tighe
internet: mjt@super.org
   uunet: ...!uunet!super!mjt

mjt@super.ORG (Michael J. Tighe) (06/04/89)

In article <4623@alvin.mcnc.org> spl@mcnc.org.UUCP (Steve Lamont) writes:
> One minor correction and then we can probably either move this topic
> elsewhere or give it a rest.  The instructions are variable length and
> may be either one or two "parcels" in length.
 
Well, let's not give it a rest until we get it completely correct. The
Y-MP can have 1, 2, or 3 parcel instructions.
 
> As far as swap performance goes, I'll have to leave that to the
> performance junkies.
 
There are a variety of ways to increase swapping performance, but I
think the first is to use a portion of your SSD as the swap device.
This will probably give the single biggest increase in performance.
-- 
-------------
Michael Tighe
internet: mjt@super.org
   uunet: ...!uunet!super!mjt

mjt@super.ORG (Michael J. Tighe) (06/04/89)

In article <9989@nuchat.UUCP> steve@nuchat.UUCP (Steve Nuchia) writes:
> But mounting an NFS partition from the cray and firing up a
> conventional  editor on the file will result in its contents being
> copied across the wire in both directions, which we were trying to
> avoid.
 
But the "work" is being done by the IOP's and that is what they are
there for. The Cray CPU's are not doing the work, which is what you
want. However, if you use an editor on the Cray, the CPU's are doing
the work. I guess the problem (for you) is one of trading CPU
performance vs network performance. I think if you look at the cost of
your CPU's vs the cost of your network, you will see that you should
be saving your CPU's...
-- 
-------------
Michael Tighe
internet: mjt@super.org
   uunet: ...!uunet!super!mjt