[comp.unix.cray] How to use an SSD?

bernhold@qtp.ufl.edu (David E. Bernholdt) (05/29/90)

SSDs have been around on Crays for a long time.  I remember about 6
years ago working on a Cray (running CTSS) where you used the SSD as a
high-speed disk, but had to do some special things in the code to do
it -- system calls (my memory of the details is rather dim).  

I know SSDs are still around, but they don't seem to be used in the
same way any more.  I've poked around in the man pages, and there are
a few things related to SSDs, but not much.  I had always assumed that
the OS had been taught how to use them & most of the stuff in the
man pages points in that direction.

On the other hand, in the Fall 1989 Cray Channels, there is an article
entitled "Designing effective out-of-core solutions" by Moshe Reshef,
which makes heavy use of SSDs -- apparently explicitly.  So now I've
become curious about a few things...

How are SSDs presently used (under UNICOS)?

It seems from the routines listed in the man pages that "extended main
memory" is the way to think of an SSD if one wishes to program for it
explicitly. From the article, though, they seem to be using a fast
disk model for it.  So how *do* you actually think of it?

Does anyone actually explicitly code for an SSD?  Is it worth it? 

Is explicitly coding for the SSD more or less efficient for the
particular program than letting the OS use the SSD as it wants?  How
about for the overall system performance (i.e. what is best in the
eyes of the people who own the Cray vs. what is best for a user who
just wants his program to run as-fast-as-possible for the least cost)?

Thanks in advance for helping satisfy my curiosity...
-- 
David Bernholdt			bernhold@qtp.ufl.edu
Quantum Theory Project		bernhold@ufpine.bitnet
University of Florida
Gainesville, FL  32611		904/392 6365

brooking@mcnc.org (Jim Brooking) (05/29/90)

(Qualifier: Responding person is only a manager....)

In article <1019@orange19.qtp.ufl.edu>, bernhold@qtp.ufl.edu (David E. Bernholdt) writes:

...
> How are SSDs presently used (under UNICOS)?

On our Y-MP8/432, SSD is used as a cache for disks. On my previous
X-MP/28, it was part cache, part user-accessible as a disk.

> It seems from the routines listed in the man pages that "extended main
> memory" is the way to think of an SSD if one wishes to program for it
> explicitly. From the article, though, they seem to be using a fast
> disk model for it.  So how *do* you actually think of it?

My understanding is that it is a disk, if the site chooses to allow
users at it in that way.
> 
> Does anyone actually explicitly code for an SSD?  Is it worth it? 

Yes, again, if the site allows it, one can explicitly code for an SSD.
However, it's probably easiest if the user thinks of SSD as a disk, and
just does I/O to it. This, in turn, is easiest if one already has a
program that uses one, or a few, small scratch files that will fit
in SSD. These programs will shine on a user-accessible SSD. But will
sink on a non-SSD machine.

> Is explicitly coding for the SSD more or less efficient for the
> particular program than letting the OS use the SSD as it wants?  How
> about for the overall system performance (i.e. what is best in the
> eyes of the people who own the Cray vs. what is best for a user who
> just wants his program to run as-fast-as-possible for the least cost)?

This is an application-dependent question. From my (Comp. Center
Manager) viewpoint, I'd rather have the SSD managed by "us" rather than
the users because, in general, we can probably do it better. (Flame
away...) Most of the applications I've seen do almost as well using SSD
as a (center-managed) disk cache rather than a user-managed file system.
There are significant exceptions, however. For example, a program that
does heavy random access of a file can take a significant performance
hit if the file is cached. My advice, then, is let the Center manage SSD
as a cache, but *make* the Center be alert for cases where cache doesn't
work.

Another point is that I've never seen any convincing evidence that a
particular cache setup was effective. Crayons have accurately pointed
out that "cache is operating with around 95% hit rate". This indicates
to me that there is ample SSD allocated to caching. But if one reduced
the available-to-cache SSD space by, say, 20%, would we still get 95%
hit rates? If we removed some file systems from caching, would we still
get 95%? If we put a portion of the SSD into the hands of some carefully
chosen users, would that be a better use of it? No one knows (maybe the
Shadow knows...). Cray documentation of ldcache (what they call the SSD
cache facility) typically tells you how to invoke ldcache ("you"=the
system programmer) but not how or why, or how to tell when you got it
right. Or wrong.
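
One way to probe the "would we still get 95%?" question offline is to
replay a block-access trace through a simulated cache at several sizes.
The sketch below is only an illustration under stated assumptions: it
uses a plain LRU policy (the source does not say what replacement policy
ldcache uses), the traces are synthetic, and the names lru_hit_rate and
make_traces and all sizes are made up for the example.

```python
from collections import OrderedDict
import random


def lru_hit_rate(trace, cache_blocks):
    """Replay a block-access trace through an LRU cache; return the hit fraction."""
    cache = OrderedDict()  # keys are block numbers, ordered oldest to newest
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(trace) if trace else 0.0


def make_traces(n_accesses=20000, hot_blocks=400, file_blocks=50000, seed=0):
    """Build two synthetic traces (assumed workloads, not measured ones).

    hot:       repeated access to a small working set (cache-friendly)
    scattered: uniform random access over a large file (cache-hostile)
    """
    rng = random.Random(seed)
    hot = [rng.randrange(hot_blocks) for _ in range(n_accesses)]
    scattered = [rng.randrange(file_blocks) for _ in range(n_accesses)]
    return hot, scattered


if __name__ == "__main__":
    hot, scattered = make_traces()
    print("cache_blocks  hot_hit_rate  scattered_hit_rate")
    for size in (500, 400, 200):  # e.g. full size, 20% smaller, 60% smaller
        print(size,
              round(lru_hit_rate(hot, size), 3),
              round(lru_hit_rate(scattered, size), 3))
```

With a real trace in place of the synthetic ones, the same loop shows
how much cache can be taken away before the hit rate starts to fall,
and which access patterns get little from the cache at any size.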
> 
> Thanks in advance for helping satisfy my curiosity...

I hope so. Lots of opportunity for research with the SSD, tho.

-- 
>8-}     >:-)     %\(     8^)     :+/     |'[     ;-)     :-O     B^\    :-)
Jim Brooking........North Carolina Supercomputing Center.......(919)248-1145

djh@osc.edu (David Heisterberg) (05/29/90)

In article <1019@orange19.qtp.ufl.edu>, bernhold@qtp.ufl.edu (David E. Bernholdt) writes:
> How are SSDs presently used (under UNICOS)?

Of course, you're probably familiar with how we use ours, Dave.  We have
a 128MW SSD used as cache, and I believe swapping.  There are times that
I'd like to experiment with it - like using it for 2-electron integrals.
But for that case it's not really big enough to be used by more than one
person at a time.

I tend to think that unless a very similar SSD-like device is available
on a wide variety of machines it is not worth the effort for end users to
put much time in optimizing a program for SSD usage.

Relatedly (and this should probably be a separate thread), I find it
difficult to make good use of asynchronous i/o.
-- 
David J. Heisterberg		djh@osc.edu		And you all know
The Ohio Supercomputer Center	djh@ohstpy.bitnet	security Is mortals'
Columbus, Ohio			ohstpy::djh		chiefest enemy.

eugene@wilbur.nas.nasa.gov (Eugene N. Miya) (06/04/90)

Most people will tell you that you use the SSD like a disk.

That's fine, but in this world that's kind of ambiguous.
I did a little work a few months back (for myself, Denning would
call it "tinkering") on trying to form some opinions on what and how to
balance the amount of work on a CPU and on the SSD.  You have to be
careful because the configurations of Crays vary immensely.

My suggestion is to write a simple benchmark.  Try and figure out
how much work you need to do on a CPU to avoid SSD thrashing.
Try to make certain your codes do more work than this.
Try to develop a feel for how to structure your code before going off and
just doing it.
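
A minimal sketch of such a benchmark, assuming an ordinary scratch file
stands in for the SSD: time how long one block takes to read, time one
pass of arithmetic over a block's worth of numbers, and estimate how
many passes per block are needed before compute time matches I/O time.
All names (make_scratch_file, time_block_reads, time_work_unit,
passes_to_hide_io) and the block sizes are invented for illustration.
On a modern system the reads will mostly come from the OS file cache,
so treat the numbers as a way to get a feel, not as a measurement of
any particular device.

```python
import math
import os
import tempfile
import time


def make_scratch_file(directory, block_bytes, n_blocks):
    """Write n_blocks zero-filled blocks to a scratch file and return its path."""
    path = os.path.join(directory, "scratch.bin")
    block = bytes(block_bytes)
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # push the data out before timing reads
    return path


def time_block_reads(path, block_bytes, n_blocks):
    """Average wall-clock seconds to read one block sequentially."""
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        for _ in range(n_blocks):
            data = f.read(block_bytes)
            if len(data) < block_bytes:
                break  # file is shorter than requested
    return (time.perf_counter() - start) / n_blocks


def time_work_unit(n_values, repeats=5):
    """Average seconds for one pass of arithmetic over n_values numbers."""
    values = [float(i) for i in range(n_values)]
    start = time.perf_counter()
    for _ in range(repeats):
        total = 0.0
        for v in values:
            total += v * v
    return (time.perf_counter() - start) / repeats


def passes_to_hide_io(io_seconds, work_seconds):
    """Compute passes per block needed so compute time is at least the I/O time."""
    if work_seconds <= 0:
        raise ValueError("work_seconds must be positive")
    return max(1, math.ceil(io_seconds / work_seconds))


if __name__ == "__main__":
    block_bytes, n_blocks = 256 * 1024, 32
    with tempfile.TemporaryDirectory() as scratch_dir:
        path = make_scratch_file(scratch_dir, block_bytes, n_blocks)
        io_s = time_block_reads(path, block_bytes, n_blocks)
    work_s = time_work_unit(block_bytes // 8)  # one 64-bit word per 8 bytes
    print("seconds per block read:   ", io_s)
    print("seconds per compute pass: ", work_s)
    print("passes per block to match:", passes_to_hide_io(io_s, work_s))
```

If a code does fewer passes over each block than the last figure, it is
likely to spend its time waiting on the device rather than computing.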

If you are really hurting for memory, and you find this all too much
work (writing for an SSD), you should consider upgrading to a Cray-2 8).

See, we told you so 8).  But no one ever listens to us.

--e. nobuo miya, NASA Ames Research Center, eugene@orville.nas.nasa.gov
  {uunet,mailrus,other gateways}!ames!eugene