[net.arch] 128Mb - I give up!

john@frog.UUCP (John Woods, Software) (11/28/85)

It was claimed that having 128Mb of semiconductor memory means you don't
ever need to page or swap.  I tried to point out that that isn't true, but
my message seems not to have convinced anyone.  Fine, forget I said anything
at all.  But when you see "PANIC - NO MEMORY" on your console ("There's a
subroutine named FREE??????") don't say I didn't warn you.

jlg@lanl.ARPA (12/04/85)

The claim that I read was that 256MW (that's WORDS - as in 64 bits, not
those little 8 bit things) would eliminate the need for paging.  This
seems reasonable, since 256MW is 2GB (256M words x 8 bytes/word), which
is half the entire 4GB virtual memory space of 32 bit addressing schemes
on VAX-like machines anyway.  The machine under discussion was the
Cray-2, which presents other problems for the would-be memory management
device: disk is MUCH too slow to be very useful as a page swap device.
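
To see why, here is a back-of-envelope comparison in C.  The transfer
rates below are invented round numbers for illustration - they are not
specifications for any real disk channel or for the Cray-2's memory:

/* Back-of-envelope: how long to page 2GB through a disk channel?
 * All rates are ILLUSTRATIVE round numbers, not measured figures. */
#include <stdio.h>

int main(void)
{
    double mem_bytes = 2.0e9;   /* 256MW x 8 bytes/word           */
    double disk_rate = 1.0e6;   /* hypothetical: 1 MB/s per drive */
    double mem_rate  = 1.0e9;   /* hypothetical memory bandwidth  */

    printf("fill from disk:   %10.1f s\n", mem_bytes / disk_rate);
    printf("fill from memory: %10.1f s\n", mem_bytes / mem_rate);
    return 0;
}

Even with these generous made-up numbers, cycling the whole memory
through one disk takes on the order of half an hour; no amount of
scheduling cleverness hides a gap that wide.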

Actually, I can't remember a time when the fastest machines on the
market had virtual memory.  Page swapping can, at best, improve
throughput (usually not).  Page swapping is almost guaranteed to degrade
turn-around of individual tasks.

J. Giles
Los Alamos

omondi@unc.UUCP (Amos Omondi) (12/05/85)

> 
> Actually, I can't remember a time when the fastest machines on the
> market had virtual memory.  Page swapping can, at best, improve
> throughput (usually not).  Page swapping is almost guaranteed to degrade
> turn-around of individual tasks.
> 
> J. Giles
> Los Alamos


The Cyber 203 & 205 which can outperform the Crays on most good
days do have virtual memory.

atbowler@watmath.UUCP (Alan T. Bowler [SDG]) (12/06/85)

In article <34249@lanl.ARPA> jlg@a.UUCP (Jim Giles) writes:
>The claim that I read was that 256MW (that's WORDS - as in 64 bits, not
>those little 8 bit things) would eliminate the need for paging.

This assumption has already been made for at least one operating system.
CP-6 doesn't page or swap, and it gives nice fast response
to a respectably large number of users.  However, I have heard complaints
that the 64 megaword (36 bits) hardware limit was too restrictive,
and that people were running into it.  I haven't heard whether those
people are happy with the newer hardware limit of 256 megawords (on
bigger, more expensive machines of course).

jlg@lanl.ARPA (12/07/85)

> > Actually, I can't remember a time when the fastest machines on the
> > market had virtual memory.  Page swapping can, at best, improve
> > throughput (usually not).  Page swapping is almost guaranteed to degrade
> > turn-around of individual tasks.
>
> The Cyber 203 & 205 which can outperform the Crays on most good
> days do have virtual memory.

I guess by 'good days' you mean those days when the only code you run
uses very long vectors in highly vectorized form, or has been VERY
carefully optimized for Cybers.  I've seen a lot of benchmarks of
both machines (I work with several different vintages of Crays on a
daily basis - and most of the people I work with are interested in only
one thing - SPEED).  The Cyber does very well on specific kinds of
problems involving long vectors.  It also does reasonably well on codes
that have been carefully tailored for Cyber machines (i.e. standard
benchmark sets like the 'Livermore Loops').  The Cyber does consistently
worse than Crays for short vectors, scalar code, and code that hasn't
been recoded for the specific machine - this includes most production
codes at most of the major labs.

The problem is that vector setup time on Cybers is enormous.  You are
right that the asymptotic speed of Cybers is faster than the older
Crays, but that is only for brief spurts of pure vector code.  This
extreme vector setup time means that short vectors don't run very fast
at all (i.e. multiplying two 3x3 matrices is not very efficient on
Cybers).  Long vectors, where the pipeline time dominates, run very
fast indeed.  Real production codes have a heterogeneous mix of vector
lengths, as well as a lot of inherently scalar code for which the Cyber
doesn't compete well at all.
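
To make the setup-time argument concrete, here is a toy C model of
pipeline timing, time(n) = startup + n*per_element.  The constants are
invented for illustration - they are not measurements of any real
Cyber or Cray:

/* Toy model of pipelined vector timing.  Machine A has a long startup
 * but a fast pipe; machine B has a short startup and a slower pipe.
 * All numbers are ILLUSTRATIVE, in arbitrary cycle units. */
#include <stdio.h>

int main(void)
{
    double start_a = 100.0, per_elem_a = 0.5;  /* "long-startup" A  */
    double start_b = 15.0,  per_elem_b = 1.0;  /* "short-startup" B */
    int n;

    printf("    n   machine A   machine B   (results/cycle)\n");
    for (n = 1; n <= 1024; n *= 4) {
        double ta = start_a + per_elem_a * n;  /* time for n results */
        double tb = start_b + per_elem_b * n;
        printf("%5d   %9.3f   %9.3f\n", n, n / ta, n / tb);
    }
    return 0;
}

Machine A (Cyber-like in this respect) only pulls ahead once the
vectors get long; machine B wins on the short vectors that dominate
production codes.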

Meanwhile, vector setup time on the Cray is always short and predictable,
even for data that is not contiguous and (with the X-MP) even for
gather/scatters.  This means that short vectors (which constitute a large
proportion of many codes) run nearly as efficiently as long vectors.
Generally, for most codes with heterogeneous mixes of vector lengths,
older Crays run slightly faster than Cybers - new Crays (X-MPs, Cray-2)
run much faster.
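
For anyone who hasn't met the term, 'gather/scatter' just means vector
loads and stores through an index vector.  A plain-C sketch of the
semantics (the hardware does this at vector speed; the code below only
illustrates what the operations mean):

#include <stdio.h>

/* gather: dst[i] = src[idx[i]] -- an indexed vector load */
void gather(double *dst, const double *src, const int *idx, int n)
{
    int i;
    for (i = 0; i < n; i++)
        dst[i] = src[idx[i]];
}

/* scatter: dst[idx[i]] = src[i] -- an indexed vector store */
void scatter(double *dst, const double *src, const int *idx, int n)
{
    int i;
    for (i = 0; i < n; i++)
        dst[idx[i]] = src[i];
}

int main(void)
{
    double src[5] = { 10, 20, 30, 40, 50 }, out[3];
    int idx[3] = { 4, 0, 2 };
    gather(out, src, idx, 3);
    printf("%g %g %g\n", out[0], out[1], out[2]);  /* 50 10 30 */
    return 0;
}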

The virtual memory is actually a large part of the speed degradation.
In order to run vectors efficiently, a vector must not span page
boundaries.  This means that each new vector operation must have its
data moved around in memory so that page faults don't occur in the
vector unit.  If the Cyber had a very large central memory instead of
virtual memory, it would almost certainly be a faster machine (and
would therefore compete better than it has).
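
The constraint is easy to state in code.  A sketch of the
page-boundary test involved; the 4096-byte page size is a made-up
figure for illustration, not the Cyber's actual page size:

#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096  /* hypothetical page size in bytes */

/* Does an n-element vector starting at addr cross a page boundary? */
int spans_page_boundary(uintptr_t addr, size_t n_elems, size_t elem_size)
{
    uintptr_t first_page = addr / PAGE_SIZE;
    uintptr_t last_page  = (addr + n_elems * elem_size - 1) / PAGE_SIZE;
    return first_page != last_page;
}

int main(void)
{
    /* A 64-word vector starting near the end of a page crosses into
       the next page; the runtime must then copy or realign the data
       before the vector unit can stream it. */
    printf("%d\n", spans_page_boundary(4000, 64, 8));  /* 1: crosses   */
    printf("%d\n", spans_page_boundary(0,    64, 8));  /* 0: contained */
    return 0;
}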

J. Giles
Los Alamos

wagle@iuvax.UUCP (12/07/85)

  Now people are claiming that 2 Gigabytes means you don't need virtual
memory.  I'd like to point out that while it probably would mean that your
program could be completely in main memory, you (at least I) would still want
the wonders of segmentation (read-only, shared, bounds checks).  Moreover,
demand paging lets you avoid reading the entire program into memory from
disk.  (Which could be expensive and needless for my 1 Gigabyte program (:-)).
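
A toy sketch of the checks segmentation buys you.  The field names and
layout are invented for illustration - no real MMU is being modeled,
this is just the logic:

#include <stdio.h>

struct segment {
    unsigned base;   /* start of segment in physical memory */
    unsigned limit;  /* segment length in words             */
    int writable;    /* 0 = read-only                       */
    int shared;      /* visible to more than one process    */
};

/* Returns the physical address, or -1 on a protection fault. */
long translate(const struct segment *s, unsigned offset, int is_write)
{
    if (offset >= s->limit)       return -1;  /* bounds check       */
    if (is_write && !s->writable) return -1;  /* write to read-only */
    return (long)s->base + offset;
}

int main(void)
{
    struct segment code = { 0x1000, 256, 0, 1 };  /* shared, read-only */
    printf("%ld\n", translate(&code, 10, 0));   /* ok: prints 4106 */
    printf("%ld\n", translate(&code, 10, 1));   /* fault: -1       */
    printf("%ld\n", translate(&code, 999, 0));  /* fault: -1       */
    return 0;
}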

Perry Wagle, Indiana University, Bloomington Indiana.
...!ihnp4!inuxc!iuvax!wagle	(USENET)
wagle@indiana			(CSNET)
wagle%indiana@csnet-relay	(ARPA)

sambo@ukma.UUCP (Father of micro-ln) (12/07/85)

In article <696@unc.unc.UUCP> omondi@unc.UUCP (Amos Omondi) writes:
>The Cyber 203 & 205 which can outperform the Crays on most good
>days do have virtual memory.

Our experience has been that the Cyber 205 is 5 times slower than a Cray
(I don't know if it's an X-MP or a Cray-1) - hardly worth using for our
application.
--
Samuel A. Figueroa, Dept. of CS, Univ. of KY, Lexington, KY  40506-0027
ARPA: ukma!sambo<@ANL-MCS>, or sambo%ukma.uucp@anl-mcs.arpa,
      or even anlams!ukma!sambo@ucbvax.arpa
UUCP: {ucbvax,unmvax,boulder,oddjob}!anlams!ukma!sambo,
      or cbosgd!ukma!sambo

	"Micro-ln is great, if only people would start using it."

dik@zuring.UUCP (12/08/85)

In article <696@unc.unc.UUCP> omondi@unc.UUCP (Amos Omondi) writes:
>
>The Cyber 203 & 205 which can outperform the Crays on most good
>days do have virtual memory.

But it frequently becomes a pain in the neck.
In a number of cases, to get good performance, you have to contort
your program so that the VM system is bypassed.
-- 
dik t. winter, cwi, amsterdam, nederland
UUCP: {seismo,decvax,philabs,okstate,garfield}!mcvax!dik

mangoe@umcp-cs.UUCP (Charley Wingate) (12/09/85)

Maybe I just don't understand, but I do think there comes a point where
more memory just isn't going to help significantly.

Let's look at this in terms of pages.  With a proper replacement algorithm,
the paging rate is always going to decrease asymptotically toward zero as
the number of pages in memory increases.  Ignoring the overhead of context
switches, the "right" amount of memory is that at which page faults are
generated no faster than the paging drives can handle them.  In practice,
some extra margin would be needed: some to reduce context swapping, and some
to reflect the statistics of paging better; by playing games with queueing
theory, a better number can be obtained.
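
A back-of-envelope version of that argument in C.  The service time
and the fault-rate curve below are stand-ins invented for illustration;
real curves depend entirely on the workload:

#include <stdio.h>

int main(void)
{
    double disk_service_ms = 30.0;  /* hypothetical seek + transfer */
    int    n_drives        = 2;
    /* Maximum faults/second the paging drives can absorb: */
    double max_fault_rate = n_drives * (1000.0 / disk_service_ms);

    /* Hypothetical fault-rate curve: faults/sec halves each time
       memory doubles, a stand-in for the asymptotic decrease above. */
    double faults_per_sec = 4000.0;
    int mem_mb;
    for (mem_mb = 4; mem_mb <= 1024; mem_mb *= 2) {
        printf("%5d MB: %8.1f faults/s  %s\n", mem_mb, faults_per_sec,
               faults_per_sec <= max_fault_rate ? "<- drives keep up" : "");
        faults_per_sec /= 2.0;
    }
    return 0;
}

The "right" amount of memory is wherever the curve crosses the drives'
service rate - plus the safety margin mentioned above.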

I don't have any figures on me, but I suspect that the tremendous speed
differential between disk and CPU is going to set a high limit, especially
when the statistics come into play.

Charley Wingate

brianu@ada-uts.UUCP (12/09/85)

>Meanwhile, vector setup time on the Cray is always short and predictable,
>even for data that is not contiguous and (with the X-MP) even for
>gather/scatters.
>J. Giles
>Los Alamos
>----------
Well, not exactly.  While I agree that the vector setup time on a Cray
X-MP is always short, it is not always predictable.  Due to the
indeterminate nature of the memory conflicts between X-MP processors,
the setup times can vary, not to mention that gather/scatters can
interfere with themselves.  I think the X-MP hardware manual states that
the setup is unpredictable; I know the instructors at Cray Software
Training said so.
   By the way, Cray Research did a marketing survey and asked their
customers whether they preferred a large amount of slightly slower
memory or a smaller amount of faster memory.  The consensus (reflected
in the Cray-2) was for more memory.

Brian Utterback
Intermetrics Inc.
733 Concord Ave. Cambridge MA. 02138. (617) 661-1840
UUCP: {cca!ima,ihnp4}!inmet!ada-uts!brianu
LIFE: UCLA!PCS!TELOS!CRAY!I**2

omondi@unc.UUCP (Amos Omondi) (12/09/85)

> > > Actually, I can't remember a time when the fastest machines on the
> > > market had virtual memory.  Page swapping can, at best, improve
> > > throughput (usually not).  Page swapping is almost guaranteed to degrade
> > > turn-around of individual tasks.
> >
> > The Cyber 203 & 205 which can outperform the Crays on most good
> > days do have virtual memory.
> 
> I guess by 'good days' you mean those days when the only code you run
> uses very long vectors in highly vectorized form, or has been VERY
> carefully optimized for Cybers.  I've seen a lot of benchmarks of
> both machines (I work with several different vintages of Crays on a
> daily basis - and most of the people I work with are interested in only
> one thing - SPEED).  The Cyber does very well on specific kinds of
> problems involving long vectors.  It also does reasonably well on codes
> that have been carefully tailored for Cyber machines (i.e. standard
> benchmark sets like the 'Livermore Loops').  The Cyber does consistently
> worse than Crays for short vectors, scalar code, and code that hasn't
> been recoded for the specific machine - this includes most production
> codes at most of the major labs.
> 
> The problem is that vector setup time on Cybers is enormous.  You are
> right that the asymptotic speed of Cybers is faster than the older
> Crays, but that is only for brief spurts of pure vector code.  This
> extreme vector setup time means that short vectors don't run very fast
> at all (i.e. multiplying two 3x3 matrices is not very efficient on
> Cybers).  Long vectors, where the pipeline time dominates, run very
> fast indeed.  Real production codes have a heterogeneous mix of vector
> lengths, as well as a lot of inherently scalar code for which the Cyber
> doesn't compete well at all.
> 
> Meanwhile, vector setup time on the Cray is always short and predictable,
> even for data that is not contiguous and (with the X-MP) even for
> gather/scatters.  This means that short vectors (which constitute a large
> proportion of many codes) run nearly as efficiently as long vectors.
> Generally, for most codes with heterogeneous mixes of vector lengths,
> older Crays run slightly faster than Cybers - new Crays (X-MPs, Cray-2)
> run much faster.
> 
> The virtual memory is actually a large part of the speed degradation.
> In order to run vectors efficiently, a vector must not span page
> boundaries.  This means that each new vector operation must have its
> data moved around in memory so that page faults don't occur in the
> vector unit.  If the Cyber had a very large central memory instead of
> virtual memory, it would almost certainly be a faster machine (and
> would therefore compete better than it has).
> 
> J. Giles
> Los Alamos


That was exactly what I meant by 'good days'; this also includes
those days when you don't need 64-bit arithmetic and can double
your performance on the Cybers by using 32-bit arithmetic.  The
general argument is of course valid; most of the changes from
the STAR-100 to the Cyber 205 were aimed at improving the performance
on scalars and short vectors.  True comparisons of the Crays and Cybers
are generally not possible, since the machines perform at their
peaks on different problems.  My experience (which I admit was
somewhat limited) was that the Cybers generally performed better
on numerical linear algebra stuff.

I also make comparisons according to "supercomputer generation":
		Cybers                          vs.  Cray-1s
		ETAs (a machine yet to appear)  vs.  Cray-2 & Cray X-MP
On the whole I would rather own a Cyber....


omondi@unc.UUCP (Amos Omondi) (12/09/85)

> In article <696@unc.unc.UUCP> omondi@unc.UUCP (Amos Omondi) writes:
> >
> >The Cyber 203 & 205 which can outperform the Crays on most good
> >days do have virtual memory.
> 
> But it frequently becomes a pain in the neck.
> In a number of cases, to get good performance, you have to contort
> your program so that the VM system is bypassed.
> -- 
> dik t. winter, cwi, amsterdam, nederland
> UUCP: {seismo,decvax,philabs,okstate,garfield}!mcvax!dik


Obviously, all things being equal (or more or less equal), any
machine with virtual memory will perform worse than one without
virtual memory.  The point I was trying to make, in answer to the
original statement by one J. Giles, is that the Cybers are
among today's fastest machines AND they do have virtual memory.

We all must be able to accept some pain...

eugene@ames.UUCP (Eugene Miya) (12/16/85)

> peaks on different problems.  My experience (which I admit was
> somewhat limited) was that the Cybers generally performed better
> on numerical linear algebra stuff.
> 
> I also make comparisons according to "supercomputer generation":
> 		Cybers                          vs.  Cray-1s
> 		ETAs (a machine yet to appear)  vs.  Cray-2 & Cray X-MP
> On the whole I would rather own a Cyber....
> 

Hum [I've been off the net for a while], we own all of these machines:
2 X-MPs, a 4-pipe 205, and a Cray-2.  I would rather own a Cray.  I suggest
you read the article in Science about the von Neumann Computer Center
and the hot water they got into over switching their purchase
from a Cyber to a Cray.  It might all be moot shortly, as the Japanese
machines have caught up with and in some cases exceeded the American
machines (including quality software).  Also, your comment about
tuned benchmarks [you used the LLNL loops as an example] is not
entirely true.  That is why I based some of my research on them.
BTW, I have not forgotten to send you the NEC paper; it's just that I
got back from Washington DC only to have to attend meetings elsewhere.
Back to ETA/CDC: the question will arise in the near future of what
we will have to do if ETA cannot make the ETA-10.  Do we give them
a Chrysler bailout?  Is such a thing possible?

--eugene miya