[comp.arch] Cray & Amdahl Virtual memory debate

lm@laidbak.UUCP (Larry McVoy) (07/25/88)

In article <537@ns.UUCP> ddb@ns.UUCP (David Dyer-Bennet) writes:
>In article <5342@june.cs.washington.edu>, pardo@june.cs.washington.edu (David Keppel) writes:
>> Relevant (really?) question: Does it make more sense to buy a little
>> bit of very fast memory and slow it down with virtual memory, or to
>> buy a whole bunch of fast physical memory and slow it down by putting
>> it farther away?  (Assume: $ is no problem).  
>                     ^^^^^^^^^^^^^^^^^^^^^^^ 
>  Obviously :-) the correct solution is to buy a whole bunch of VERY
>fast physical memory.  A cache system (I consider virtual memory to be
>essentially a caching system) is never as fast as an entire main
>memory made out of that same technology.

Here's something else that we all may be passing over too lightly: 

    Virtual memory does more than offer a "solution" to the 
    too-large-code problem.  In addition, VM provides:

    + protection (Cray has a poor man's base & bounds)
    + relocation (ditto)
    + sparse address space (an example you all should know,
      and one Cray doesn't care too much for, is the standard
      unix process which is [text:data:heap->  :empty:  <- stack].
      Cray doesn't have the luxury of that empty space; I've heard
      that they interleave the heap & stack.  Gag.)
    + what I call VVM (virtual virtual memory hooks).  If you
      want to simulate shared memory (across any communication
      channel), one way to do it is to use the MMU to catch writes
      and then schedule a flush.  (You can adjust the out-of-sync
      time by adjusting the delay to flush.)  For more on this, see
      the Mach papers; they have tried something similar.

I'd be interested in seeing this discussion incorporate these issues
into the debate....  
-- 

Larry McVoy      (laidbak!lm@sun.com | ...!sun!laidbak!lm | 800-LAI-UNIX x286)

lm@laidbak.UUCP (Larry McVoy) (07/25/88)

Another thing occurs to me: Brian U of Cray said that their reason for no VM
is speed.   Someone else pointed out that there is a big difference
between a development shop (i.e., lots of small processes) and an
installation (i.e., one large Fortran process).  My take on this is that
VM may be a lose for the Fortran codes of the world but it is almost 
certainly a win for everyone else.  In other words, Cray is locking 
themselves into the big iron Fortran club (and out of other places?).

Oh yeah, add demand paging (read: quick startup time) to my list in the
previous article.  Silly of me to forget that.

-- 

Larry McVoy      (laidbak!lm@sun.com | ...!sun!laidbak!lm | 800-LAI-UNIX x286)

lisper-bjorn@CS.YALE.EDU (Bjorn Lisper) (07/26/88)

In article <1535@laidbak.UUCP> lm@laidbak.UUCP (Larry McVoy) writes:
>Another thing occurs to me: Brian U of Cray said that their reason for no VM
>is speed.   Someone else pointed out that there is a big difference
>between a development shop (i.e., lots of small processes) and an
>installation (i.e., one large Fortran process).  My take on this is that
>VM may be a lose for the Fortran codes of the world but it is almost 
>certainly a win for everyone else.  In other words, Cray is locking 
>themselves into the big iron Fortran club (and out of other places?).

"No VM" is not a win for Fortran per se; it is a win for a class of
applications that is usually written in Fortran, namely big
number-crunching scientific codes.  The dependence structures of the
algorithms implemented by such codes are usually quite static, and thus
possible to detect either at compile time by a very smart compiler or
*before* compile time by a smart programmer.  Once the dependence
structure is known, the memory handling can be laid out and optimized
for the particular program.  This can be done by the aforementioned very
smart compiler, or by the smart programmer, provided that he is using a
programming language that allows him full control over memory handling,
for instance Fortran.

Cray is not locking themselves into the big iron Fortran club. They are
locking themselves into the big iron scientific computing club.

Bjorn Lisper

cik@l.cc.purdue.edu (Herman Rubin) (07/26/88)

One of the examples given at the short course at Purdue on the CYBER 205
was the problem of multiplying two 1024x1024 matrices.  Because the
machine has paged virtual memory, it was necessary to use a very
peculiar arrangement of the matrix elements to avoid thrashing.  On a
machine without virtual memory, assuming IO and computing can go on
simultaneously, any competent programmer could arrange things so that,
after the initial loading of memory, very little time would be taken up
by the IO, there would be no thrashing, and no non-obvious arrangement
would be needed.

VM is a convenience for the programmer, but a burden for the machine if
much memory is involved, especially if paging is heavy.  It is also
convenient if "standard" subroutines are to be kept in memory when
possible, but removed if a space problem occurs.  A smart loader could
take care of this without VM, but do such loaders exist?
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette, IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

lamaster@ames.arc.nasa.gov (Hugh LaMaster) (07/27/88)

In article <851@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>
>One of the examples given at the short course at Purdue on the CYBER 205
>was the problem of multiplying two 1024x1024 matrices.  Because of the 
>fact that there is a paged virtual memory, it was necessary to use a very
>peculiar way of arranging the matrix elements to avoid thrashing.  On a
>machine without virtual memory, assuming IO and computing can go on simul-
>taneously, any competent programmer could arrange things so that, after

"Virtual Memory" is a red herring here.  If there is not enough physical
memory to hold the array, the same (re)structuring of data is necessary
on either machine.  How that restructured data is mapped to code is
a matter of taste - some people prefer to rearrange arrays and do implicit
I/O, others prefer to do explicit I/O and use EXACTLY the same code on 
both machines.  There is at least one well known advantage for each case:

1)  The implicit I/O solution allows the program to run with progressively
    less and less I/O as the amount of memory on the system is increased,
    with no reprogramming required.

2)  The explicit I/O solution allows the program to run on a Cray 
    (most other machines have virtual memory) without changing ANYTHING.

I have seen it done both ways.  

Everyone recognizes that data has to reside in physical memory at the
moment it is used.  Nevertheless, there are many advantages to having
"virtual memory" on a system, including many performance advantages.  I
will refrain from restating them, but suffice it to say that the
only real disadvantage of virtual memory is the extra space in the CPU
that the memory mapping hardware consumes.  In my opinion, the many
advantages are worth the price.



-- 
  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035     
  Phone:  (415)694-6117