[net.arch] Why Virtual Memory

mo@seismo.CSS.GOV (Mike O'Dell) (10/21/85)

Why virtual memory when physical memories are getting larger?

Protection and resource allocation.

Mapping two address spaces disjointly is a very easy way to ensure
two processes don't get in each other's way.  It also provides a very
clean way to do controlled sharing of memory.

Allocation control for large physical memories is as difficult a task
as ever, and resource allocators which deal with fixed-size pieces
(pages) are simpler and more efficient than those which deal with
varying size pieces.
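
To make the allocator point concrete, here is a minimal sketch in C of
what a fixed-size (page) allocator can collapse to: a singly linked free
list with constant-time alloc and free.  The names and sizes are purely
illustrative, not any particular system's allocator.

    /* Fixed-size page allocator as a singly linked free list (sketch).
     * Both page_alloc and page_free are O(1); a variable-size allocator
     * must instead search for a fit and split or coalesce blocks. */
    #include <stddef.h>

    #define PAGE_SIZE 4096
    #define NPAGES    1024

    union page {
        union page    *next;               /* link while the page is free */
        unsigned char  data[PAGE_SIZE];    /* payload while it is in use */
    };

    static union page  pool[NPAGES];
    static union page *free_list;

    void pool_init(void)
    {
        int i;
        free_list = NULL;
        for (i = 0; i < NPAGES; i++) {     /* thread every page onto the list */
            pool[i].next = free_list;
            free_list = &pool[i];
        }
    }

    void *page_alloc(void)                 /* pop the head of the list */
    {
        union page *p = free_list;
        if (p != NULL)
            free_list = p->next;
        return p;
    }

    void page_free(void *vp)               /* push back onto the head */
    {
        union page *p = vp;
        p->next = free_list;
        free_list = p;
    }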

For these reasons, at least, virtual memory is important in future
architectures, particularly with large physical memories.

Finally, for my money, virtual memory has never been good for providing
"free lunch" - ie, running programs larger than physical memory,
EXCEPT when the size of the program and its reference patterns are
such that memory allocation efficiency becomes a dominant factor.
As the gent from Livermore would admit, his giant Monte-Carlo simulation
probably doesn't have well-behaved addressing locality, but rather
references all its memory all the time.  Programs like these need
Virtual-to-Real ratios close to 1.  

	TANSTAAFGPVSSA
	-Mike O'Dell

PS - There Ain't No Such Thing As A Fast General-Purpose Variable-Size
Storage Allocator

rentsch@unc.UUCP (Tim Rentsch) (10/24/85)

It is interesting to note that 10 years ago or so, all large systems
had virtual memory whereas small systems did not.

Now the largest systems (e.g., Cray 2) do not have virtual memory,
whereas it is more and more common for small systems ("microprocessors", 
and I use the term in quotes) to have virtual memory.

I wonder if in another ten years the "small" systems won't have
virtual memory, but the "large" (i.e., gigantic) systems will again?

The "wheel of reincarnation" turns ....

chuck@dartvax.UUCP (Chuck Simmons) (10/24/85)

> Why virtual memory when physical memories are getting larger?
> 
> Protection and resource allocation.

Another reason for virtual memory is that segmented architectures can
make programming easier.  For example, some programs want to have multiple
stacks (e.g., our PL/1 compiler).  By setting up each stack as a segment,
the compiler writer can easily and efficiently allocate storage on
any stack.  Our current PL/1 compiler, written before we had a segmented
architecture, must spend a lot of time worrying about stack collisions.
When one of these occurs, it must shuffle the stacks around in core to move
them away from each other.
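
As a rough modern illustration of the segment-per-stack idea (my sketch,
not the PL/1 compiler's actual code): give each stack its own large,
sparsely backed region of address space, so pushes on different stacks
can never collide and nothing ever has to be shuffled.  The sketch below
assumes a Unix-style mmap() and made-up names.

    /* One large reserved region per stack; only touched pages use real
     * memory, so the stacks grow independently and never collide. */
    #define _DEFAULT_SOURCE
    #include <stdio.h>
    #include <stddef.h>
    #include <sys/mman.h>

    #define STACK_RESERVE (16u * 1024 * 1024)   /* 16 MB of address space each */

    struct stack {
        unsigned char *base;
        size_t         top;          /* next free byte offset */
    };

    static int stack_init(struct stack *s)
    {
        s->base = mmap(NULL, STACK_RESERVE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        s->top = 0;
        return s->base == MAP_FAILED ? -1 : 0;
    }

    static void *stack_push(struct stack *s, size_t n)
    {
        if (s->top + n > STACK_RESERVE)
            return NULL;             /* out of (virtual) room */
        void *p = s->base + s->top;
        s->top += n;
        return p;
    }

    int main(void)
    {
        struct stack value_stack, control_stack;
        if (stack_init(&value_stack) < 0 || stack_init(&control_stack) < 0) {
            perror("mmap");
            return 1;
        }
        /* allocations on either stack never disturb the other */
        void *a = stack_push(&value_stack, 4096);
        void *b = stack_push(&control_stack, 4096);
        printf("value stack at %p, control stack at %p\n", a, b);
        return 0;
    }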

-- Chuck

rcd@opus.UUCP (Dick Dunn) (10/24/85)

> Why virtual memory when physical memories are getting larger?
> 
> Protection and resource allocation.
> 
> Mapping two address spaces disjointly is a very easy way to ensure
> two processes don't get in each other's way...

These aren't related to virtual memory.  They are arguments for memory
protection; specifically, protection which works in a fashion independent
of physical address.  Other protection methods (non-paging) will meet these
desires.

(The other arguments of the parent article are OK, tho.)
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...At last it's the real thing...or close enough to pretend.

omondi@unc.UUCP (Amos Omondi) (10/27/85)

> 
> It is interesting to note that 10 years ago or so, all large systems
> had virtual memory whereas small systems did not.
> 
> Now the largest systems (e.g., Cray 2) do not have virtual memory,
> whereas it is more and more common for small systems ("microprocessors", 
> and I use the term in quotes) to have virtual memory.
> 
> I wonder if in another ten years the "small" systems won't have
> virtual memory, but the "large" (i.e., gigantic) systems will again?
> 
> The "wheel of reincarnation" turns ....
> 


In taking the Cray 2 as an example, one should take historical, philosophical,
etc. considerations into account.  The CDC 6600, CDC 7600, CRAY 1, and
CRAY 2 do not have virtual memory, and Seymour Cray was largely responsible
for their designs.  Other CDC machines, including the Cyber 200 series, which
are in the CRAY 1-CRAY 2 performance range, have virtual memory, as do several
of the new Japanese supercomputers.

henry@utzoo.UUCP (Henry Spencer) (10/27/85)

> It is interesting to note that 10 years ago or so, all large systems
> had virtual memory whereas small systems did not.
> 
> Now the largest systems (e.g., Cray 2) do not have virtual memory,
> whereas it is more and more common for small systems...
> to have virtual memory.

Virtual memory has always meant some speed penalty, although clever design
can minimize it.  Even 10-year-old big machines run more quickly with
address translation switched off, as witness IP Sharp [big APL timesharing
firm] which runs its monster Amdahl unmapped and sees about a 15% speed
improvement as a result.  (They can get away with this because they run no
directly-executable user code.)  Machines specializing in absolute maximum
speed generally will not use virtual memory and hence will often be built
without it.  Machines running more general applications will have it if
they can afford it, which nowadays means they will almost always have it.
The pattern is not a wheel of reincarnation; it's the gradual diffusion of the
technology downward, coupled with falling memory prices and the realization
that "real memory for real performance" dictates avoiding virtual memory
when speed totally dominates design.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

rcd@opus.UUCP (Dick Dunn) (10/30/85)

> It is interesting to note that 10 years ago or so, all large systems
> had virtual memory whereas small systems did not.
> 
> Now the largest systems (e.g., Cray 2) do not have virtual memory,
> whereas it is more and more common for small systems ("microprocessors", 
> and I use the term in quotes) to have virtual memory.

If by "virtual memory" you mean something like paging capability--or in
particular, hardware support for a logical address space larger than the
physical address space--it is NOT true that all large systems had virtual
memory 10 years ago.

The Cray 2 is not markedly different with respect to memory address mapping
from the Cray 1, the CDC 7600, or the CDC 6600--each the fastest commercial
machine of its day.  The 6600 takes us back about 20 years.

Quite simply, in machines as large as these, you cannot pretend that you
have memory that isn't really there, for the class of problem they tend to
be used to solve.
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...At last it's the real thing...or close enough to pretend.

gnu@l5.uucp (John Gilmore) (10/31/85)

In article <6086@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) writes:
> Virtual memory has always meant some speed penalty, although clever design
> can minimize it.  Even 10-year-old big machines run more quickly with
> address translation switched off, as witness IP Sharp [big APL timesharing
> firm] which runs its monster Amdahl unmapped and sees about a 15% speed
> improvement as a result.  (They can get away with this because they run no
> directly-executable user code.)

Any machine has some overhead due to updating page table entries -- in
hardware, or in software, or both.  You get what you pay for.

Sharp used to run DOS/360 (heavily hacked) on their Amdahl, since their
APL time sharing system was written to timeshare efficiently on the 360
which did not have address translation or paging (remember those days??).
They ported their APL to MVS a few years ago because a lot of customers wanted
to run it that way, and eventually converted their data centre to MVS too.
I think it cost them about 40% in overhead, but they could stop maintaining
DOS/360 (which IBM dropped 10 years ago and Amdahl had no interest in).
It got to be a pain writing new disk drivers and machine check handlers
to keep up with the latest in IBM and 3rd party mainframe fashions.

gnu@l5.uucp (John Gilmore) (10/31/85)

One thing you vmunix users might try is running "top" (show top N processes
and assorted stats about the system) to answer this question.

Almost all the processes have about a third of their address space paged in.
Is an MMU cheaper than three times as much memory?  Sure, especially if
you are using it for memory protection anyway.
--
What is the sound of one disk paging?

omondi@unc.UUCP (Amos Omondi) (11/01/85)

> One thing you vmunix users might try is running "top" (show top N processes
> and assorted stats about the system) to answer this question.
> 
> Almost all the processes have about a third of their address space paged in.
> Is an MMU cheaper than three times as much memory?  Sure, especially if
> you are using it for memory protection anyway.
> --
> What is the sound of one disk paging?


An interesting question is just how cheap an MMU really is.  The address
translation hardware on the Cyber 203 & 205, for example, is relatively
fancy.  Similarly, the occasional attempt to deal with internal
fragmentation by having 3 or 4 page sizes means the translation hardware
is not going to be particularly simple, particularly when software
rather than hardware is used to load the registers.  Machine designers
have also been reluctant to provide more than 8 or 16 registers for
translation on the grounds that the increase in performance does not
justify the cost.
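
As a rough sketch of why several page sizes complicate the translation
path, here is an illustrative software lookup over a small bank of
translation registers.  Every entry has to carry its own page size, so
the match and the offset mask differ per entry.  All of the names are
invented for the example; this is not any machine's actual mechanism.

    #include <stdint.h>
    #include <stddef.h>

    #define NTLB 8                     /* a small register file, as above */

    struct tlb_entry {
        int      valid;
        uint64_t vpage;                /* virtual page number */
        uint64_t pframe;               /* physical frame number */
        unsigned shift;                /* log2(page size), e.g. 12 or 22 */
    };

    static struct tlb_entry tlb[NTLB];

    /* Translate va; on a hit, store the physical address and return 0.
     * On a miss return -1, where a real system would trap so software
     * could reload one of the registers. */
    int translate(uint64_t va, uint64_t *pa)
    {
        for (size_t i = 0; i < NTLB; i++) {
            if (!tlb[i].valid)
                continue;
            uint64_t mask = ((uint64_t)1 << tlb[i].shift) - 1;
            if ((va >> tlb[i].shift) == tlb[i].vpage) {   /* per-entry size */
                *pa = (tlb[i].pframe << tlb[i].shift) | (va & mask);
                return 0;
            }
        }
        return -1;
    }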

mat@amdahl.UUCP (Mike Taylor) (11/02/85)

> Sharp used to run DOS/360 (heavily hacked) on their Amdahl, since their
> APL time sharing system was written to timeshare efficiently on the 360
> which did not have address translation or paging (remember those days??).
> They ported their APL to MVS a few years ago because a lot of customers wanted
> to run it that way, and eventually converted their data centre to MVS too.
> I think it cost them about 40% in overhead, but they could stop maintaining
> DOS/360

I think it is worthwhile for this discussion to remember just how
the DOS version of Sharp APL worked. In memory, there were a number
of slots that could contain workspaces. Workspaces
were swapped to and from disk into these slots.  In order that a damaged
workspace could not cause the interpreter to run amok and damage other
workspaces, storage protect keys were used.  Storage protect keys are
an S/370 architectural feature which matches a nibble in the PSW to
a nibble associated with each 2K block of storage. To "run" a workspace,
the APL interpreter used the SSK instruction to set the workspace's storage
to a "working" protect key, and then ran under that protect key. When the
timeslot was done, it then reset the storage to a non-working key so
that it was protected.  Needless to say, Sharp discovered that this was
hugely expensive on a cached, pipelined machine.  SSK causes storage
access serialization, as well as nasty cache consequences.
They then changed to a technique which used all available protect keys
to minimize the number of actual SSKs, but finally solved the problem
by going to virtual memory.
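
A rough model of that key-switching dance (mine, in C; ssk() and
set_psw_key() are placeholders standing in for the privileged S/370
operations, not a real API) makes the cost easy to see: every dispatch
of a workspace walks all of its 2K blocks twice, and each SSK serializes
storage access.

    #define BLOCK_SIZE   2048          /* protect keys cover 2K blocks */
    #define WORKING_KEY  0x3
    #define IDLE_KEY     0x8           /* a key the interpreter never runs under */

    extern void ssk(void *block, unsigned key);    /* placeholder for SSK */
    extern void set_psw_key(unsigned key);         /* placeholder */

    static void set_keys(void *base, unsigned long len, unsigned key)
    {
        unsigned long off;
        for (off = 0; off < len; off += BLOCK_SIZE)  /* one SSK per 2K block */
            ssk((char *)base + off, key);
    }

    void run_workspace(void *ws, unsigned long len, void (*interpret)(void *))
    {
        set_keys(ws, len, WORKING_KEY);   /* open the workspace for this slice */
        set_psw_key(WORKING_KEY);
        interpret(ws);                    /* can touch only working-key storage */
        set_keys(ws, len, IDLE_KEY);      /* close it again when the slice ends */
    }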
-- 
Mike Taylor                        ...!{ihnp4,hplabs,amd,sun}!amdahl!mat

[ This may not reflect my opinion, let alone anyone else's.  ]

gnu@l5.uucp (John Gilmore) (11/05/85)

In article <2184@amdahl.UUCP>, mat@amdahl.UUCP (Mike Taylor) writes:
> (In Sharp APL under DOS...)                      In order that a damaged
> workspace could not cause the interpreter to run amok and damage other
> workspaces, storage protect keys were used.  Storage protect keys are
> an S/370 architectural feature which matches a nibble in the PSW to
> a nibble associated with each 2K block of storage. To "run" a workspace,
> the APL interpreter used the SSK instruction to set the workspace's storage
> to a "working" protect key, and then ran under that protect key.
> They then changed to a technique which used all available protect keys
> to minimize the number of actual SSKs, but finally solved the problem
> by going to virtual memory.

My information is that they ran their production system with all
workspaces in the same protect key, so they never had to do SSK, since
their Amdahl mainframes were so slow at that particular instruction,
and since the production code was sufficiently bug-free that it could
be trusted not to damage other users' workspaces.  (Recall that in the
previous system, such a bug would have caused a protection fault, which
is reported to the console.  They knew exactly how many of these a week
would occur, typically zero.)  The move to virtual memory (DOS->MVS)
was definitely NOT done to speed up SSK instructions; the system ran
much slower after the change.

I don't see how changing a page table entry on an Amdahl can be any
less expensive than changing a storage protect key (since they alter
the protection of the same region of storage) but maybe they thought
"nobody uses SSK much, let's not put much work into it".  The same
thing happened with the 360 "ex" (execute) instruction, which is pretty
heavily used by APL.  This causes an instruction pipe flush since the
early pipe stages are not smart enough to notice that ex is like a
1-instruction branch.  (Amdahls DO optimize branches.)

So maybe the answer to "Why virtual memory" is "Because the hardware
designers optimized for it rather than other equally viable techniques".

jlg@lanl.ARPA (11/19/85)

Someone made the point that paging causes a gradual degradation in
performance for a code that couldn't fit in memory.  On modern machines
this degradation would not be so gradual.  On a Cray, for example, the
minimum time to transfer one sector from disk (512 64-bit words at 35.4
Mbits/sec) is about a millisecond.  This does not count the average wait of
half a disk revolution or the operating system overhead.  This amount of
time corresponds to about 100,000 clock cycles.  With several
independent vector units, 100,000 clocks represents a LOT of idle time.
A job that can't fit into memory would probably not be run in a virtual
environment either because of the degradation caused by paging
(production jobs on Crays currently run for several hours to several
tens of hours as it is).  Programmers for Crays and other non-paging
large machines get around the restricted memory with asynchronous I/O.
The last submitter to this discussion complained that he didn't want to
have to do data paging himself - but who else knows what data will next
be needed if the author of the code doesn't?
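
For illustration only, here is a minimal double-buffered read loop of the
kind such programmers write, sketched with POSIX aio as a stand-in for
the Cray's asynchronous I/O.  The file name and buffer count are made up;
the sector size matches the arithmetic above (512 64-bit words = 4096
bytes, roughly a millisecond, i.e. on the order of 100,000 ~10 ns clocks
per transfer), which is why overlapping compute with the read matters.

    /* Double buffering: the CPU works on one buffer while the disk fills
     * the other.  Link with -lrt on some systems; error handling is
     * mostly omitted to keep the sketch short. */
    #include <aio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define SECTOR (512 * 8)                      /* 512 64-bit words, in bytes */

    static void process(const char *buf, ssize_t n) { (void)buf; (void)n; }

    int main(void)
    {
        static char buf[2][SECTOR];
        struct aiocb cb;
        int fd = open("data.bin", O_RDONLY);      /* hypothetical input file */
        if (fd < 0) { perror("open"); return 1; }

        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_nbytes = SECTOR;

        int cur = 0;                              /* prime the first read */
        cb.aio_buf    = buf[cur];
        cb.aio_offset = 0;
        if (aio_read(&cb) != 0) { perror("aio_read"); return 1; }

        for (off_t off = SECTOR; ; off += SECTOR) {
            const struct aiocb *list[1] = { &cb };
            while (aio_error(&cb) == EINPROGRESS) /* wait for the pending read */
                aio_suspend(list, 1, NULL);
            ssize_t n = aio_return(&cb);
            if (n <= 0)
                break;

            int next = 1 - cur;                   /* start the next read ... */
            cb.aio_buf    = buf[next];
            cb.aio_offset = off;
            aio_read(&cb);

            process(buf[cur], n);                 /* ... while computing on this one */
            cur = next;
        }
        close(fd);
        return 0;
    }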

john@anasazi.UUCP (John Moore) (11/20/85)

In article <33540@lanl.ARPA> jlg@lanl.ARPA writes:
>Someone made the point that paging causes a gradual degradation in
>performance for a code that couldn't fit in memory.  On modern machines
>this degradation would not be so gradual.  On a Cray, for example, the
	It depends on what you mean by a modern machine and what you are
using it for.  If your Cray, or whatever, is doing timesharing, virtual
memory may mean the difference between being able to run 5 users (or whatever)
quickly, 20 users with adequate response (by paging), or 20 users with
intolerable response (if you are forced to SWAP entire jobs in order
to let others run). Also, if the objective is to get maximum utilization
out of an expensive machine, you do that by keeping the processor AND
other devices busy, allowing you to do more total work. Imagine your
machine with two large jobs in it - jobs whose combined size slightly
exceeds physical memory, and which do some I/O (not purely CPU bound).  Without
VM, you would probably just run one of them, as you would waste a lot
of time swapping one in and out. While that one job is running, it is
(for most programming languages) either doing I/O or processing, but not
both. With both jobs running, the CPU is busy close to 100% of the time
doing something useful, because the I/O is in parallel with the
processing.  This assumes, of course, that you don't use too small
a working set, which would result in unacceptable thrashing.
	Another value of virtual memory (or at least page mapping hardware)
is that it allows a system running multiple jobs which are constantly
coming and going and changing size to do so without swapping or memory
reorganization. If you need 40K and you don't have that much contiguous
memory, who cares -- just map 40K of pages from wherever you find them
into a contiguous address space.
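
As a toy illustration of that point (invented names, nothing like real
kernel code), the "page table" below is just an array indexed by virtual
page number, and a contiguous virtual region is satisfied from whatever
physical frames happen to be free, in no particular order.

    #include <stddef.h>

    #define NVPAGES 1024
    #define NFRAMES 1024

    static long   page_table[NVPAGES];   /* vpn -> frame, or -1 if unmapped */
    static long   free_frames[NFRAMES];
    static size_t nfree;

    void frames_init(void)
    {
        for (size_t v = 0; v < NVPAGES; v++)
            page_table[v] = -1;
        nfree = 0;
        for (long f = NFRAMES - 1; f >= 0; f--)   /* all frames start out free */
            free_frames[nfree++] = f;
    }

    /* Map npages contiguous virtual pages starting at vbase; the physical
     * frames backing them need not be contiguous at all. */
    int map_region(size_t vbase, size_t npages)
    {
        if (nfree < npages)
            return -1;                   /* not enough memory anywhere */
        for (size_t i = 0; i < npages; i++)
            page_table[vbase + i] = free_frames[--nfree];
        return 0;
    }
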
	Continuing with the possible uses of page mapping... In my business
we work very heavily with communicating processes - for data communications
and transaction processing. A lot of overhead can be saved in interprocess
communications by remapping a page with a message in it from one process
to another. The alternatives are to either have the operating system
copy the message (ugh) or to let all such processes access all of memory
(frightening, huh?). Likewise, common pure procedure segments can be
shared, and dynamically attached and released.
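
Continuing the toy page-table sketch above (again invented names, not a
real kernel interface), handing a message page from sender to receiver is
a pair of page-table updates rather than a copy of the bytes.

    #include <stddef.h>

    /* Move one page's mapping from the sender's page table to the
     * receiver's; the message is never copied.  A real kernel would also
     * flush the relevant translation (TLB) entries here. */
    void remap_message(long sender_pt[], long receiver_pt[],
                       size_t vpn_src, size_t vpn_dst)
    {
        receiver_pt[vpn_dst] = sender_pt[vpn_src];  /* receiver gains the frame */
        sender_pt[vpn_src]   = -1;                  /* sender loses access */
    }
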
	So.... For my dream machine, give me page mapping hardware (with
VM capability - dirty bits, page faults, and the like). Then give me so
much memory that I almost never take a page fault. Finally (the hardest
part) give me an operating system with the smarts to use all of this
properly.


-- 
John Moore (NJ7E/XE1HDO)
{decvax|ihnp4|hao}!noao!terak!anasazi!john
{hao!noao|decvax|ihnp4|seismo}!terak!anasazi!john
(602) 952-8205 (day or evening)
5302 E. Lafayette Blvd, Phoenix, Az, 85018 (home address)