[net.arch] Very large memories

ian@cbosgd.UUCP (Neil Kirby) (08/29/86)

	A quick and dirty way to put the decoding delay into perspective is
to think about the decoding delays back in the days when people thought about
building 1M of RAM out of 1K chips.  Using the same amount of decoding logic
you can build 1G out of 1M chips: it is 1024 chips either way, so the decode
tree is the same depth.  Not only that, but the speeds of both the decoders
and the memories have improved since then.  Now I don't know if anyone
actually built 1M memories out of 1K chips; I was in high school then.  The
solution then is likely to still be valid (even if it was to wait for chips
to get bigger).

	Paging the 1G of memory IS a potential problem.  Most pagers run on
the order of n squared, where n is the number of pages.  Keeping the same (or
nearly the same) page size while going from megabyte to gigabyte memories
multiplies n by about a thousand, which means a slowdown factor of about a
million.  Even an order-n pager suffers a slowdown factor of a thousand.
Hardware help and/or new algorithms will be needed.
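
To put rough numbers on that (a sketch only; the 4KB page size is just an
assumed example):

    /* Sketch of the slowdown arithmetic; the 4KB page size is only an
     * assumed example, chosen to keep the numbers round.               */
    #include <stdio.h>

    int main(void)
    {
        long page  = 4096;                   /* assumed page size (bytes) */
        long n_meg = (1L << 20) / page;      /* pages in 1M of memory     */
        long n_gig = (1L << 30) / page;      /* pages in 1G of memory     */
        long ratio = n_gig / n_meg;          /* about a thousand          */

        printf("page count grows by       %ld\n", ratio);
        printf("an O(n) pager slows by    %ld\n", ratio);
        printf("an O(n^2) pager slows by  %ld\n", ratio * ratio);
        return 0;
    }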

		Neil Kirby
		...cbosgd!ian

srb@ccvaxa.UUCP (09/01/86)

---
Re: recent discussions on the subject of very large memories.

Saying that "you only need as much memory as you can clear in
a second (or some similar measure)" implicitly assumes lots of
things:
    1)	You're doing general computation, and filling the
	memory with code, miscellaneous variables, tables, etc.
	"like people have always done".  Very parochial thinking.
    2)	The access pattern frequencies to memory are reasonably near
	exponential.  I.e., a minority of locations is accessed
	most frequently, with increasingly smaller frequency of
	access to an increasingly large fraction of the memory
	up to a cumulative (100% - epsilon).
    3)	The total time to access the (100%-epsilon) fraction
	of memory is large compared to the time it takes to acquire
	the epsilon from a backing store, and/or some other useful
	work can be done while the backing store is accessed.
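
To make assumption (2) concrete: if access frequencies fall off geometrically
(the decay ratio below is just an assumed example), a small minority of the
pages really does receive most of the references, which uniform access would
not give you.  A sketch:

    /* Sketch of assumption (2): with geometrically decreasing access
     * frequencies (ratio r is an assumed example), the most-used 10% of
     * pages gets most of the references; uniform access would give 10%. */
    #include <stdio.h>

    int main(void)
    {
        int npages = 1000, hot = 100, i;
        double r = 0.99;                  /* freq(i+1) = r * freq(i)      */
        double w = 1.0, total = 0.0, hot_total = 0.0;

        for (i = 0; i < npages; i++) {
            total += w;
            if (i < hot)
                hot_total += w;
            w *= r;
        }
        printf("hot 10%% of pages receive %.0f%% of the references\n",
               100.0 * hot_total / total);    /* about 63% with r = 0.99  */
        return 0;
    }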

It is not hard to think of real applications where one or more of
the above implicit assumptions will be violated.  Large
data bases, tables of pre-computed cryptographic or scientific
functions, real-time tasks where access cannot be predicted but
the data is needed in real-time, and main store backed only by
a slow-seek-time laser disk come to mind.  I'm sure people like
Hector Garcia-Molina can list some more of these for us.

Something a very clever fellow who was then at Xerox PARC (Thacker) once
said about CPU performance seems also to hold for memory.
Paraphrasing,
    Sometimes you just plain need it (the above list, for example),
    and sometimes you can use it to substitute for programmer
    effort and implement using simpler paradigms.
An example: virtual memory.  Consider a machine which is running
lots of jobs, and they won't all fit into physical memory.
Technically, you can generally use clever paging strategies to make
the overall time lost to paging appear small.  But if you could just
put all the jobs into memory, you wouldn't even have to HAVE a
paging strategy.  You don't have to be at all clever to get really
good system response and low cost of doing context switches between
processes.  I'd rather be programming my applications on a machine
like the latter, personally.  Programmer time is getting more expensive
and memory cheaper.  No matter how you scale them, it appears that
those two cost curves are eventually going to cross...

Steve Bunch	    Gould CSD-Urbana.    USEnet:  ihnp4!uiucdcs!ccvaxa!srb
1101 E. University, Urbana, IL 61801    ARPAnet: srb@gswd-vms

["The explanatory value of human laziness is often neglected in the study
of the history of man."]
------------------

rro@lll-lcc.UUCP (Rodney Oldehoeft) (09/03/86)

At an ARO workshop last spring in South Carolina, I got the drift
of a conversation among architects (names not noted) that went
something like this:

"Memory costs are getting cheaper faster than disks are getting
faster.  So large memories are advantageous.  However, since large
memories with a fixed page size result in large page tables, a larger page
size is a win."

This is a faintly heretical conclusion, what with recent trends in page
size, and the notion that a set of small pages more nearly
approximates the actual working set of a process.

Maybe the ``approximate constant'' being sought is the fraction of
virtual space embodied in a single page.

mat@amdahl.UUCP (Mike Taylor) (09/03/86)

Just a few comments on what the large 370 world does about some of the
problems that have been raised with very large memories.  First of all,
large memories in this world are always multi-ported and accessed
through caches. The addressing granularity isn't the byte - it is the
cache line size. Accesses are always made on a line boundary, for I/O
or CPU cache. This simplifies the addressing problem somewhat. It
takes about 15 cycles to service a line fault (15 ns. cycles).
Writes are generally done in the background, and both instructions and
operands are prefetched.
Pipeline buffers allow other pipeline stages to keep running if they
can.
A further simplification is third level mainstore - so-called expanded
store - which is only page-addressable. A whole page is copied in and out
of mainstore at a time. This is really a synchronous paging device - such
an operation only takes about 20 microseconds, though, and saves a lot
of overhead associated with doing an I/O.
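
A toy illustration of the saving (the 9-in-10 hit ratio and the disk timing
are assumptions, not measurements):

    /* Toy comparison, not Amdahl's code: the assumed numbers are 20 us for
     * a synchronous page move from expanded store and 20 ms for a disk
     * page-in, with 9 of every 10 faults satisfied from expanded store.   */
    #include <stdio.h>

    #define EXPANDED_US    20L      /* synchronous page copy (assumed)      */
    #define DISK_US     20000L      /* seek + rotation + transfer (assumed) */

    static long service_fault(unsigned long vpage)
    {
        if (vpage % 10 != 0)        /* pretend 9 of 10 hit expanded store   */
            return EXPANDED_US;     /* the faulting CPU simply waits        */
        return DISK_US;             /* otherwise queue real I/O and sleep   */
    }

    int main(void)
    {
        unsigned long v;
        long total = 0;

        for (v = 0; v < 1000; v++)
            total += service_fault(v);
        printf("average fault service time: %ld us\n", total / 1000);
        return 0;
    }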

Some machines (ours, specifically) allow the machine to be subdivided
into hardware VM's, but only a few (4).

2000-user systems (well, almost) are feasible in these environments without
any extreme measures.
-- 
Mike Taylor                        ...!{ihnp4,hplabs,amd,sun}!amdahl!mat

[ This may not reflect my opinion, let alone anyone else's.  ]

jlg@lanl.ARPA (Jim Giles) (09/06/86)

In article <5100120@ccvaxa> srb@ccvaxa.UUCP writes:
>
>Saying that "you only need as much memory as you can clear in
>a second (or some similar measure)" implicitly assumes lots of
>things:
>....
>    2)	The access pattern frequencies to memory are reasonably near
>	exponential.  I.e., a minority of locations is accessed
>	most frequently, with increasingly smaller frequency of
>	access to an increasingly large fraction of the memory
>	up to a cumulative (100% - epsilon).
>...
>It is not hard to think of real applications where one or more of
>the above implicit assumptions will be violated.  Large
>data bases, tables of pre-computed cryptographic or scientific
>functions, real-time tasks where access cannot be predicted but
>the data is needed in real-time, and main store backed only by
>a slow-seek-time laser disk come to mind.  I'm sure people like
>Hector Garcia-Molina can list some more of these for us.

Most scientific codes I'm familiar with violate this second assumption.
The entire memory of a hydro code, for example, is referenced with equal
frequency in a cyclical manner (once per time step).
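
A minimal sketch of that reference pattern (array sizes arbitrary): each time
step sweeps the whole mesh once, so there is no small working set for an LRU
pager to latch onto.

    /* Sketch of the reference pattern of a hydro-style code: every time
     * step touches every zone exactly once, front to back.  If the arrays
     * exceed physical memory, each page becomes the least recently used
     * one just before it is needed again, the worst case for LRU paging.
     * Sizes here are arbitrary.                                           */
    #include <stdlib.h>

    #define NZONES  (1L << 22)       /* an arbitrary, large mesh            */
    #define NSTEPS  100

    int main(void)
    {
        double *rho = malloc(NZONES * sizeof *rho);
        double *u   = malloc(NZONES * sizeof *u);
        long i, step;

        if (rho == NULL || u == NULL)
            return 1;
        for (i = 0; i < NZONES; i++) {
            rho[i] = 1.0;
            u[i]   = 0.0;
        }

        for (step = 0; step < NSTEPS; step++)
            for (i = 0; i < NZONES; i++)      /* one full cyclic sweep      */
                u[i] += 0.001 * rho[i];       /* stand-in for the physics   */

        free(rho);
        free(u);
        return 0;
    }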

J. Giles
Los Alamos

stubbs@ncr-sd.UUCP (Jan Stubbs) (09/09/86)

The issue of virtual memory usage on large memories keeps coming up,
so I felt compelled to propose the following as the reasons for
using virtual memory in a computer system:

1) The program doesn't fit in the memory, or the job mix doesn't fit
but the working set does.  This reason may go away as memory sizes
increase, unless application sizes increase at a faster rate, or CPU
power increases such that more jobs or tasks need to be in memory
to keep the CPU busy.

2) Memory management hardware usually provides memory protection
and automatic relocation (which may improve load time) but these 
features are available without virtual memory.

3) The real reason for virtual memory (and the one which won't go
away when memories get big) is that I can quickly load the
working set of a large program, execute it, and go on to the next
program, while a real memory system might be still loading its program.

In fact, I can start executing when the first page gets in, on
a properly implemented system where everything is already
in paging format.

This assumes that the working set is smaller than the program,
and that the programmer didn't do some work to only load those
pieces of program and data which were needed this time.

This kind of programmer work is probably non-productive,
especially since hardware (and software LRU algorithms)
can do it better. 

Another case peculiar to Unix systems is VFORK. Why duplicate
the whole address space when the most likely thing to do next
is EXEC, which will clobber the space you just copied?
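
For reference, the pattern in question looks roughly like this (a minimal
sketch; after vfork the child may do nothing but exec or _exit):

    /* Minimal sketch of the fork-then-exec pattern under discussion.
     * With plain fork() the child's copy of the address space is thrown
     * away by the exec; vfork() skips the copy, but the child may then
     * do nothing except exec or _exit.  Error handling is minimal.      */
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = vfork();            /* or fork() where vfork is absent */

        if (pid < 0)
            return 1;                   /* couldn't create the child       */
        if (pid == 0) {                 /* child: exec immediately         */
            execl("/bin/ls", "ls", "-l", (char *)0);
            _exit(127);                 /* only exec or _exit after vfork  */
        }
        waitpid(pid, (int *)0, 0);      /* parent waits for the child      */
        return 0;
    }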

In summary, my guess is that we will continue to pay
the cost of virtual memory overhead for a long time
yet, for the majority of applications.


Jan Stubbs    ....sdcsvax!ncr-sd!stubbs
619 485-3052
NCR Corporation
16550 W. Bernardo Drive MS4010
San Diego, CA. 92127

joel@peora.UUCP (Joel Upchurch) (09/11/86)

>Jan Stubbs    ....sdcsvax!ncr-sd!stubbs
>3) The real reason for virtual memory (and the one which won't go
>away when memories get big) is that I can quickly load the
>working set of a large program, execute it, and go on to the next
>program, while a real memory system might be still loading its program.
>
>In fact, I can start executing when the first page gets in, on
>a properly implemented system where everything is already
>in paging format.
>
>This assumes that the working set is smaller than the program,
>and that the programmer didn't do some work to only load those
>pieces of program and data which were needed this time.

	I may be missing something important, but this doesn't
	sound right to me. I followed what Jan said about starting
	up faster (more about that later), but it seems to me that
	you'll use up that and a lot more in page faults. Since you
	are reading the program piecemeal into virtual memory you
	are going to be a lot slower because of the extra seek and
	rotational delays. It is sort of like a guy driving on a
	surface street, instead of going over and getting on the
	freeway. You get started faster, but you hit all those
	traffic lights. The only case that is valid is if the
	overwhelming majority of the pages in the program are
	never referenced.

	Another consideration is that on at least some virtual
	memory systems I've worked with, the system transfers
	the program over to some sort of paging file, before it
	starts, which means that you end up reading the whole
	program in anyway, and writing it out to boot. Does anyone
	know of a virtual memory system that doesn't work this
	way?

	I'm no big fan of virtual memory. It made sense back in the
	days of timesharing systems with expensive core memories,
	but now, with cheap semiconductor memory, RAM disks and big
	disk caches make more sense.

	I worked at one large IBM mainframe site that installed
	an external RAM disk drive on their MVS system. Do you
	know what they used the RAM disk for? Page datasets!
	In effect they were using RAM to simulate RAM! And the
	worst part was it made sense since the IBM machine could
	only use 16MB of real memory.
-- 
     Joel Upchurch @ CONCURRENT Computer Corporation (A Perkin-Elmer Company)
     Southern Development Center
     2486 Sand Lake Road/ Orlando, Florida 32809/ (305)850-1031
     {decvax!ucf-cs, ihnp4!pesnta, vax135!petsd, akgua!codas}!peora!joel

henry@utzoo.UUCP (Henry Spencer) (09/11/86)

> Another case peculiar to Unix systems is VFORK. Why duplicate
> the whole address space when the most likely thing to do next
> is EXEC, which will clobber the space you just copied?

Please, not "to Unix systems" but "to Berklix systems".  The vast majority
of Unix systems do not have vfork.  Even a lot of Unix systems with well-
designed virtual-memory systems lack vfork:  there is no intrinsic reason
why a plain, ordinary fork has to copy the whole address space.  It has to
*look* that way to the program, but with a modicum of cooperation from the
memory-management hardware, that's easy enough.  Even the Berklix manuals
warn you that vfork is a temporary kludge around implementation difficulties,
not a permanent part of the system.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

chris@umcp-cs.UUCP (Chris Torek) (09/13/86)

>In article <1164@ncr-sd.UUCP> Jan Stubbs writes:
>>3) The real reason for virtual memory (and the one which won't go
>>away when memories get big) is that I can quickly load [a]
>>working set of a large program ...

In article <2383@peora.UUCP> joel@peora.UUCP (Joel Upchurch) replies:
>... but it seems to me that you'll use up [the faster startup] and
>a lot more in page faults.  Since you are reading the program
>piecemeal into virtual memory you are going to be a lot slower
>because of the extra seek and rotational delays. It is sort of like
>a guy driving on a surface street, instead of going over and getting
>on the freeway. You get started faster, but you hit all those
>traffic lights. The only case that is valid is if the overwhelming
>majority of the pages in the program are never referenced.

This analogy is flawed.  It would hold only if one other thing were
also true: the delay produced by paging has to be noticeable when
compared to other delays in the system.  The primary delay in the
system is likely to be the user, and his delay is likely to be on
the order of seconds, not milliseconds; thus the paging is lost in
the noise.

This depends, of course, on the exact characteristics of the system.
If nothing can be displayed until the entire program is loaded,
that program should not be paged in piecemeal.  This is exactly
what the BSD NMAGIC format is for.
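
A sketch of how one might tell the formats apart; the octal values are the
traditional a.out magic numbers, but header layouts vary by machine, so this
is illustrative only:

    /* Sketch: peek at an executable's magic number to see which loading
     * strategy the linker asked for.  The octal values are the traditional
     * a.out ones, but the exact header layout varies by machine, so this
     * is illustrative only.                                                */
    #include <stdio.h>

    #define OMAGIC 0407     /* old impure format: read in whole             */
    #define NMAGIC 0410     /* shared text: read in whole at exec time      */
    #define ZMAGIC 0413     /* demand paged straight from the file          */

    int main(int argc, char **argv)
    {
        FILE *f;
        unsigned short magic;

        if (argc != 2 || (f = fopen(argv[1], "r")) == NULL)
            return 1;
        if (fread(&magic, sizeof magic, 1, f) != 1)
            return 1;
        fclose(f);

        switch (magic) {
        case OMAGIC: puts("0407: loaded whole");               break;
        case NMAGIC: puts("0410: loaded whole, shared text");  break;
        case ZMAGIC: puts("0413: demand paged");               break;
        default:     puts("not a (simple) a.out file");        break;
        }
        return 0;
    }
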
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

bogstad@brl-smoke.ARPA (William Bogstad ) (09/13/86)

In article <2383@peora.UUCP> joel@peora.UUCP (Joel Upchurch) writes:
[lines deleted ]
>	Another consideration is that on at least some virtual
>	memory systems I've worked with, the system transfers
>	the program over to some sort of paging file, before it
>	starts, which means that you end up reading the whole
>	program in anyway, and writing it out to boot. Does anyone
>	know of a virtual memory system that doesn't work this
>	way?

	Yes, it is my understanding that the Ridge 32 systems running
ROS page directly from the executable file.  The block size on disk is
4096 bytes and there is a way (although somewhat painful) to make files
contiguous, which avoids some of the obvious problems with disk access
times.  ROS looks like System V unix from the system call level, but
internally is a message passing system consisting of multiple processes.
All disk i/o goes through the virtual memory manager.  When reading and
writing a file you are in fact passing messages to a simple process
whose data space is the contents of the file.  Overall, this method
doesn't work too badly.

				Bill Bogstad
				bogstad@hopkins-eecs-bravo.arpa
				bogstad@brl-smoke.arpa

Disclaimer:	I do not work for Ridge and have never done so.
    I am, however, the moderator for the ARPANET INFO-RIDGE mailing list.

tuba@ur-tut.UUCP (Jon Krueger) (09/14/86)

In article <2383@peora.UUCP> joel@peora.UUCP (Joel Upchurch) writes:
>	. . . it seems to me that you'll use up [a particular performance
>	advantage of virtual memory] and a lot more in page faults. Since you
>	are reading the program piecemeal into virtual memory you are going to
>	be a lot slower because of the extra seek and rotational delays. It is
>	sort of like a guy driving on a surface street, instead of going over
>	and getting on the freeway. You get started faster, but you hit all
>	those traffic lights. The only case that is valid is if the
>	overwhelming majority of the pages in the program are never referenced.

In terms of your analogy, if you are driving from A to B, it sure is faster
to take the freeway.  If you know in advance that you'll always want to start
at A, stop at B, and hit all points in between, you neither need nor want
local roads.  But what if you make trips where you decide to make additional
stops along the way?  What if you encounter conditions along the way that
make continuing to B unnecessary?  For instance, you discover that the
foobars you usually pick up at B are available halfway to B.  Or you hear the
stock report on your car radio tell you that the foobar market has just
collapsed.

Better still, what if we gave you a road system that let you skip
roads that don't contain any of your destinations?  Then you'd find it
faster to take local roads.  Frequently you'd never get to B.  What
you would need is fast access to roads containing or adjacent to
destinations and decision points.  All the studies on typical memory
referencing tell us that code behaves this way, and needs this kind of
support to behave quickly.

>	Another consideration is that on at least some virtual
>	memory systems I've worked with, the system transfers
>	the program over to some sort of paging file, before it
>	starts, which means that you end up reading the whole
>	program in anyway, and writing it out to boot. Does anyone
>	know of a virtual memory system that doesn't work this
>	way?

VMS.  I'm not familiar enough with vmunix schemes, but I suspect they
provide similar provisions.  Here is how VMS behaves: your very first
instruction is started with no part of your code or data resident.
Naturally this causes a page fault.  VMS faults in your first cluster
of pages from disk to physical memory.  Your instruction is restarted
and your image stays computable until a reference to another
non-resident address causes another page fault.  Your image continues
executing indefinitely until (1) it terminates (normally, interrupted by
the user, or because a fatal error forces image exit), (2) a reference to
another non-resident address causes a page fault, which again briefly
interrupts execution, or (3) miscellaneous events like I/O or
hibernation/suspension pause image execution.

Now.  Your image can grow through continued faulting until it eats all
available physical memory.  So VMS establishes a limit for each process,
called its physical memory quota.  Your image grows up to its process's limit.
Then, when handling the first fault that finds your image at its limit, VMS
gives your image its next cluster, but in return, VMS takes away a cluster
of your image's least-recently-used resident pages.  If your image modified
them ("dirty pages"), yes, VMS writes them to a paging file.  But note the
difference from the paging scheme you describe: the whole image is NOT
written to a paging file before starting execution.  In fact, images
commonly run and terminate without exceeding their physical quotas.  Images
that do exceed them commonly find that their least-recently-used pages were
never written to (initialization code, typically).  So the paging file is
there, but its function is to transparently handle a few cases.

Two final clevernesses: you might think that if your image needs its
modified page back from the paging file, another disk access is required.
VMS keeps track of whether another process used your dirty page between the
time you lost it and the time you needed it again.  If not, VMS maps the
physical page back into your space: no disk read required, not even a page
copy in memory, just the cost of writing a few bytes of page table.  Finally,
VMS avoids writing to the paging file until all processes have created a
total of more dirty pages than a system-wide limit.  So if other processes'
demand for memory is low, VMS doesn't write to the paging file until your
image grows to the total of your process's physical memory quota and the
system's dirty page limit.
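
A toy model of the mechanism just described (a sketch, not DEC's code: FIFO
instead of true LRU, clean pages only, and an arbitrary 4-page quota):

    /* Toy model of the scheme described above; it is a sketch, not DEC's
     * code.  It uses FIFO rather than true LRU, models only clean pages
     * (dirty ones would go to a modified list and, much later, to the
     * paging file), and the 4-page quota is arbitrary.                   */
    #include <stdio.h>

    #define NPAGES 16           /* virtual pages in a toy image            */
    #define LIMIT   4           /* the process's resident-set quota        */

    enum state { OUT, RESIDENT, ON_FREE_LIST };

    static enum state pstate[NPAGES];
    static int resident[LIMIT];          /* which pages are resident (FIFO) */
    static int nres, hand;
    static int hard_faults, soft_faults;

    static void touch(int page)
    {
        if (pstate[page] == RESIDENT)
            return;                          /* no fault                    */
        if (pstate[page] == ON_FREE_LIST)
            soft_faults++;                   /* frame still holds the page:
                                                remap it, no disk transfer  */
        else
            hard_faults++;                   /* must read from the image    */

        if (nres == LIMIT) {                 /* at quota: push one page out */
            pstate[resident[hand]] = ON_FREE_LIST;
            resident[hand] = page;
            hand = (hand + 1) % LIMIT;
        } else {
            resident[nres++] = page;
        }
        pstate[page] = RESIDENT;
    }

    int main(void)
    {
        int i;

        for (i = 0; i < 8; i++)          /* initialization code, run once   */
            touch(i);
        for (i = 0; i < 100; i++)        /* steady state: small working set */
            touch(8 + i % 3);
        for (i = 0; i < 8; i++)          /* revisit old pages: soft faults  */
            touch(i);

        printf("hard faults (disk reads): %d\n", hard_faults);
        printf("soft faults (remap only): %d\n", soft_faults);
        return 0;
    }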

>	I'm no big fan of virtual memory. It made sense back in the
>	days of timesharing systems with expensive core memories,
>	but now, with cheap semiconductor memory, RAM disks and big
>	disk caches make more sense.

Virtual memory is a series of tradeoffs best described as "getting
more out of less".  There will continue to be a need to do this
indefinitely.  Yes, memory gets cheaper, but programs get larger even
faster.  The demands of applications that benefit from or actually
require more space will outpace the semiconductor industry's ability
to provide physical memory at a given cost.  It will always be
convenient or necessary for some applications on some hardware to
permit code to execute in spaces larger than current or available
physical memory.  For instance, my concept of a high-performance
workstation involves a lot of processes executing on my behalf on my
own local processor.  The above scheme is said to let them share an
arbitrarily sized pool of local physical memory with low memory
contention overhead, high flexibility to meet changing memory usage
over time, good equity, and few pathological cases.  So I think I want
a similar scheme on a single-user machine with large, cheap, memory.

To return to your analogy, if you want a fast way to get from A to B
you take the highway.  If cost is no factor, it would be faster yet to
get those other cars off the road!  They're just in your way, and
require endless delays for safety rules that are only required to
permit multiple cars safe use of a shared resource, the highway
system.  Now think how limited the performance is of driving your
individual car around.  You can only go one place at a time.  While
you're loading or unloading the car you're not actually getting
anywhere.  When you do have to make a long haul trip you're out of
town for days at a time and miss opportunities otherwise available in
a short trip.  Your car is only so versatile, and can't be great for
moving people and also for luggage.  Wouldn't it be better, to get
high performance, to buy a couple of cars (or a couple dozen, if their
performance doubles and price halves every year), hire drivers for them,
and keep them moving all over the interstate system on your behalf?
Sure it would, but we'd all run out of major, minor, shared, and
private roads in short order.  Virtual memory, in this analogy, is
getting by on the amount of roads we can afford this year by paving
only the roads in use.

>	I worked at one large IBM mainframe site that installed
>	an external RAM disk drive on their MVS system. Do you
>	know what they used the RAM disk for? Page datasets!
>	In effect they were using RAM to simulate RAM! And the
>	worst part was it made sense since the IBM machine could
>	only use 16MB of real memory.

It sounds like the hardware support and evolved system software
environment provided spaces too small for applications.  So it doesn't
matter whether the spaces were physical or virtual, they were too
small.  So they resorted to kludges such as the one you mention (in
fact, the same was done at the U of R.  Maybe it still is).  This
kludge is pretty ugly, since physical memory will be wasted both in
primary memory and in the RAM disk, because neither "sees" the other except
through the processor.  The processor thinks it has 16MB of memory and
maybe another 16MB of fast disk.  Paging on behalf of processes larger
than 16MB will consist of page copies between the two.  This is
cheaper than disk-speed transfers, but expensive compared to byte
stores into page tables.  Especially since the byte stores won't
happen until really necessary, based on runtime conditions of image
memory behavior and system-wide process memory demand.

In summary, Joel, I think you've seen some inferior virtual memory
schemes, and the entire concept of virtual memory has gotten a bad
name with you.  Good schemes cost less and deliver more.  They solve
problems now, and we'll continue to need them to solve similar
problems in the future.  They will be the same kinds of "getting more
out of less" problems.  But they'll reappear in different forms.

This posting got pretty long, so some disclaimers: if I've exaggerated
or denigrated or misrepresented VMS, DEC's not responsible.  For
extending Joel's analogy and the shakiness of my analogies, I alone am
responsible.  In case all net.arch people knew all the stuff I
presented, I guess I've wasted your time and I apologize.  To people
who hate VMS, I apologize for liking some aspects of it.  Anyway, I
suspect that the good things that DEC coded into VMS in BLISS and
MACRO would work just fine in 4.n BSD in C.


-- 
--> Jon Krueger
uucp: {seismo, allegra, decvax, cmcl2, topaz, harvard}!rochester!ur-tut!tuba
Phone: (716) 275-2811 work, 235-1495 home	BITNET: TUBA@UORDBV
USMAIL:  Taylor Hall, University of Rochester, Rochester NY  14627 

reuel@mips.UUCP (Reuel Nash) (09/14/86)

In article <2383@peora.UUCP> joel@peora.UUCP (Joel Upchurch) writes:
>
>	Another consideration is that on at least some virtual
>	memory systems I've worked with, the system transfers
>	the program over to some sort of paging file, before it
>	starts, which means that you end up reading the whole
>	program in anyway, and writing it out to boot. Does anyone
>	know of a virtual memory system that doesn't work this
>	way?
>

Yes! AT&T UNIX System V (at least the demand paging versions).

In the default case (413 magic number -- demand loaded file &
no sticky-bit set) System V demand-pages the program directly 
from the file system. Text (code) pages and unmodified data pages
never get written to any paging device. 

Of course, one reason for all this is that this system runs
on 3b2 computers with small disks and this scheme reduces the 
amount of swap space required.

System V also effectively uses all available main memory pages 
as a cache for the text and data pages on disk (separate from
the file system buffer cache).  This means, for example, that
on a large single-user workstation, when you compile a program the 
first time the compiler must be loaded from the disk.  But, if you
edit and re-compile, the compiler's pages are still in memory and
will be re-used without going to the disk. 
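
A toy sketch of that page-cache idea (not AT&T's code; the inode number,
frame count, and replacement choice are all arbitrary):

    /* Toy sketch of a page cache keyed by (file, offset), so a second exec
     * of the same program finds its pages already in memory.  Not AT&T's
     * code; the inode number, frame count, and replacement are arbitrary. */
    #include <stdio.h>
    #include <string.h>

    #define NFRAMES 64
    #define PAGESZ  4096

    struct frame {
        int  inuse;
        long ino;                   /* which file the page belongs to       */
        long off;                   /* byte offset of the page in that file */
        char data[PAGESZ];
    };

    static struct frame cache[NFRAMES];
    static int disk_reads;

    static struct frame *get_page(long ino, long off)
    {
        int i, victim = 0;

        for (i = 0; i < NFRAMES; i++)
            if (cache[i].inuse && cache[i].ino == ino && cache[i].off == off)
                return &cache[i];            /* hit: no disk access          */

        for (i = 0; i < NFRAMES; i++)        /* naive choice of a free frame */
            if (!cache[i].inuse) { victim = i; break; }

        disk_reads++;                        /* miss: pretend to read a page */
        cache[victim].inuse = 1;
        cache[victim].ino   = ino;
        cache[victim].off   = off;
        memset(cache[victim].data, 0, PAGESZ);
        return &cache[victim];
    }

    int main(void)
    {
        long p;

        for (p = 0; p < 20; p++)             /* first run of the compiler    */
            get_page(42, p * PAGESZ);
        for (p = 0; p < 20; p++)             /* edit, then recompile         */
            get_page(42, p * PAGESZ);

        printf("disk reads: %d (the second run found everything cached)\n",
               disk_reads);
        return 0;
    }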

Reuel Nash
MIPS Computer Systems, Inc.
930 E. Arques 
Sunnyvale, CA 94086
408-720-1700 x244
decwrl!mips!reuel

#include <std.disclaimer>

henry@utzoo.UUCP (Henry Spencer) (09/14/86)

> >	...which means that you end up reading the whole
> >	program in anyway, and writing it out to boot. Does anyone
> >	know of a virtual memory system that doesn't work this way?
> 
> VMS.  I'm not familiar enough with vmunix schemes, but I suspect they
> provide similar provisions...  [...details...]

Yes, it's much the same in most virtual-memory Unixes.  No decent virtual-
memory implementation will insist on reading the whole thing in first,
unless perhaps the program is too small for demand paging to be sensible.
Most of the general techniques you mention can be found in any good
virtual-memory Unix.  Details and policies differ.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

guy@sun.uucp (Guy Harris) (09/14/86)

> >
> >	Another consideration is that on at least some virtual
> >	memory systems I've worked with, the system transfers
> >	the program over to some sort of paging file, before it
> >	starts, which means that you end up reading the whole
> >	program in anyway, and writing it out to boot. Does anyone
> >	know of a virtual memory system that doesn't work this
> >	way?
> >
> 
> Yes! AT&T UNIX System V (at least the demand paging versions).
> 
> In the default case (413 magic number -- demand loaded file &
> no sticky-bit set) System V demand-pages the program directly 
> from the file system. Text (code) pages and unmodified data pages
> never get written to any paging device.

Furthermore, while the 4BSD VM system (for a 413 magic number) does
write text pages out to the swap device if it needs their page frames for
other pages and those pages haven't already been written to the swap device,
it will not copy the *entire* program to the swap device.  I'd say that
covers 90-95% of the paging UNIX systems out there, as they either 1)
started out as an S5 paging release, 2) started out as 4BSD, or 3) started
out as some non-paging UNIX release and had the 4BSD paging code added.
Further, I suspect few, if any, future VM systems for UNIX will copy the
*entire* image to a paging file when the image is run; it kind of defeats
the whole purpose of making a big image demand-loaded.

> System V also effectively uses all available main memory pages 
> as a cache for the text and data pages on disk (separate from
> the file system buffer cache).  This means, for example, that
> on a large single-user workstation, when you compile a program the 
> first time the compiler must be loaded from the disk.  But, if you
> edit and re-compile, the compiler's pages are still in memory and
> will be re-used without going to the disk. 

The 4BSD VM system does this as well, as most other UNIX VM systems are
likely to, so this is also pretty much universal among paging UNIX systems.

(P.S. Why do people insist on sticking "AT&T" in front of "System V", or in
front of "UNIX" in general?  Most UNIXes out there started out, at least, as
AT&T releases, so it doesn't really convey any useful information.)
-- 
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com (or guy@sun.arpa)

stubbs@ncr-sd.UUCP (Jan Stubbs) (09/16/86)

In article <2383@peora.UUCP> joel@peora.UUCP (Joel Upchurch) writes:
>
>>Jan Stubbs    ....sdcsvax!ncr-sd!stubbs
>>3) The real reason for virtual memory (and the one which won't go
>>away when memories get big) is that I can quickly load the
>>working set of a large program, execute it, and go on to the next
>>program, while a real memory system might be still loading its program.
>>
>>In fact, I can start executing when the first page gets in, on
>>a properly implemented system where everything is already
>>in paging format.
>>
>>This assumes that the working set is smaller than the program,
>>and that the programmer didn't do some work to only load those
>>pieces of program and data which were needed this time.
>
>	I may be missing something important, but this doesn't
>	sound right to me. I followed what Jan said about starting
>	up faster (more about that later), but it seems to me that
>	you'll use up that and a lot more in page faults. Since you
>	are reading the program piecemeal into virtual memory you
>	are going to be a lot slower because of the extra seek and
>	rotational delays. It is sort of like a guy driving on a
>	surface street, instead of going over and getting on the
>	freeway. You get started faster, but you hit all those
>	traffic lights. The only case that is valid is if the
>	overwhelming majority of the pages in the program are
>	never referenced.

Exactly, and I contend that this is often the case. Take an Ada
compiler, for example: most programs use only a small subset of the
constructs available, so the code to handle the unused constructs
never needs to be paged in. Your freeway analogy
is perfect - most trips are short, so take the direct route
by surface street.
>
>	Another consideration is that on at least some virtual
>	memory systems I've worked with, the system transfers
>	the program over to some sort of paging file, before it
>	starts, which means that you end up reading the whole
>	program in anyway, and writing it out to boot.

Yes, but maybe you only load the page file once (kind of like
setting the sticky bit on a UNIX executable, which keeps its image
in the swap area).

joel@peora.UUCP (09/22/86)

>UUCP:   seismo!umcp-cs!chris
>This analogy is flawed.  This would be true if one other thing were
>also true.  The delay produced by paging has to be noticeable when
>compared to other delays in the system.  The primary delay in the
>system is likely to be the user, and his delay is likely to be on
>the order of seconds, not milliseconds: thus the paging is lost in
>the noise.

	While your point has some validity it doesn't seem to invalidate
	my analogy. It still takes longer to get from point A to point
	B; you just mean that the guy would rather get started quicker and hit
	some lights on the way than wait to get on the on-ramp to the
	freeway.

	In this particular case you say that a user would rather have a lot
	of fractional-second delays while he is running his program
	than, say, an additional one-second delay at start-up
	time. This may be true, although the human factors stuff I've
	read doesn't support it. Just because the user takes several
	seconds to enter something doesn't mean he allows the computer
	the same leeway. A lot of people, especially those weaned on
	microcomputers, are of the school that any response time that is
	long enough to be perceived is too long. Also consider what
	percentage of the time between when you ask to run, say, an
	editor and the actual prompt was actually spent loading the
	task into memory, and what was spent finding the shell script
	and running it, finding all the necessary files, assigning
	them, etc. I suspect that on most non-trivial tasks these factors
	outweigh the actual task loading time. But, in any case, this
	is outside the scope of this discussion, or even this
	newsgroup.

	One statement that several people have made in this discussion
	and that is puzzling me is that you can run a program at 90
	percent of its performance with 10% of the memory. When I
	worked as a system programmer on MVS systems, the rule of
	thumb we used was that if a program's virtual memory was
	twice the real memory allocated to it, you were going to
	have page thrashing problems. Can someone provide further
	information on the 90/10 rule being quoted? What system and
	class of programs is this rule applicable to? It seems to
	me that this could color people's perception of the problem.
	After all, there is a big difference in cost between doubling
	the real memory on a machine and adding ten times as much
	memory. Shucks, what you save in paging devices could pay
	for doubling the memory. Disks fast enough to make good
	paging devices tend to be expensive.
-- 
     Joel Upchurch @ CONCURRENT Computer Corporation (A Perkin-Elmer Company)
     Southern Development Center
     2486 Sand Lake Road/ Orlando, Florida 32809/ (305)850-1031
     {decvax!ucf-cs, ihnp4!pesnta, vax135!petsd, akgua!codas}!peora!joel

joel@peora.UUCP (09/23/86)

>--> Jon Krueger
>Virtual memory is a series of tradeoffs best described as "getting
>more out of less".  There will continue to be a need to do this
>indefinitely.  Yes, memory gets cheaper, but programs get larger even
>faster.  The demands of applications that benefit from or actually
>require more space will outpace the semiconductor industry's ability
>to provide physical memory at a given cost.  It will always be
>convenient or necessary for some applications on some hardware to
>permit code to execute in spaces larger than current or available
>physical memory.  For instance, my concept of a high-performance

	I don't agree that applications are growing faster than
	memory is getting cheaper. I might agree that the applications
	are growing faster than the amount of memory usually installed
	in most computers.

	Back in 78 it cost me $200 to add 16KB to my Apple II.
	Last month I paid $401 to put 2 more megabytes in a PC clone.
	That is a cost decrease of a factor of 64 and probably close to 100 if you
	allow for inflation. Are typical applications 100 times bigger
	than they were 8 years ago? I don't think so. 10 times maybe.

	Naturally computer makers expand the memory systems based on
	requirements, rather than memory cost, so the typical memory
	on a computer system has only increased from 64KB to the
	640KB-1MB range over the same period. This has the result of
	making memory a much less significant part of the system
	cost.

	This trend has been less reflected in mini and mainframe
	computers with their longer product lifecycles. One rather
	amusing result of this is that I have 3MB on my home computer,
	and only 4MB on my super-minicomputer at work.

	I agree that virtual memory is a tradeoff, but it is my contention
	that tradeoffs that made sense back in the 60s make a lot
	less sense today and will make even less sense tomorrow.

	The cost of disk storage hasn't gone down nearly as fast as
	the cost of memory. Especially the kind of high performance
	drives that make good paging devices. You can have virtual
	memory without fast disks, but then you need more memory
	to maintain equivalent performance levels.

	I will agree that for any given memory N, there will always
	be applications that won't run in N. But as time goes on these
	applications will more and more be oddballs, rather than the
	mainstream. There will be niche computers to run them, but the
	common run of applications won't need this kind of support.

	This is not to say that future processors won't support
	virtual memory. Address translation/ memory protection
	support is useful on any multitasking system, with or
	without virtual memory and the incremental cost for
	hardware to generate page faults isn't much. A lot
	of people seem to be excited about the 80386, but not
	because of virtual memory, but because of better support
	for multitasking.
-- 
     Joel Upchurch @ CONCURRENT Computer Corporation (A Perkin-Elmer Company)
     Southern Development Center
     2486 Sand Lake Road/ Orlando, Florida 32809/ (305)850-1031
     {decvax!ucf-cs, ihnp4!pesnta, vax135!petsd, akgua!codas}!peora!joel

tuba@ur-tut.UUCP (Jon Krueger) (09/23/86)

>     Joel Upchurch @ CONCURRENT Computer Corporation (A Perkin-Elmer Company)
joel@peora.UUCP writes:

>	I don't agree that applications are growing faster than
>	memory is getting cheaper. I might agree that the applications
>	are growing faster than the amount of memory usually installed
>	in most computers.
>	Back in 78 it cost me $200 to add 16KB to my Apple II.
>	Last month I paid $401 to put 2 more megabytes in a PC clone.

An alternate way of expressing the trend is that over the last eight years,
memory costs have halved every 1 to 2 years.  Bell's law states that
applications grow at the rate of an address bit every two to three years.
Therefore, accepting your figures as accurate, I stand corrected.  The
semiconductor industry has created ever cheaper memory products, and more
than kept up with demand, over the last eight years.  Too bad they haven't
increased the address bits of their processors at the same rate.
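
Putting rough numbers on those two rates over the eight years in question
(the rates are the ones quoted above, not measurements):

    /* Rough comparison of the two growth rates over the eight years in
     * question; the rates themselves are the ones quoted above.           */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double years = 8.0;

        printf("memory cost drop, halving every 1-2 years: %.0fx to %.0fx\n",
               pow(2.0, years / 2.0), pow(2.0, years));
        printf("application growth, one bit per 2-3 years: %.0fx to %.0fx\n",
               pow(2.0, years / 3.0), pow(2.0, years / 2.0));
        return 0;
    }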

>	The cost of disk storage hasn't gone down nearly as fast as
>	the cost of memory. Especially the kind of high performance
>	drives that make good paging devices. You can have virtual
>	memory without fast disks, but then you need more memory
>	to maintain equivalent performance levels.

If your applications are i/o bound by paging, buy more memory.  Tuning
your application on a virtual memory machine gets you more performance
out of less memory, not adequate performance out of inadequate memory.

>	I will agree that for any given memory N, there will always
>	be applications that won't run in N. But as time goes on these
>	applications will more and more be oddballs, rather than the
>	mainstream. There will be niche computers to run them, but the
>	common run of applications won't need this kind of support.

Around the turn of the century, there was a serious proposal to close the
Patent Office, since everything useful had already been invented.  Let's not
limit tomorrow's application size in today's architectures.  I don't pretend
to predict tomorrow's applications.  But it's a safer bet that they'll
require more memory than it is to set an upper limit on typical demands.
Bell's Law has surprising generality and longevity.

>	This is not to say that future processors won't support
>	virtual memory. Address translation/ memory protection
>	support is useful on any multitasking system, with or
>	without virtual memory and the incremental cost for
>	hardware to generate page faults isn't much. A lot
>	of people seem to be excited about the 80386, but not
>	because of virtual memory, but because of better support
>	for multitasking.

Neither.  Imagine the technical excitement of the 80386 if the IBM PC market
didn't exist, had never existed.  Yawn.  Just another pretty chip.

-- 
--> Jon Krueger
uucp: {seismo, allegra, decvax, cmcl2, topaz, harvard}!rochester!ur-tut!tuba
Phone: (716) 275-2811 work, 235-1495 home	BITNET: TUBA@UORDBV
USMAIL:  Taylor Hall, University of Rochester, Rochester NY  14627 

jlg@lanl.ARPA (Jim Giles) (09/25/86)

In article <2449@peora.UUCP> joel@peora.UUCP writes:
>	I don't agree that applications are growing faster than
>	memory is getting cheaper. I might agree that the applications
>	are growing faster than the amount of memory usually installed
>	in most computers.
>
>	Back in 78 it cost me $200 to add 16KB to my Apple II.
>	Last month I paid $401 to put 2 more megabytes in a PC clone.
>	That is a cost decrease of a factor of 64 and probably close to 100 if you
>	allow for inflation. Are typical applications 100 times bigger
>	than they were 8 years ago? I don't think so. 10 times maybe.
>...

Scientific applications tend to grow at a rate of about a factor of
100 every ten years.  This is a figure that everyone in the industry
tends to accept as a 'given'.  I know people with specific applications
that could use a factor of 1000 as soon as tomorrow!  Lattice-gauge
theory calculations, for example, use 4-dimensional arrays (lattices).  A
factor of 1000 is only a factor of about 5.6 in each of the 4 dimensions.
Think how fast problems grow in the super-symmetry problems that
operate in 10 dimensions - just doubling the lattice in each direction
is a factor of 1024 in the memory requirement.
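
A quick check of the arithmetic behind those two figures:

    /* Quick check of the two scaling figures in the paragraph above.      */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        /* 4-D lattice: the per-dimension factor behind a 1000x increase   */
        printf("1000x in 4 dimensions is %.1fx per dimension\n",
               pow(1000.0, 1.0 / 4.0));

        /* 10-D lattice: doubling every dimension multiplies memory by 2^10 */
        printf("doubling a 10-D lattice costs %.0fx the memory\n",
               pow(2.0, 10.0));
        return 0;
    }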

J. Giles
Los Alamos