[comp.os.misc] virtual memory

pcg@cs.aber.ac.uk (Piercarlo Grandi) (12/05/90)

On 29 Nov 90 14:00:13 GMT, khera@thneed.cs.duke.edu (Vick Khera) said:

khera> In article <59300@microsoft.UUCP> jimad@microsoft.UUCP (Jim
khera> ADCOCK) writes:

jimad> The "huge linear address" of Unix-style machines is a farce in
jimad> the first place, given that that "huge linear address" is built
jimad> of 4K typical pages, which are mapped in unique ways to disks,
jimad> and programmers have to reverse engineer all these aspects to get
jimad> good performance in serious applications.

This happens on non-serious systems that try to give the illusion that
VM is just extra, cheap RAM. Systems designed for VM, like Multics,
give you far better and easier native control over the performance
aspects of VM. Don't take the regrettable state of current Unix VM
designs as typical of what a VM system should be.

khera> you seem to be implying that ...

He is implying that the illusion that virtual memory has the performance
profile of central memory (extremely small and uniform granularity and
latency of access) is totally wrong, whether the virtual memory is
two-dimensional (segmented) or one-dimensional (linear).

Also, object architectures do not benefit from a linear VM, or at least
not as much as other architectures do. OO architectures, it can be
argued, are inherently two-dimensional in flavour.

khera> ... a segmented architecture is better than a system that implements
khera> virtual memory in a linear address space.

Maybe yes. Probably not, IMNHO, but for reasons that are too long to
explain here, that bear no relationship to yours, and that explain why
most OO architectures around are based on object-id plus displacement
(starting with the venerable descriptor architecture of the B5500). Jim
Adcock has good reason to argue for the latter, if he chooses to do so.
I think I have better reasons to argue for linear VM mechanisms even in
OO architectures, but it's not that clear whether they are that much
better.

khera> there is no need to "reverse engineer" all those aspects to get
khera> good performance out of a Unix OS.

Only if you assume unlimited central memory, that is :-). Are you a Lisp
programmer? :-).

khera> [ ... ] i agree that it is beneficial to cluster objects and code
khera> that reference each other in the same pages to reduce the page
khera> faulting in a virtual memory machine.  it is not that crucial,
khera> however,

This is absolutely wrong. Wrong, wrong, wrong. It is crucial for any
application that consumes a significant fraction of available central
memory for its working set (where you want to make sure the working set
does not overflow central memory, so you aim to minimize the number and
size of objects referenced); it is even more crucial for applications
whose working set is greater than central memory (where almost all
references will trigger IO, so you want to minimize the number of
references).
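
To make the point concrete, here is a minimal C++ sketch of the kind of
manual clustering one ends up doing (the class, its names and the page
size are illustrative, not taken from any particular system): an arena
that hands out related objects from the same page, so that traversing a
cluster touches one page instead of scattering references across the
address space.

// Minimal sketch (names and page size illustrative): an arena that
// keeps objects allocated together on the same virtual-memory page.
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

class PageArena {
    static const std::size_t kPageSize = 4096;   // typical VM page
    std::vector<char*> pages_;                   // pages owned by the arena
    std::size_t used_;                           // bytes used in current page
public:
    PageArena() : used_(kPageSize) {}
    ~PageArena() {
        for (std::size_t i = 0; i < pages_.size(); ++i)
            std::free(pages_[i]);
    }

    void* allocate(std::size_t n) {
        n = (n + 7) & ~std::size_t(7);           // 8-byte alignment
        if (n > kPageSize) {                     // oversized: give it its own block
            pages_.push_back(static_cast<char*>(std::malloc(n)));
            used_ = kPageSize;                   // force a fresh page next time
            return pages_.back();
        }
        if (used_ + n > kPageSize) {             // would straddle a page: start a new one
            pages_.push_back(static_cast<char*>(std::malloc(kPageSize)));
            used_ = 0;
        }
        void* p = pages_.back() + used_;
        used_ += n;
        return p;
    }
private:
    PageArena(const PageArena&);                 // copying not handled in this sketch
    PageArena& operator=(const PageArena&);
};

// Usage: give each logically related cluster (a parse tree, one
// document's objects, ...) its own arena, so its members share pages.
struct Node { Node* left; Node* right; int key; };

Node* make_node(PageArena& arena, int key) {
    Node* n = new (arena.allocate(sizeof(Node))) Node();  // placement new
    n->key = key;
    return n;
}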

There are papers that show that improving the percentage of pointers
that point on-page from 95% to 98% reduces the working set by 30%, and
so on. The performance profile of virtual memory is highly non-linear.
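
(The following is illustrative arithmetic only, not the numbers from
those papers: the quantity that drives paging is the fraction of
pointer follows that *leave* the current page, and that is where the
non-linearity comes from.)

// Illustrative arithmetic: a small rise in on-page pointers gives a
// disproportionate drop in page-crossing references.
#include <cstdio>

int main() {
    const double on_page[] = { 0.95, 0.98 };  // fraction of follows that stay on-page
    const double follows   = 1000000.0;       // pointer dereferences in a traced run
    for (int i = 0; i < 2; ++i) {
        double crossings = (1.0 - on_page[i]) * follows;
        std::printf("on-page %2.0f%%: %6.0f page-crossing follows, "
                    "mean run of %2.0f on-page follows between crossings\n",
                    on_page[i] * 100.0, crossings, 1.0 / (1.0 - on_page[i]));
    }
    return 0;  // 95% gives 50000 crossings, 98% gives 20000
}

Going from 95% to 98% on-page pointers cuts the page-crossing follows
by 60% and raises the mean run of on-page follows between crossings
from 20 to 50; seemingly small improvements in clustering buy a lot.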

khera> since we don't have to worry about crossing segment boundaries
khera> and the OS will figure out which pages it needs (the working
khera> set).

Yeah, don't worry, be happy. If only such a facile attitude worked in
practice!

Many important OO/C++ applications are consistently IO bound, and one
has to be very careful and cluster objects physically (as Jim Adcock
obliquely implies: not just related objects in the same page, but
related pages on the same cylinder, ...).

It is true that the operating system will try to *estimate* the working
set, but even if it succeeds, it makes a lot of difference whether the
working set so estimated is X, 10X, or 100X. Also, the pager will
typically not help you with clustering related pages together on disk.

Without a lot of careful effort and what Jim Adcock calls "reverse
engineering", which a lot of people like you do not care about, one is
more likely to get 100X than X from the same application. Just look at
the catastrophic VM implications of the GNU Emacs buffer management
scheme for a poignant illustration.

As long as the cost of memory is not ZERO, it pays to make some effort
to conserve it. Even more importantly, as the cost of IO operations has
not gone down as fast as other costs, it pays even more to minimize
them.

There is an important and open debate on whether a segmented or linear
system is best for dynamic clustering of objects (there is conclusive
evidence that static clustering is far less effective). Maybe you
should familiarize yourself with it. For a starter, I heartily suggest
the book "Smalltalk-80: Bits of History, Words of Advice".

It contains many lessons for C++ programmers too. We are starting to see
a lot of effort go into reimplementing a lot of Smalltalk technology in
C++ (and Objective C and Eiffel). Consider LOOM, or conservative garbage
collectors, etc...
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

jimad@microsoft.UUCP (Jim ADCOCK) (12/11/90)

In article <KHERA.90Nov29090013@thneed.cs.duke.edu> khera@thneed.cs.duke.edu (Vick Khera) writes:
|In article <59300@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
|
|   Having used both Unix-style huge linear addresses, and Intel 80x86
|   segments, I believe neither has much in common with OOP.  In either
|   one has to copy objects around, or do manual clustering of objects
|   via heuristics, etc.  Maybe one needs hardware based on an
|   obid+offset, with automatic support of clustering?

|you seem to be implying that a segmented architecture is better than a
|system that implements virtual memory in a linear address space.

No, I was stating that neither traditional segmented architectures nor
linear architectures have much in common with the needs of object
oriented programming.  Maybe people ought to consider CPUs based on
the needs of OOP?  Yes, I realize there have been abortive attempts at
this in the past, which only means the problem is non-trivial.
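
To make that concrete, here is a rough user-level sketch (all names
hypothetical, and of course the point is that hardware or the runtime
would do this, not application code) of obid+offset addressing through
an object table:

// A software analogue of obid+offset addressing: every reference is an
// object id plus a displacement, and an object table maps ids to
// wherever the object currently lives.
#include <cstddef>
#include <cstring>
#include <vector>

typedef unsigned long ObjId;

struct ObjRef {                 // a "pointer" in this model
    ObjId       id;
    std::size_t offset;         // displacement within the object
};

class ObjectTable {
    struct Entry { char* base; std::size_t size; };
    std::vector<Entry> entries_;
public:
    ~ObjectTable() {
        for (std::size_t i = 0; i < entries_.size(); ++i)
            delete [] entries_[i].base;
    }

    ObjId create(std::size_t size) {            // allocate a new object
        Entry e;
        e.base = new char[size];
        std::memset(e.base, 0, size);
        e.size = size;
        entries_.push_back(e);
        return entries_.size() - 1;             // the id is the table index
    }

    // Resolve obid+offset to a raw address; real hardware would
    // bounds-check and fault on a bad displacement instead of returning 0.
    void* resolve(const ObjRef& r) const {
        const Entry& e = entries_[r.id];
        return r.offset < e.size ? (void*)(e.base + r.offset) : 0;
    }

    // Move an object to new storage (e.g. next to the objects it uses).
    // Only the table entry changes; every ObjRef held elsewhere stays
    // valid, which is what makes automatic clustering practical.
    void relocate(ObjId id, char* new_base) {
        Entry& e = entries_[id];
        std::memcpy(new_base, e.base, e.size);
        delete [] e.base;
        e.base = new_base;
    }
};

Since clients never hold raw addresses, whatever sits behind the table
is free to move objects and cluster them as usage patterns change; the
classic Smalltalk object memories and LOOM work along roughly these
lines.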

|there is no need to "reverse engineer" all those aspects to get good
|performance out of a Unix OS.  The OS has the freedom to figure out
|which parts of the application are needed and have those in memory,
|with the rest loaded in on demand.  This also reduces initial startup
|time since the whole program doesn't need to be loaded into memory
|before it starts to execute.  if the program is sufficiently small and
|the computer's memory is sufficiently large, the whole program can sit
|in memory with no problems or hassles of segments.

I agree that traditional flat address spaces work well for traditional
program code.  So do segmented architectures.  They both fall down in
terms of object-oriented code and object store.  Flat versus segmented
doesn't impact load on demand, memory residency, etc.  These issues are
the same on both architectures.  Both architectures have problems if
patterns of code usage or memory usage are not constant and well
clustered.

|i agree that it is beneficial to cluster objects and code that
|reference each other in the same pages to reduce the page faulting in
|a virtual memory machine.  it is not that crucial, however, since we
|don't have to worry about crossing segment boundaries and the OS will
|figure out which pages it needs (the working set).

Crossing segment boundaries is not an issue since typical object size is
much smaller than segment size, and thus objects can simply be placed 
by memory allocators so as to not cross segment boundaries.  The ability
to order the addresses of a billion bytes is not terribly important when
the size of a typical object is 100 bytes or so.  And even within that
object, the ability to order the addresses of those 100 bytes is seldom
needed.
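
A bump allocator only needs a one-line placement rule to guarantee
that.  A sketch (the boundary size is a stand-in, and objects are
assumed smaller than a segment, as above):

// Placement rule: never let an object straddle a segment (or page)
// boundary.  kBoundary is a stand-in value, not any particular system's.
#include <cstddef>

const std::size_t kBoundary = 65536;   // e.g. an 80x86 segment

// Given the next free offset in a linear heap, return the offset at
// which an object of 'size' bytes should start so it never crosses a
// boundary.  Assumes size <= kBoundary.
std::size_t place(std::size_t free_offset, std::size_t size) {
    std::size_t first = free_offset / kBoundary;               // unit of first byte
    std::size_t last  = (free_offset + size - 1) / kBoundary;  // unit of last byte
    if (first != last)
        free_offset = (first + 1) * kBoundary;  // would straddle: skip ahead
    return free_offset;
}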

|the only reason Intel sticks with the idea is the amount of investment
|they have in it and all those little pee-cee's out there that need
|backward software compatibility.  though this is an admirable goal,
|there comes a point when you just have to cut your losses and go with
|a better idea.

The Intel 80x86 line has supported flat model "Unix" style programming for 
some years, in addition to maintaining segmented support for backwards
compatibility, so you cannot fault Intel on those grounds.  They also
support an interesting 16:32 segmented mode that could be used for
transparent programming across multiple workspaces, tagged pointers, etc.
The fault on these issues lies not in the Intel hardware, but rather in
the software support, which lags behind due, as you point out, to the
need to support a couple million customers out there with software
running on segmented architectures.  One might well wonder when some
other old CPU architectures will move beyond 0:32 pointers, and in what
manner.  Will they move to 0:48 or 0:64 pointers?  If so, how will they
maintain instruction set compatibility?  Or will they move to a 16:32
multiple-workspace "segmented" architecture?  [New CPU designs do not
restrict themselves to a 32-bit address space.]

If you want to fault Intel, do so for the pitifully small number of
registers they put in their machines, or the shortcomings of
their RISC CPUs :-)