torbenm@diku.dk (Torben Ægidius Mogensen) (07/19/90)
Most implementations (that I know of) of programming languages with stacks or heaps use some kind of explicit check to see when a stack or heap is full. This is used to trigger events such as error messages, garbage collection or allocation of more memory for a fragmented stack. Usually, good implementations avoid testing on every use of the stack/heap and test only once per basic block or procedure.

By using the MMU facilities present in most modern computers, these tests can be done entirely in hardware, and fragmented stacks can be handled transparently. This is the idea:

Give each stack or heap a small number of pages of memory and make sure that the logical addresses are far apart; each is given a large logical address space, but only the first small part is actually mapped to physical (or virtual) memory.

Assign to each stack/heap an event that is invoked by the MMU trap handler on access to unmapped memory. An event can ask for a new physical page to be mapped to the corresponding area, or maybe try to do garbage collection.

Virtual memory can handle the stack problem, as the only sensible thing to do would be to assign new pages. However, triggering garbage collection in the way described above is different from just extending the virtual memory space. Furthermore, the idea is perfectly usable even on systems with no virtual memory support.

I believe heap-intensive languages like LISP, Prolog or functional languages could benefit from such a strategy, especially if procedures and basic blocks are short, which is often the case for object code from these languages.

My question is: has this been done? And if not, why not?

I suspect that operating systems like Unix don't provide such facilities to user processes, but the issue is then: should they?

Torben Mogensen (torbenm@diku.dk)
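A minimal sketch of the scheme in present-day POSIX C, assuming mmap()/mprotect() and SA_SIGINFO signal handlers (the 1990-era systems discussed in the follow-ups differ); the region size and all names (heap_base, gc_requested, ...) are made up for the example:

    #include <signal.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define REGION_SIZE (64 * 1024 * 1024)     /* large reserved range per stack/heap */

    static char *heap_base;                    /* start of the reserved heap region */
    static long  pagesize;
    static volatile sig_atomic_t gc_requested; /* GC itself runs later, at a safe point */

    /* The "event": if the fault lies inside the reserved heap region, make the
       faulting page accessible and request a garbage collection; otherwise die. */
    static void fault_handler(int sig, siginfo_t *si, void *ctx)
    {
        char *addr = si->si_addr;

        if (addr >= heap_base && addr < heap_base + REGION_SIZE) {
            char *page = (char *)((uintptr_t)addr & ~(uintptr_t)(pagesize - 1));
            mprotect(page, pagesize, PROT_READ | PROT_WRITE);
            gc_requested = 1;
            return;                       /* the faulting access is simply retried */
        }
        abort();                          /* a genuine wild pointer */
    }

    int main(void)
    {
        struct sigaction sa;
        sa.sa_sigaction = fault_handler;
        sa.sa_flags = SA_SIGINFO;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);

        pagesize = sysconf(_SC_PAGESIZE);

        /* Reserve a large region with no access rights; pages are added on demand. */
        heap_base = mmap(NULL, REGION_SIZE, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        heap_base[0] = 1;                 /* first touch faults and maps in one page */
        return 0;
    }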
grunwald@foobar.colorado.edu (Dirk Grunwald) (07/20/90)
T> Assign to each stack/heap an event that is invoked by the MMU trap
T> handler on access to unmapped memory. An event can ask for a new
T> physical page to be mapped to the corresponding area, or maybe try to
T> do garbage collection.
T>
T> Virtual memory can handle the stack problem, as the only sensible
T> thing to do would be to assign new pages. However, triggering garbage
T> collection in the way described above is different from just extending
T> the virtual memory space. Furthermore, the idea is perfectly usable
T> even on systems with no virtual memory ...
T>
T> My question is: has this been done? And if not, why not? ...
T> Torben Mogensen (torbenm@diku.dk)

Yes, see work by Andrew Appel on just this notion. I know it came out as a DEC-SRC tech report, and was published elsewhere as well. I think he advocated this for the SML system.

You can do something similar to this in Ultrix and SunOS using the ``mprotect'' call to protect specific pages of memory, and then use a signal handler for faults/seg violations (whatever error is normally raised). I've done this to protect the stack segments for my tasking library; when creating 1000's of tasks, it's a little wasteful of time and space (like, 2x slower and 4x larger when using 8K pages), but it certainly helps locate tasking bugs.

Dirk Grunwald -- Univ. of Colorado at Boulder
(grunwald@foobar.colorado.edu)
(grunwald@boulder.colorado.edu)
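The mprotect trick described above looks roughly like this (a sketch only; the function names and the one-guard-page layout are illustrative, not taken from the tasking library mentioned):

    #include <signal.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Allocate a task stack with an inaccessible guard page at its low end, so
       that overflow raises an immediate, catchable fault instead of silently
       trashing the neighbouring task's stack. */
    static void *alloc_guarded_stack(size_t size)
    {
        long pagesize = sysconf(_SC_PAGESIZE);
        char *base = mmap(NULL, size + pagesize, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
            return NULL;
        mprotect(base, pagesize, PROT_NONE);   /* guard page below the stack */
        return base + pagesize;                /* usable stack begins above it */
    }

    /* Installed with sigaction()/signal() on SIGSEGV (SIGBUS on some systems).
       A tasking library would look up which task owns the faulting guard page
       before aborting or growing that task's stack. */
    static void overflow_handler(int sig)
    {
        static const char msg[] = "task stack overflow\n";
        write(2, msg, sizeof msg - 1);
        _exit(1);
    }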
gjc@paradigm.com (07/20/90)
> Most implementations (that I know) of programming languages with
> stacks or heaps, use some kind of explicit check to see when a stack
> or heap is full...
> By using the MMU facilities present in most modern computers, these
> tests can be entirely done in hardware...

The VAX-NIL implementation of lisp, circa 1981, did this. The heap was easy, but the stack was trickier, requiring a supervisor-mode handler in the case of the MAIN CONTROL stack (and to be honest, the scheme for stack overflow detection was never implemented that way; instead a check was made, which only added one instruction to the function call setup after entry).

But with current processor technology, where you have multi-MIP RISC processors running many times faster than the available memory bandwidth, you can just initiate the memory operation first and then do the bounds check while you are waiting for the memory operation to complete. This is the kind of thing you might do in a microcoded lisp implementation.

More complete MMU implementations that could differentiate between tagged pointer data references and untagged ones, and could help keep track of object volatilities on memory pages, would of course really be a help.

-gjc
hg@ecs.soton.ac.uk (Hugh Glaser) (07/20/90)
Tony Field did something like this on the original Hope/FPM system (running on VAX/VMS), I guess circa 1985.
jan@swi.psy.uva.nl (Jan Wielemaker) (07/20/90)
In article <1990Jul19.151524.22544@diku.dk> torbenm@diku.dk (Torben Ægidius Mogensen) writes:
>
>Most implementations (that I know) of programming languages with
>stacks or heaps, use some kind of explicit check to see when a stack
>or heap is full.
[stuff deleted]
>By using the MMU facilities present in most modern computers, these
>tests can be entirely done in hardware, and fragmented stacks can be
>done transparently. This is the idea:
[stuff deleted]
>My question is: has this been done? And if not, why not?

I implemented this scheme for SWI-Prolog. Richard Tobin of Edinburgh University pointed out to me at NACLP in Cleveland last year that the MMU can be reached under SunOS 4.0 via mmap() and related calls. The scheme supports expanding the stacks and triggering the garbage collector. It works as follows:

1) A file of 1 page is created in /tmp and opened for reading.

2) Based on the command-line options for the stack limits, virtual address space is reserved for all the Prolog stacks. A one-page gap is always maintained between the stacks.

3) On a segmentation violation, the stack causing the problem is determined. On SunOS the address is passed to the signal handler; on systems that do not provide this I walk along all stacks, comparing the current top with the allocated top. Next, a page is added to the stack and a call to considerGarbageCollect() is made. Afterwards the signal handler returns.

4) considerGarbageCollect() checks whether this is a stack that needs garbage collection and how much the stack has grown since the last garbage collection on this stack. If a garbage collection is appropriate, a global flag is set to indicate this.

5) At a safe point (the call port for Prolog) the global flag indicating a garbage collection request is checked, and if requested a garbage collection is executed. After the garbage collection all pages above the new stack top are unmapped from the address space.

It is next to impossible to invoke the garbage collector directly from the signal handler: one does not know the status of the virtual machine, and the stacks are not guaranteed to be consistent at an arbitrary point in time.

The proposed scheme is very simple to implement. It provides dynamically growing stacks (up to some limit, but on most modern machines this limit can be high, as the virtual address space is large). Expanding the stacks is almost free. On conventional systems a stack shifter needs to be implemented instead; this is tiresome, and shifting the stacks is an expensive operation.

I have seen mmap() and related calls only on SunOS 4.0. The call maps a file at a specific address in the virtual address space. SunOS supports both shared maps (writes to the mapped addresses are propagated to the file) and private maps (writes are not propagated to the file). For the dynamic stacks I map the same one-page file many times, using a private map, at different places.

On some System V Unix machines the same result can be obtained using shared memory. In this case shared memory segments are allocated and mapped into the stacks. I implemented this on a Gould PN9000 under UTX 2.0(?). It works nicely on SunOS (as an alternative to mmap()) as well. On HP-UX 6.5, for one reason or another, all stacks point to the same physical memory (the same bug occurs in SunOS 4.0.0 on Sun-3; only on 4.0.1 and later are things OK).
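A sketch of what steps 1) and 3) above might look like in C (this is a guess at the mechanism, not SWI-Prolog source; the file name and function names are illustrative):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static int  page_fd;     /* the one-page scratch file created in /tmp */
    static long pagesize;

    /* Step 1: create a file of exactly one page and keep it open. */
    static void init_page_file(void)
    {
        pagesize = sysconf(_SC_PAGESIZE);
        page_fd = open("/tmp/pl-page", O_RDWR | O_CREAT, 0600);
        ftruncate(page_fd, pagesize);          /* make the file one page long */
    }

    /* Step 3 (called from the SIGSEGV handler): extend the offending stack by
       one page, by privately mapping the same one-page file once more at the
       current mapped top of that stack.  considerGarbageCollect() may then set
       a global flag; the GC itself only runs at the next call port, and
       munmap() later releases pages above the new stack top. */
    static int extend_stack(char *stack_mapped_top)
    {
        void *p = mmap(stack_mapped_top, pagesize, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_FIXED, page_fd, 0);
        return p == MAP_FAILED ? -1 : 0;
    }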
Finally, on AIX 2.0 on a PS/2 machine, shared memory segments can only be mapped on 4 MB boundaries, which means only a more awkward scheme can be used: 1) create a new segment, larger than the old one, and map it at an arbitrary place; 2) copy the contents of the old into the new; 3) deallocate the old; 4) map the new into the old place and unmap it from the temporary place. I have not yet implemented this latter scheme.

Neither scheme is ideal. mmap() requires a file and reads the file into the page as the page is mapped. This is a waste of resources; I just want a page of uninitialised memory and need no file. The shared memory solution is not ideal either, because shared memory objects are global resources in System V, and only a limited number of them is available to ALL processes. Also, the process is responsible for deallocating the resource; otherwise the segments survive after the process terminates.

It would be ideal to have two simple calls:

    map(base, length)
    unmap(base, length)

Jan
schwartz@groucho.cs.psu.edu (Scott E. Schwartz) (07/21/90)
In article <3260@swi.swi.psy.uva.nl> jan@swi.psy.uva.nl (Jan Wielemaker) writes:
| Neither scheme is ideal. mmap() requires a file and reads the file
| into the page as the page is mapped. This is a waste of resources; I
| just want a page of uninitialised memory and need no file.

Rather than mapping a file, you can map /dev/zero, which provides zero-filled copy-on-write pages.
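For instance (a sketch; map_zero_pages is not an existing library routine):

    #include <fcntl.h>
    #include <stddef.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Privately mapping /dev/zero gives demand-zero, copy-on-write pages;
       no scratch file in /tmp is needed. */
    void *map_zero_pages(void *addr, size_t len)
    {
        int fd = open("/dev/zero", O_RDONLY);
        void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | (addr ? MAP_FIXED : 0), fd, 0);
        close(fd);              /* the mapping survives closing the descriptor */
        return p;
    }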
lkaplan@bbn.com (Larry Kaplan) (07/21/90)
In article <Fhtz5421@cs.psu.edu> schwartz@groucho.cs.psu.edu (Scott E. Schwartz) writes:
>In article <3260@swi.swi.psy.uva.nl>
> jan@swi.psy.uva.nl (Jan Wielemaker) writes:
>| Neither scheme is ideal. mmap() requires a file and reads the file
>| into the page as the page is mapped. This is a waste of resources; I
>| just want a page of uninitialised memory and need no file.
>
>Rather than mapping a file, you can map /dev/zero, which provides zero
>filled copy on write pages.

Ever hear of anonymous mapped memory? Check out the 4.3BSD Architecture Manual (PS1:6) for a description of this rarely (or almost never) implemented feature. Using MAP_ANON causes the memory to be zero-filled, but not necessarily copy-on-write. It is meant more for sharing of zero-initialized pages. The file descriptor is only used for rendezvous. We've implemented it here and found it very useful.

#include <std_disclaimer>

Laurence S. Kaplan          BBN Advanced Computers
lkaplan@bbn.com             10 Fawcett St.
(617) 873-2431              Cambridge, MA 02238
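Roughly like this (a sketch; on most later systems that adopted the idea the descriptor is simply passed as -1 for a private anonymous mapping, rather than being used for rendezvous as in the 4.3BSD proposal):

    #include <stddef.h>
    #include <sys/mman.h>

    /* Anonymous mapping: zero-filled pages with no file behind them at all. */
    void *map_anon_pages(size_t len)
    {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANON, -1, 0);
        return p == MAP_FAILED ? NULL : p;
    }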
pereira@alice.UUCP (Fernando Pereira) (07/21/90)
In article <1990Jul19.151524.22544@diku.dk> torbenm@diku.dk (Torben Ægidius Mogensen) writes:
>
>By using the MMU facilities present in most modern computers, these
>tests can be entirely done in hardware, and fragmented stacks can be
>done transparently. This is the idea:
>
>Give each stack or heap a small number of pages of memory and make
>sure that the logical addresses are far apart, and that a large
>logical address space is given each, but where only the first small
>part is actually mapped to physical (or virtual) memory.
>
>Assign to each stack/heap an event that is invoked by the MMU trap
>handler on access to unmapped memory. An event can ask for a new
>physical page to be mapped to the corresponding area, or maybe try to
>do garbage collection.
>[...]
>My question is: has this been done? And if not, why not?
>
>I suspect that operating systems like Unix don't provide such
>facilities to user processes, but the issue is then: should they?

I know of Prolog and ML implementers that have tried this, but unfortunately it is not a portable solution in Unix, even in those versions that offer the mmap system call to create ``islands'' of virtual address space. The problem is that certain machine architectures (e.g. the Motorola 68K) and OS implementations (e.g. SunOS, at least in some versions) do not provide a continuable address violation signal (SIGSEGV), even though at the kernel level address translation faults (page faults) are continuable. Not having looked at the insides of those machine/OS combinations, I suspect that enough instruction execution context can be saved for filling a page fault in the kernel, but not enough for reentering the faulting process to handle a Unix signal and allocate the needed storage.

These observations are the result of practical experiments 5 or so years ago with Sun 2's and VAXen running Berkeley Unix. The former could not recover correctly from the segmentation violation (the PC was corrupted on return from the signal); the latter could. Newer machines/architectures might handle this better, although comments I have heard from the SML-NJ implementers indicate that there are problem configurations even now.

Fernando Pereira
2D-447, AT&T Bell Laboratories
600 Mountain Ave, Murray Hill, NJ 07974
pereira@research.att.com
mo@messy.bellcore.com (Michael O'Dell) (07/22/90)
Interlisp did this as a matter of course, and Vax-Interlisp was the single biggest reason for the originally-published (as opposed to originally-discussed-but-not-published) 4.2BSD virtual memory proposal being page-based, just like Tenex. This was finally implemented in Mach and SunOS 4.0, and is forthcoming in 4.4BSD.

While this sounds like a good idea, it may not be on very fast computers. Providing user programs with completely general access to program state on segmentation faults can be VERY expensive. I worked on one design whereby the kernel could do all the usual stuff (copy-on-write, etc.) by "re-effecting" faulting instructions even in the face of out-of-order instruction completion. But letting the user program poke its nose around as described by the signal() interface requires "atomic trap behavior", and that makes a program run MUCH MUCH slower on some machines. For details, see my paper "Putting UNIX on Fast Computers" in the latest USENIX Anaheim Conference Proceedings.

-Mike
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (07/22/90)
If my memory serves me correctly, EMAS Prolog (a Prolog interpreter written in IMP, running under the EMAS operating system on ICL 2900s) handled Prolog's several stacks by mapping each of them to a scratch file, leaving unmapped pages in between. This was _early_ 80s. The C Prolog interpreter was derived from EMAS Prolog, and it was going to use the same scheme just as soon as Berkeley implemented mmap (the 4.1 manuals said it would be in 4.2...).

Isn't the name for this scheme "red pages"?
--
Science is all about asking the right questions.  | ok@goanna.cs.rmit.oz.au
I'm afraid you just asked one of the wrong ones.  | (quote from Playfair)
guy@auspex.auspex.com (Guy Harris) (07/24/90)
>The problem is that certain machine architectures (eg. Motorola 68K) and OS
>implementations (eg. SunOS at least in some versions) do not provide
>a continuable address violation signal (SIGSEGV), even though at the kernel
>level, address translation faults (page faults) are continuable. Not having
>looked at the insides of those machine/OS combinations, I suspect that
>enough instruction execution context can be saved for filling a page fault
>in the kernel, but not enough for reentering the faulting process to
>handle a Unix signal and allocate the needed storage.

The amount of instruction execution context is the same in both cases; the only difference is where it has to be stored. I think SunOS 4.1 may store enough of it outside the kernel stack to permit *one* such fault to be continued from (i.e., don't expect to be able to return from a SIGSEGV that occurs while you're handling a SIGSEGV).

Part of the problem is that Motorola:

1) wouldn't commit to the "stack puke" stored by the 680[1andup]0 being "safe" to hand to user-mode code; i.e., they wouldn't say "nothing you can do to the 'stack puke' is risky"; and

2) wouldn't describe the format of the "stack puke" to the extent necessary to have the kernel validate it.

I can see their not doing so as being perfectly legitimate; for all I know, different revisions of the same chip may have different "stack puke" formats, and even if they don't, they might not want any of that stuff to be considered a "feature", and have people then write code dependent on it, locking them into continuing to provide those "features". It does, however, complicate the task of allowing user-mode code to continue from a SIGSEGV.

>These observations are the result of practical experiments 5 or so years
>ago with Sun 2's and VAXen running Berkeley Unix. The former could not
>recover correctly from the segmentation violation (PC corrupted on return
>from the signal), the latter could.

The former has more context than the latter; the former has the 68010 "stack puke", while the latter has, as I remember, just the First Part Done bit (and some of the general-purpose registers, for some of the long-running instructions like MOVCn). The latter is safe and, I think, appears in the "signal context" saved by a BSD signal, so that the instruction can be continued from user mode without the kernel having to tuck away one or more sets of context.

>Newer machines/architectures might handle this better,

I think most RISC machines (not entirely surprisingly) have less or no context of that sort; I'd expect things to work OK on a SPARC-based Sun, for example, as well as on a MIPS-based machine. In fact, what architectures other than the 68K architectures have lots of context for that? I don't think the 386 or the WE32K, for example, have that problem.