torbenm@diku.dk (Torben [gidius Mogensen) (07/19/90)
Most implementations (that I know) of programming languages with stacks or heaps, use some kind of explicit check to see when a stack or heap is full. This is used to trigger events such as error messages, garbage collection or allocation of more memory for a fragmented stack. Usually, good implementations avoid testing on every use of the stack / heap, but only once for every basic block or procedure. By using the MMU facilities present in most modern computers, these tests can be entirely done in hardware, and fragmented stacks can be done transparently. This is the idea: Give each stack or heap a small number of pages of memory and make sure that the logical addresses are far apart, and that a large logical address space is given each, but where only the first small part is actually mapped to physical (or virtual) memory. Assign to each stack/heap an event that is invoked by the MMU trap handler on access to unmapped memory. An event can ask for a new physical page to be mapped to the corresponding area, or maybe try to do garbage collection. Virtual memory can handle the stack problem, as the only sensible thing to do would be to assign new pages. However, triggering garbage collection in the way described above is different from just extending the virtual memory space. Furthermore, the idea is perfectly usable even on systems with no virtual memory support. I believe heap intensive languages like LISP, Prolog or functional languages could benefit from such a strategy, especially if procedures and basic blocks are short, which is often the case for object code for these languages. My question is: has this been done? And if not, why not? I suspect that operating systems like Unix doesn't provide such facilities to user processes, but the issue is then: should they? Torben Mogensen (torbenm@diku.dk)
grunwald@foobar.colorado.edu (Dirk Grunwald) (07/20/90)
T> Assign to each stack/heap an event that is invoked by the MMU trap T> handler on access to unmapped memory. An event can ask for a new T> physical page to be mapped to the corresponding area, or maybe try to T> do garbage collection. T> T> Virtual memory can handle the stack problem, as the only sensible T> thing to do would be to assign new pages. However, triggering garbage T> collection in the way described above is different from just extending T> the virtual memory space. Furthermore, the idea is perfectly usable T> even on systems with no virtual memory ... T> T> My question is: has this been done? And if not, why not? ... T> Torben Mogensen (torbenm@diku.dk) -- Yes, see work by Andrew Appel on just this notion. I know it came out as DEC-SRC tech report, and was published elsewhere as well. I think he advocated this for the SML system. You can do something similar to this in Ultrix and SunOS using the ``mprotect'' call to protect specific pages of memory, and then use a signal handler for faults/seg violations (whatever error is normally raised). I've done this to protect the stack segments for my tasking library; when creating 1000's of tasks, it's a little wasteful of time and space (like, 2x slower and 4x larger when using 8K pages), but it certainly helps locate tasking bugs. Dirk Grunwald -- Univ. of Colorado at Boulder (grunwald@foobar.colorado.edu) (grunwald@boulder.colorado.edu)
gjc@paradigm.com (07/20/90)
> Most implementations (that I know) of programming languages with > stacks or heaps, use some kind of explicit check to see when a stack > or heap is full... > By using the MMU facilities present in most modern computers, these > tests can be entirely done in hardware... The VAX-NIL implementation of lisp, cica 1981, did this. The heap was easy, but the stack was trickier, requiring a supervisor mode handler in the case of the MAIN CONTROL stack, (and to be honest, the scheme for stack overflow detection was never implemented that way, and instead a check was made, which only added an instruction to the function call setup after entry). But in current processor technology, where you have multi-mip RISC processors running at many times faster rates than available memory bandwith you can just initiate the memory operation first, and then do the bounds check while you are waiting for the memory operation to complete. The kind of thing you might do in a microcoded lisp implementation. More complete MMU implementations that could differentiate between tagged pointer data references and untagged, and could help keep track of object volatilities on memory pages would of course really be a help. -gjc
pgl@cup.portal.com (Peter G Ludemann) (07/20/90)
IBM-Prolog has used the memory-trap technique for local and global stack overflow since 1985 or so. An addressing exception triggers garbage collection. The scheme is completely general; the error-catch mechanism can even catch stack overflow errors and do something reasonable with them (for example, I use this as a way of trapping left recursion in a grammar-rule interpreter). - peter ludemann (usual disclaimer)
hg@ecs.soton.ac.uk (Hugh Glaser) (07/20/90)
Tony Field did something like this on the original Hope/FPM system (running on VAX/VMS), I guess circa 1985.
jan@swi.psy.uva.nl (Jan Wielemaker) (07/20/90)
In article <1990Jul19.151524.22544@diku.dk> torbenm@diku.dk (Torben [gidius Mogensen) writes: > >Most implementations (that I know) of programming languages with >stacks or heaps, use some kind of explicit check to see when a stack >or heap is full. [stuff deleted] >By using the MMU facilities present in most modern computers, these >tests can be entirely done in hardware, and fragmented stacks can be >done transparently. This is the idea: [stuff deleted] >My question is: has this been done? And if not, why not? I implemented this schema for SWI-Prolog. Richard Tobin of Edinburgh University noted me at the NACLP in Cleveland last year of the possibility to reach the MMU under SunOs 4.0 via the mmap() and related calls. The schema supports expanding the stacks and triggering the garbage collector. It works as follows: 1) A file of 1 page is created on /tmp and opened for reading. 2) Computed from the command line options for the stack limits, virtual address space is reserved for all the Prolog stacks. A 1 page gap is always maintained between the gaps. 3) On a segmentation violation, the stack causing the problem is determined. On SunOs the address is passed via the signal handler; on systems that do not provide this I walk along all stacks, comparing the current top with the allocated top. Next, a page is added to the stack and a call to considerGarbageCollect() is made. Afterwards the signal handler returns. 4) considerGarbageCollect() checks whether this is a stack that needs garbage collection and how much the stack has grown after the last garbage collection on this stack. If a garbage collection is appropriate a global flag is set to indicate such. 5) On a save point (the call port for Prolog) the global flag indicating a garbage collection request is checked and if requested a garbage collection is executed. After the garbage collection all pages above the new stack top are unmapped from the address space. It is next to impossible to invoke the garbage collector direct from the signal handler. One does not know the status of the virtual machine and the stacks are not guarantied to be consistent at an arbirary point in time. The proposed schema is very simple to implement. It provides dynamically growing stacks (to some limit, but on most modern machines this limit can be high as the virtual address space is large). Expanding the stacks is almost for free. On conventional systems a stack shifter needs to be implemented. This is tiresome and shifting the stacks is an expensive operation. I have seen mmap() and related calls only on SunOs 4.0. The call maps at file a specific address in the virtual address space. SunOs both supports shared maps (writing in the mapped addresses are forwarded to the file) and private maps (writing is not forwarded to the file). For the dynamic stacks I map the same 1 page file many times using a private map at different places. On some system-V Unix machines the same results can be obtained using shared memory. In this case the shared memory segments are allocated and mapped in the stacks. I implemented this on a GOULD PN9000 under UTX 2.0? It works nice on SunOs (as an alternative to mmap()) as well. On HPUX 6.5 for one or another reason all stacks point to the same physical memory (the same bug occurs in SunOs 4.0.0 on SUN-3; only on 4.0.1 and later things are ok). Finally on AIX 2.0 on a PS2 machine shared memory segments can only be mapped on 4M boundaries, which implies only a difficult schema can be used: 1) create a new segment, larger than the old one and map it at an arbitrary place; 2) copy the contents of the old into the new; 3) deallocate the old; 4) map the new into the old place and unmap it from the temporary place. I did not yet implement this latter schema. Both schemas are not ideal. The mmap() requires a file and reads the file into the page as a page is mapped. This is a waste of resources; I just want a page of uninitialised memory and need no file. The shared memory solution is not ideal as well because shared memory objects are global resources in system-V and only a limited number of them is available to ALL processes. Also, the process is responsible for deallocating the resource, otherwise they survive the terminating process. It would be ideal to have two simple calls: map(base, lenght) unmap(base, lenght) Jan
schwartz@groucho.cs.psu.edu (Scott E. Schwartz) (07/21/90)
In article <3260@swi.swi.psy.uva.nl> jan@swi.psy.uva.nl (Jan Wielemaker) writes: | Both schemas are not ideal. The mmap() requires a file and reads the | file into the page as a page is mapped. This is a waste of resources; I | just want a page of uninitialised memory and need no file. Rather than mapping a file, you can map /dev/zero, which provides zero filled copy on write pages.
lkaplan@bbn.com (Larry Kaplan) (07/21/90)
In article <Fhtz5421@cs.psu.edu> schwartz@groucho.cs.psu.edu (Scott E. Schwartz) writes: > >In article <3260@swi.swi.psy.uva.nl> > jan@swi.psy.uva.nl (Jan Wielemaker) writes: >| Both schemas are not ideal. The mmap() requires a file and reads the >| file into the page as a page is mapped. This is a waste of resources; I >| just want a page of uninitialised memory and need no file. > >Rather than mapping a file, you can map /dev/zero, which provides zero >filled copy on write pages. Ever hear of anonymous mapped memory. Check out the 4.3BSD Architecture Manual (PS1:6) for a description of this rarely (or almost never) implemented feature. Using MAP_ANON causes the memory to be zero filled, but not necessarily copy on write. It is more meant for sharing of zero initialized pages. The file descriptor is only used for rendezvous. We've implemented it here and found it very useful. #include <std_disclaimer> _______________________________________________________________________________ ____ \ / ____ Laurence S. Kaplan | \ 0 / | BBN Advanced Computers lkaplan@bbn.com \____|||____/ 10 Fawcett St. (617) 873-2431 /__/ | \__\ Cambridge, MA 02238
pereira@alice.UUCP (Fernando Pereira) (07/21/90)
In article <1990Jul19.151524.22544@diku.dk> torbenm@diku.dk (Torben [gidius Mogensen) writes: > >By using the MMU facilities present in most modern computers, these >tests can be entirely done in hardware, and fragmented stacks can be >done transparently. This is the idea: > >Give each stack or heap a small number of pages of memory and make >sure that the logical addresses are far apart, and that a large >logical address space is given each, but where only the first small >part is actually mapped to physical (or virtual) memory. > >Assign to each stack/heap an event that is invoked by the MMU trap >handler on access to unmapped memory. An event can ask for a new >physical page to be mapped to the corresponding area, or maybe try to >do garbage collection. > [...] >My question is: has this been done? And if not, why not? > >I suspect that operating systems like Unix doesn't provide such >facilities to user processes, but the issue is then: should they? > I know of Prolog and ML implementers that have tried this, but unfortunately it is not a portable solution in Unix, even those versions that offer the mmap system call to create ``islands'' of virtual address space. The problem is that certain machine architectures (eg. Motorola 68K) and OS implementations (eg. SunOS at least in some versions) do not provide a continuable address violation signal (SIGSEGV), even though at the kernel level, address translation faults (page faults) are continuable. Not having looked at the insides of those machine/OS combinations, I suspect that enough instruction execution context can be saved for filling a page fault in the kernel, but not enough for reentering the faulting process to handle a Unix signal and allocate the needed storage. These observations are the result of practical experiments 5 or so years ago with Sun 2's and VAXen running Berkeley Unix. The former could not recover correctly from the segmentation violation (PC corrupted on return from the signal), the latter could. Newer machines/architectures might handle this better, although comments I have heard from the SML-NJ implementers indicate that there are problem configurations even now. Fernando Pereira 2D-447, AT&T Bell Laboratories 600 Mountain Ave, Murray Hill, NJ 07974 pereira@research.att.com
pereira@alice.UUCP (Fernando Pereira) (07/21/90)
In article <3260@swi.swi.psy.uva.nl> jan@swi.psy.uva.nl (Jan Wielemaker) writes: >The proposed schema is very simple to implement. It provides >dynamically growing stacks (to some limit, but on most modern machines >this limit can be high as the virtual address space is large). Expanding >the stacks is almost for free. On conventional systems a stack shifter >needs to be implemented. This is tiresome and shifting the stacks is >an expensive operation. I disagree on several counts: 1. The amortized cost of stack shifting, if the stack expansion and garbage-collection scheduling factors are computed right, is in my experience trivial (<1% or runtime). I wrote the first Prolog stack shifter, for DEC-10 Prolog, and once I got the expansion factors right, no one ever complained about stack shifting costs. I was also involved in designing and implementing the Quintus stack shifter, which again has very low amortized costs because of properly chosen expansion factors. 2. As I pointed out in an earlier message, it is not in general safe to rely on being able to return from a segmentation violation in Unix. Even if it works in your tests, it might fail for a violation occuring in particular processor state, particularly if other aspects of the state (eg. page maps) are affected by the signal function. Nothing in the OS specification guarantees that segmentation violations can be returned from. And then there are OS bugs... 3. Very large (even if sparse) address spaces are not free, because with them one gets into multilevel page tables and other address translation overheads. In my experience, VM implementations are optimized for relatively small and compact address spaces on various common machines. Large address spans can bring out the worst in paging algorithms. A well-designed bounds check mechanism costs relatively little, is portable, and does not depend on the vagaries of machine and OS implementation. The idea of using VM for bounds check has come up over and over again (we considered it for DEC-10 Prolog in 1978, for example!), but the payoff always seems mediocre compared with its complication, lack of portability and sensitivity to OS bugs. Fernando Pereira AT&T Bell Labs, Murray Hill, NJ 07974 pereira@research.att.com
mo@messy.bellcore.com (Michael O'Dell) (07/22/90)
Interlisp did this as a matter of course, and Vax-Interlisp was the single biggest reason for the originally-published (as opposed to originally discussed but not published) 4.2BSD virtual memory proposal being page-based, just like Tenex. This was finally implemented in Mach and SunOS 4.0, and is forthcoming in 4.4BSD. While this sounds like a good idea, it may not be on very fast computers. Providing user programs with completely general program state access on segmentation faults can be VERY expensive. I worked on one design whereby the kernel could do all the usual stuff (copy-on-write, etc) by "reeffecting" faulting instructions even in the face of out-of-order instruction completion, But letting the user program poke its nose around as described by the signal() interface requires "atomic trap behavior", and that makes a program run MUCH MUCH slower on some machines. For details, see my paper "Putting UNIX on Fast Computers" in the latest USENIX Anaheim Conference Proceedings. -Mike
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (07/22/90)
If my memory serves me correctly, EMAS Prolog (a Prolog interpreter written in IMP, running under the EMAS operating system on ICL 2900s) handled Prolog's several stacks by mapping each of them to a scratch file, leaving unmapped pages in between. This was _early_ 80s. The C Prolog interpreter was derived from EMAS Prolog, and it was going to use the same scheme just as soon as Berkeley implemented mmap (the 4.1 manuals said it would be in 4.2...). Isn't the name for this scheme "red pages"? -- Science is all about asking the right questions. | ok@goanna.cs.rmit.oz.au I'm afraid you just asked one of the wrong ones. | (quote from Playfair)
guy@auspex.auspex.com (Guy Harris) (07/24/90)
>The problem is that certain machine architectures (eg. Motorola 68K) and OS >implementations (eg. SunOS at least in some versions) do not provide >a continuable address violation signal (SIGSEGV), even though at the kernel >level, address translation faults (page faults) are continuable. Not having >looked at the insides of those machine/OS combinations, I suspect that >enough instruction execution context can be saved for filling a page fault >in the kernel, but not enough for reentering the faulting process to >handle a Unix signal and allocate the needed storage. The amount of instruction execution context is the same in both cases; the only difference is where it has to be stored. I think SunOS 4.1 may store enough of it outside the kernel stack to permit *one* such fault to be continued from (i.e., don't expect to be able to return from a SIGSEGV that occurs while you're handling a SIGSEGV). Part of the problem is that Motorola: 1) wouldn't commit to the the "stack puke" stored by the 680[1andup]0 being "safe" to hand to user-mode code; i.e., they wouldn't say "nothing you can do to the 'stack puke' is risky"; and 2) wouldn't describe the format of the "stack puke" to the extent necessary to have the kernel validate it. I can see their not doing so as being perfectly legitimate; for all I know, different revisions of the same chip may have different "stack puke" formats, and even if they don't, they might not want any of that stuff to be considered a "feature", and have people then write code dependent on that stuff and lock them into continuing to provide those "features". It does, however, complicate the task of allowing user-mode code to continue from a SIGSEGV. >These observations are the result of practical experiments 5 or so years >ago with Sun 2's and VAXen running Berkeley Unix. The former could not >recover correctly from the segmentation violation (PC corrupted on return >from the signal), the latter could. The former has more context than the latter; the former has the 68010 "stack puke", the latter has, as I remember, just the First Part Done bit (and some of the general-purpose registers, for some of the long-running instructions like MOVCn). The latter is safe and, I think, appears in the "signal context" saved by a BSD signal, so that the instruction can be continued from user mode without the kernel having to tuck away one or more sets of context. >Newer machines/architectures might handle this better, I think most RISC machines (not entirely surprisingly) have less or no context of that sort; I'd expect things to work OK on a SPARC-based Sun, for example, as well as a MIPS-based machine. In fact, what architectures other than the 68K architectures have lots of context for that? I don't think the 386 or the WE32K, for example, have that problem.
jan@swi.psy.uva.nl (Jan Wielemaker) (07/24/90)
Computers and operating systems are (should be) designed to make life easier for the user and programmer. Fernando Pereira claims that the overhead of stack shifting and stack overflow checks is not very large, that this schema is much better portable and that it is less critical to OS errors. All these arguments are correct. There is another reason for which I don't like them. I do not open /dev/rxy0c to read and write files because I do not trust the OS disk cache and file system handling. Instead, I tend to use open(), read(), write() and close(). This problem is a bit similar. Using the MMU to handle the stacks, I get the following advantages: - I do not have to think where to do stack overflow checks or how to do them cleverly. [Question: how do you know how much will be written on the trail before the next point where you can savely do a stack shift?] - I do not have to write a stack shifter, nor think about when to call this thing and how much to grow the stacks. For any such parameters there always are programs for which they have the wrong value. - If stack shifts were the only thing I wanted, I would not be considered keeping track of references to the stacks (from the virtual machine, foreign language code, etc.) as the stack does not move. [Unfortunately I need to keep track of all these things for the garbage collector]. - If I use malloc()/free() or a similar package for handling the other data (notably the program), I will need to write my own versions of them because the stacks need to be kept together to avoid large unused chunks of memory. [Unluckily a dedicated perfect fit algorith performs much better than malloc()/free() of most OS's, so I wrote a dedicated memory alocation system anyway, but I think the argument nevertheless holds. Also, my dedicated algorithm does nice on Prolog, but might not very well suit foreign language code. This code now still can use malloc()/free() without conflicting with Prolog.] - Still, it IS faster (provided it works). Having access to the MMU and the possibility to return from the SEGV handler I save all the research and programming effort for this. On SunOs (from version 4.0.1) this works nice for SUN-3 and SPARC. It also works on GOULD PN9000 under UTX 2.1. Probably there are more systems that offer this. Besides all this I even gain something concrete on these systems: I really can give memory resources back to the system, so after a query that used 4 MB of stack I will not claim these 4 MB for the rest of the session. I use this after calling the garbage collector and after finishing a user query (there is a Prolog predicate for deallocating everything above the used part of the stack). I consider to use it after deep backtracking if the used part of the stack is far below the allocated part. To me it is clear that access to the MMU simplifies the task of the implementor of notably multi-stack languages. The only real question is: can an OS/hardware *in principle* do sparse addressing as efficient as handling one continuous address space? If the answer to this question is negative there is a tradeof. If the answer is positive any decent OS should provide these facilities. Jan
alanf@bruce.cs.monash.OZ.AU (Alan Grant Finlay) (07/24/90)
I agree with those who point out that using MMU segment violations to trap stack overflow is not a good idea. Apart from the reasons which have already been given, where hardware is concerned it should be used as intended. What we really need is for hardware designers to recognise the need for multiple stack overflow detection and provide a uniform mechanism to support it. I suggest a simple "notify addr" instruction which saves a limited number of addresses in hardware registers and generates a trap when these addresses are accessed. The point is that if you want to open a can then you should use a can opener, not a hammer.
pereira@alice.UUCP (Fernando Pereira) (07/25/90)
In article <3261@swi.swi.psy.uva.nl> jan@swi.psy.uva.nl (Jan Wielemaker) writes: >Computers and operating systems are (should be) designed to make life >easier for the user and programmer. Fernando Pereira claims that the >overhead of stack shifting and stack overflow checks is not very large, >that this schema is much better portable and that it is less critical >to OS errors. All these arguments are correct. There is another reason >for which I don't like them. I do not open /dev/rxy0c to read and write >files because I do not trust the OS disk cache and file system handling. >Instead, I tend to use open(), read(), write() and close(). The difference is that the interface offered by system calls such as open() is part of the specification of the OS and so can be relied upon, whereas the continuability of a SIGSEGV is not and thus cannot be expected to exist in all implementations of Unix. Furthermore, even for a particular implementation, the signal may be continuable in certain circumstances but not others. As Guy Harris pointed out in a previous message, OS/machine designers are free to do here whatever they want, since the specification (such as it is) does not require continuability. My point was not that some *possible* OS could not support the proposed use of address exceptions, but that a widely used versions of the Unix OS do not. I was talking about *current* engineering choices, not about OS design. > - I do not have to think where to do stack overflow checks or > how to do them cleverly. [Question: how do you know how > much will be written on the trail before the next point > where you can savely do a stack shift?] Except for the general unifier, which would need to do the test every N'th time through its main loop (N some appropriate margin), a safe upper bound for # of trail entries can be computed for each clause. Furthermore, the general unifier must be made interruptible anyway, to allow other asynchronous signals or some desirable support for multi-threading. > - I do not have to write a stack shifter, nor think about when > to call this thing and how much to grow the stacks. For any > such parameters there always are programs for which they have > the wrong value. It's not *that* hard! I've designed and used stack shifting schemes for the last 12 years without being severely bitten by them. The crucial point is to use a (simple) adaptive algorithm to set expansion factors. > - If stack shifts were the only thing I wanted, I would not be > considered keeping track of references to the stacks (from > the virtual machine, foreign language code, etc.) as the stack > does not move. [Unfortunately I need to keep track of all > these things for the garbage collector]. You said it. You need that info for GC anyway. >Having access to the MMU and the possibility to return from the SEGV >handler I save all the research and programming effort for this. You are trading off language implementation effort to OS design and implementation effort. Unfortunately, one's ability to influence OS design to address all our particular needs seems rather limited, and after 15 years and involvement in several Prolog implementations I have learned painfully that relying on more than the 10% most used portions of an OS (you know, those that are used 90% of the time (-:)) is very dangerous for portability and reliability. Fernando Pereira 2D-447, AT&T Bell Laboratories 600 Mountain Ave, Murray Hill, NJ 07974 pereira@research.att.com
andrewt@cs.su.oz (Andrew Taylor) (07/25/90)
In article <11079@alice.UUCP> pereira@alice.UUCP () writes: >2. As I pointed out in an earlier message, it is not in general safe to rely >on being able to return from a segmentation violation in Unix. Even if it works >in your tests, it might fail for a violation occuring in particular processor >state, particularly if other aspects of the state (eg. page maps) are affected >by the signal function. Nothing in the OS specification guarantees that >segmentation violations can be returned from. And then there are OS bugs... > >3. Very large (even if sparse) address spaces are not free, because with them >one gets into multilevel page tables and other address translation overheads. >In my experience, VM implementations are optimized for relatively small >and compact address spaces on various common machines. >Large address spans can bring out the worst in paging algorithms. > > >A well-designed bounds check mechanism costs relatively little, is portable, >and does not depend on the vagaries of machine and OS implementation. >The idea of using VM for bounds check has come up over and over again (we >considered it for DEC-10 Prolog in 1978, for example!), but the payoff >always seems mediocre compared with its complication, lack of portability >and sensitivity to OS bugs. I've been working on native code compilation for the MIPS architectures. Even if you can afford to keep the a register for each stack limit, The saving in execution time from moving checking to hardware can be at least 20%. How you assess such a gain is dependant on your aims. My aim is high performance and I'll go to great lengths for a 20% gain. A demand for high performance may be rare yet in the Prolog world but it isn't elsewhere, to quote John Mashey (from MIPS) - "we've got customers who will commit vile acts for another ten percent" Returning from a segmentation violation isn't a problem on the MIPS nor on most RISCs I expect. I don't fully understand remark (3) but I can't see how having 5 address regions instead of 3 will cause poor VM performance. I'm more worried about multiple stacks causing poor cache performance because of conflicts. Invoking the garbage collector is a problem with hardware limit checking because the stacks are likely not to be in a consistent state when an exception occurs. One possibility, which I believe the ECRC's Sepia system uses?, is for the exception routine to set a global flag which is checked at CALL ports and the garbage collector invoked if it is set. This brings back some of the costs which we were trying to avoid. Its possible if all stack data words are tagged that a robust garbage collector could operate with inconsistant stacks. If this can be made to work and having all stack data tagged is not too onerous then this is the go. Alternatively on the MIPS the exception could invoke the garbage collector which simulates instruction execution until an instruction is reached where the stacks are known to be consistant (e.g a call instruction). This will definitely involve getting your hands dirty but I believe its feasible. Andrew