leipold@eplrx7.uucp (Walt Leipold) (05/07/91)
I've been looking at Scheme as an extension language for some otherwise-portable code, with particular attention to a pair of public-domain interpreters named "scm" and "siod" (scm is actually an adaptation of siod). While these interpreters are small and complete, their implementations include a detail that raises my hackles: both of 'em garbage-collect directly from the C stack and the registers. Of course, gc'ing from the stack permits automatic C variables to be used as Scheme CONS cells, which makes the interpreter *much* simpler. But my palms always sweat when I see code that uses #define STACKS_GROW_UP...

The stack is gc'd by saving the address of a main-program automatic variable, and scanning between there and the address of a current auto variable for anything that looks like a Scheme pointer. Registers are gc'd by calling setjmp() and then examining the register values saved in the jmp_buf structure. As a Portability Paranoid(TM), I find this frightening, since (for instance) I don't think that C is even required to use a stack to store activation records.

I'm pretty sure that gc'ing off the C stack is non-portable practice. Is it *good* practice? In how many current implementations are C activation records *not* allocated contiguously on a stack? Should I look elsewhere for a Scheme interpreter? (Or am I worrying too much?)

Thanks....

--
--------------------------------------------------------------------------
"When dealing with the insane,                                Walt Leipold
it is best to pretend to be sane."             (leipolw%esvax@dupont.com)
--------------------------------------------------------------------------
--
The UUCP Mailer
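The scanning scheme described in this post can be sketched in C roughly as follows. This is a minimal illustration, not the code from scm or siod: every name in it (heap, stack_anchor, gc_mark_roots, conservative_scan_demo, and so on) is made up for the example. It assumes a flat address space, a contiguous stack, and a jmp_buf that holds plain register values, which are exactly the assumptions the post is worried about.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; none are taken from scm or siod. */
typedef struct cell { struct cell *car, *cdr; int marked; } cell;

#define HEAP_CELLS 64
static cell heap[HEAP_CELLS];      /* toy heap: every cell lives here */
static uintptr_t stack_anchor;     /* address of a local in the outermost frame */

/* Conservative test: does this word have the value of a cell address? */
static int looks_like_cell(uintptr_t w)
{
    uintptr_t lo = (uintptr_t)&heap[0];
    uintptr_t hi = (uintptr_t)&heap[HEAP_CELLS];
    return w >= lo && w < hi && (w - lo) % sizeof(cell) == 0;
}

/* Mark every cell whose address appears in the words between a and b. */
static void mark_range(uintptr_t a, uintptr_t b)
{
    const uintptr_t *p, *end;
    if (a > b) { uintptr_t t = a; a = b; b = t; }  /* either stack direction */
    p = (const uintptr_t *)a;
    end = (const uintptr_t *)b;
    for (; p < end; p++)
        if (looks_like_cell(*p))
            ((cell *)*p)->marked = 1;
}

/* Scan the saved registers, then the stack from here back to the anchor.
   Reading the stack this way is outside what the C standard guarantees. */
static void gc_mark_roots(void)
{
    jmp_buf regs;
    volatile uintptr_t here = 0;   /* a local in the deepest frame */
    volatile uintptr_t sp;         /* hides the address's origin from the optimizer */
    size_t nbytes = sizeof regs / sizeof(uintptr_t) * sizeof(uintptr_t);

    (void)setjmp(regs);            /* copy callee-saved registers into regs */
    mark_range((uintptr_t)&regs, (uintptr_t)&regs + nbytes);
    sp = (uintptr_t)&here;
    mark_range(sp, stack_anchor);
}

/* Calls go through volatile function pointers so the compiler cannot
   inline them and merge the stack frames this demo relies on. */
static void (*volatile collect)(void) = gc_mark_roots;

static void holds_root(void)
{
    cell *volatile root = &heap[3];  /* volatile keeps the pointer in stack memory */
    collect();
    (void)root;
}

static void (*volatile run)(void) = holds_root;

/* Returns nonzero if the cell referenced only from the C stack was found. */
int conservative_scan_demo(void)
{
    volatile uintptr_t anchor = 0;   /* plays the "main-program automatic variable" */
    stack_anchor = (uintptr_t)&anchor;
    heap[3].marked = 0;
    run();
    return heap[3].marked;
}
```

Where those assumptions hold, conservative_scan_demo() reports that the cell was marked. The scan is conservative: any word that happens to equal a cell address keeps that cell alive, so it can retain garbage but should not free a live cell. Memory-checking tools that guard stack frames may object to the scan itself.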
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/07/91)
leipold@eplrx7.uucp (Walt Leipold) writes:

>The stack is gc'd by saving the address of a main-program automatic
>variable, and scanning between there and the address of a current auto
>variable for anything that looks like a Scheme pointer. Registers are
>gc'd by calling setjmp() and then examining the register values saved in
>the jmp_buf structure. As a Portability Paranoid(TM), I find this
>frightening, since (for instance) I don't think that C is even required
>to use a stack to store activation records.

Or at least not a stack implemented as a contiguous array. The C on IBM 370 seems to work just fine using the saveareas that said architecture commonly uses. Think of saveareas as a stack whose activation records are chained together doubly-linked. They can be allocated, but don't have to be.

>I'm pretty sure that gc'ing off the C stack is non-portable practice. Is
>it *good* practice? In how many current implementations are C activation
>records *not* allocated contiguously on a stack? Should I look elsewhere
>for a Scheme interpreter? (Or am I worrying too much?)

What about systems that can detect a stack overflow and recover by creating an extension that is non-contiguous (with all the appropriate stuff put in to make sure you get back when exiting)?

--
/***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu | Guns don't aim guns at  \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!! /
\***************************************************************************/
chased@rbbb.Eng.Sun.COM (David Chase) (05/08/91)
leipold@eplrx7.uucp (Walt Leipold) writes:

> ... While these interpreters are small and complete, their
>implementations include a detail that raises my hackles: both of 'em
>garbage-collect directly from the C stack and the registers. Of course,
>gc'ing from the stack permits automatic C variables to be used as Scheme
>CONS cells, which makes the interpreter *much* simpler. But my palms
>always sweat when I see code that uses #define STACKS_GROW_UP...
>...
>I'm pretty sure that gc'ing off the C stack is non-portable practice. Is
>it *good* practice? In how many current implementations are C activation
>records *not* allocated contiguously on a stack? Should I look elsewhere
>for a Scheme interpreter? (Or am I worrying too much?)

There's good news and bad news:

G) It usually works. Similar techniques are used by the Boehm-Weiser garbage collector, and I know that it has been ported to many flavors of Unix (Mach/386, SunOS/68K&Sparc, Vax, MIPS, HP). That's still not 100% of the Unix machines out there, but it is a hell of a lot. People get a little carried away about portability, anyway. Garbage collection is pretty important, and the evil bits can be isolated in a few machine-dependent files. See, for example, arisia.xerox.com:~ftp/pub/gc.tar.Z

B) Optimizers are getting too good, and the quest for ever-higher SPECmarks is leading compiler-writers to squeeze every scrap of wiggle-room out of the standard (now that there is one). (This discussion has been had before on other newsgroups.) Removing this wiggle room means that pointers need not be encoded in any way that you think is "sensible" or recognizable to the garbage collector. Some people on newsgroups may argue otherwise, but the only productive approach to the problem is to try to convince people that there is a market for such extensions to the language. (And I'd be very happy to work on the problem, once someone was motivated to pay me to work on it. Till then, it is just a hobby.)
Alternatives include turning off the optimizer, or maintaining a linked list (stack, really) of pointers stored in stack frames. Note that this second alternative may preclude use of setjmp and longjmp over frames with associated heap pointers (clever implementation tricks may help here).

David Chase
Sun
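The second alternative, a linked list of pointers kept in stack frames, might look something like the sketch below. The names (root_frame, PUSH_ROOT, POP_ROOT, mark_from_roots, shadow_stack_demo) are invented for this illustration and are not from any of the interpreters or collectors mentioned in the thread. Each function registers the address of every local that holds a heap pointer, so the collector walks the list instead of guessing at the C stack.

```c
#include <stddef.h>

/* All names are hypothetical, chosen for this sketch only. */
typedef struct cell { struct cell *car, *cdr; int marked; } cell;

/* One list entry per registered root: the address of a local cell pointer.
   The entries themselves live in the stack frames of the registering code. */
typedef struct root_frame {
    cell **slot;
    struct root_frame *prev;
} root_frame;

static root_frame *root_top = NULL;   /* most recently registered root */

/* Register and unregister a local variable; calls must nest properly. */
#define PUSH_ROOT(var) \
    root_frame var##_frame = { &(var), root_top }; \
    root_top = &var##_frame
#define POP_ROOT(var) (root_top = var##_frame.prev)

/* Mark everything reachable from c: recurse on car, iterate on cdr. */
static void mark(cell *c)
{
    while (c != NULL && !c->marked) {
        c->marked = 1;
        mark(c->car);
        c = c->cdr;
    }
}

/* The collector's root scan: walk the explicit list, never the C stack. */
void mark_from_roots(void)
{
    root_frame *f;
    for (f = root_top; f != NULL; f = f->prev)
        mark(*f->slot);
}

/* Builds the list a -> b, leaves c unreachable, and counts marked cells. */
int shadow_stack_demo(void)
{
    static cell a, b, c;
    cell *list;

    a.car = NULL; a.cdr = &b; a.marked = 0;
    b.car = NULL; b.cdr = NULL; b.marked = 0;
    c.car = NULL; c.cdr = NULL; c.marked = 0;

    list = &a;
    PUSH_ROOT(list);
    mark_from_roots();
    POP_ROOT(list);

    return a.marked + b.marked + c.marked;
}
```

Here shadow_stack_demo() returns 2: a and b are reachable from the registered root and c is not. The setjmp/longjmp caveat is visible in the sketch: a longjmp past a frame that ran PUSH_ROOT skips its POP_ROOT, leaving root_top pointing into a dead frame unless the jump site saves and restores root_top itself.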