[comp.lang.lisp] LISP compiler?

gdo@cornella.cit.cornell.edu (10/31/89)

Here's a novice question:
Does a lisp compiler exist? I know emacs can byte-compile lisp code,
and some lisp interpreters can optimize lisp code, but is there a
*real* lisp compiler in existence?

I assume there is, since emacs is written in lisp, and it's been
compiled.

asg@pyuxf.UUCP (alan geller) (11/09/89)

In article <20585@mimsy.umd.edu>, folta@tove.umd.edu (Wayne Folta) writes:
> "
> "Most of the LISPs I have used, however, have a "real compiler", ie one
> "that compiles LISP into machine code.  
> "
> I have a novice question concerning this. My LISP (Apple ACL) will compile
> LISP down to (apparently) machine code.  I can see it when I use the
> disassemble function.  However, it appears that the compiled code does
> calls to the LISP kernel.  Thus, the code is compiled, but not what us
> rookies would call "real compiled", i.e. stand-alone executable files.
> 
> I wonder if this is a common C-programmer misconception.  When we think of
> "compile" we mean "compile to a standalone executable", while in LISP it
> could mean "compile to machine code that still depends on the LISP kernel
> at run time."  Is this true?  [In ACL, you compile your routines, then you
> create a new top-level loop, replacing LISP's, then you use the Standalone
> Application Generator to create a standalone application.  In version
> 1.2.2, it looks like you get all of LISP in the bargain, as the minimum
> executable size thus generated is at least 700K (as I remember).]
> 
> So, maybe questions about "[real] LISP compilers" are different from what
> LISP experts think they are?  (Shades of the "real multitasking" debates
> in the Macintosh groups :-).)
> 
> Wayne Folta          (folta@tove.umd.edu  128.8.128.42)


This is, indeed, a C-programmer misconception.  When you compile and link
a 'C' program, you also wind up including a substantial amount of
run-time support code:  the 'C' library functions, glue routines for calling
the operating system (the ToolBox on the Mac), etc.

It's true that applications generated from ACL are much bigger than an
equivalent application written in 'C' (for small applications); this is
because the ACL run-time support code does much more than the 'C' run-time.
For instance, you get a complete garbage-collected storage management
facility, as opposed to glue into _NewHandle and _NewPointer.  You also
get all of the Common Lisp standard functions, including eval and apply,
which is to say you get a Lisp interpreter in your (compiled Lisp) application.
In ACL, you also get Object Lisp support, as well as all of the standard
pre-defined Object Lisp classes, which is to say you get the FRED editor
(emacs, to those of you who aren't familiar with ACL).

Obviously, your normal 'C' application doesn't get any of this from its
run-time support code.  For a small application, or even a medium-sized
one, you may use only a little bit of it, in which case all of that code
is essentially wasted space.  For a large application that wound up using
more of the ACL run-time support, you might actually wind up with a smaller
application than you would if you rewrote it in 'C', and you would certainly
save enormously in development time.

As an aside, the ACL group at Apple does realize that most applications don't
need all of the code that gets included, and they are working on ways to 
avoid including support that you don't use.

Anyway, the bottom line is that the ACL compiler, like most modern Lisp
compilers, really DOES compile your code into native machine code that
executes quite quickly (as an aside, including all of the extra support stuff
is mostly a problem for memory usage and disk usage; it won't directly
impact performance, although the increased memory usage does have an effect).

Alan Geller
Regal Data Systems
...!{princeton,rutgers}!bellcore!pyuxf!asg

Nobody at Bellcore listens to me; nobody at Regal does either.
My ideas and comments are my own.

kend@tekchips.LABS.TEK.COM (Ken Dickey) (11/10/89)

Reply-To: kend@mrloog.WR.TEK.COM (Ken Dickey)
Followup-To: a.out
Distribution: 
Organization: Tektronix, Inc., Beaverton,  OR.
Keywords: Lisp, Scheme, linking

In article <5081@internal.Apple.COM> chewy@apple.com (Paul Snively) writes:
>In article <89Nov7.234038est.2758@neat.cs.toronto.edu> 
>rayan@cs.toronto.edu (Rayan Zachariassen) writes:
>> chewy@apple.com (Paul Snively) writes:
>> 
>> >A future version of MACL WILL do a treewalking dead-code stripper,
>> >reducing application sizes tremendously.
>> 
>> yah, well, this is nice but a bit shortsighted as a general solution.
>> 
>> - What I really want is to be able to run small standalone lisp
>> applications.
>> I want to write cat in lisp (for example) without it ending up as a 2Meg
>> binary.  This is addressed by the pruner you mention (someone from Lucid
>> told me they have one or are planning one too).
>> 
>> - I want to run *many* different standalone lisp programs.  Imagine
>> 30% of unix commands being implemented in lisp.  See where the code
>> stripping idea falls short?  You'll have all this code duplicated in
>> most commands.
>> 
>> Solution?  Put the Lisp runtime support into a shared image, which the
>> applications link to when they start up, like (sun/svr4) shared libraries.
>> That is in fact what it is, the Lisp equivalent to libc.  All lisp
>> applications would share the runtime text, and have independent data
>> pages per process.
>> 
>> One would think this isn't too complicated to do.
>
>Not only is it difficult to do under the current Macintosh operating 
>system, it's impossible.
>

Let's separate out the two concerns:

  The key to being able to link an application--rather than GC and save
  an image--is not to call EVAL or the compiler.  Either of these requires
  full runtime support, since arbitrary code may be introduced.  Without
  them, linking is relatively straightforward (see the sketch below).

  The Mac's (lack of) OS problem (no multitasking, & friends) are of a
  more severe nature.
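
As a minimal illustration of the first point (the function names here
are only for the sake of the example):

  ;; Nothing below calls EVAL, COMPILE, or READ at run time, so its call
  ;; graph is closed and a linker can, in principle, discard everything
  ;; else in the Lisp image.
  (defun celsius->fahrenheit (c)
    (+ 32 (* 9/5 c)))

  (defun report-temperature (c)
    (format t "~A C is ~A F~%" c (celsius->fahrenheit c)))

  ;; This tiny listener, on the other hand, can name any function or
  ;; special form at run time, so the full runtime has to stay around.
  (defun tiny-listener ()
    (loop (print (eval (read)))))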


A couple of data points:
  
  Standalone Mac applications can be generated by LightShip's
  MacScheme+ToolSmith. 

  Cadence Research Systems' Chez Scheme creates sharable binaries
  (under Unix; I don't know if their VMS or Domain systems do this).
  They have an `Application Builder' which generates standalone
  binaries.


I hope this helps.

-Ken Dickey			kend@mrloog.WR.TEK.COM

kend@tekchips.LABS.TEK.COM (Ken Dickey) (11/10/89)

In article <4327@hplabsz.HPL.HP.COM> mayer@hplabs.hp.com (Niels Mayer) writes:
>I think the biggest problem with lisp is that it lets you be sloppy with
>memory management... most big lisp systems I've worked with generate
>garbage in low-level modules that must eventually be garbage collected.  If
>you try to optimize this by rewriting your modules to be "cons-free", the
>code becomes so spaghetti-like that you might as well be writing it in C.
>How can compilation solve the problem of lisp's memory inefficiencies?
>

I think that you are overgeneralizing your experience.  I have two suggestions:

  (1) Change your coding style so that your declarations are compiled
statically [i.e. write your code in a *clean* style which does not cons;
a small sketch follows below].  [I don't buy your spaghetti argument].

  (2) Get a runtime system which is written not to cons.  Demanding better
runtime libraries from your suppliers is a worthwhile occupation.
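
As a purely illustrative sketch of suggestion (1): the first version
below conses an intermediate list on every call; the second does the
same computation without allocating one.

  (defun sum-of-squares-consing (numbers)
    (reduce #'+ (mapcar #'(lambda (x) (* x x)) numbers)))

  (defun sum-of-squares-clean (numbers)
    (let ((sum 0))
      (dolist (x numbers sum)
        (incf sum (* x x)))))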


By the way, what do you mean by Lisp (MacLisp, InterLisp, Scheme,
CommonLisp, Franz...)?


-Ken Dickey			kend@mrloog.WR.TEK.COM

dlw@odi.com (Dan Weinreb) (11/10/89)

At least two implementations of Lisp on time-sharing systems in which
the runtime was shared between different user processes (only one copy
on disk, only one copy in main memory) existed over ten years ago:
PDP-10 Maclisp running under the ITS time-sharing system on PDP-10's
and DecSystem-20's at MIT, and Multics Maclisp running under Multics
on Honeywell 6180's at several sites.  Both worked essentially like
Sun shared libraries.  Unfortunately, my knowledge about modern Lisp
implementations is less detailed.

moore%cdr.utah.edu@cs.utah.edu (Tim Moore) (11/10/89)

In article <5140@internal.Apple.COM> chewy@apple.com (Paul Snively) writes:
>I'm not sure that the example of passing a LAMBDA expression to APPLY is 
>valid; there's nothing whatsoever to prevent LAMBDA expressions from being 
>compiled (I'm pretty sure that LAMBDA expressions in MACL are compiled, at 
>any rate).

The problem is that the functional argument to APPLY and FUNCALL can be
a lambda expression, which can be an arbitrary list, consed up at
runtime.  With the appropriate use of THE, you could specify that this
would never happen, or the compiler might be able to figure that out
itself through data flow analysis.  Good luck with the general case.

This is another example of why saying
    '(lambda (foo bar) (list foo bar))
when you mean
    #'(lambda (foo bar) (list foo bar))
is bad; the compiler can't do anything with the former (not to mention
lexical scoping issues).
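
A short sketch of the scoping half of that (function names purely
illustrative): the #' version closes over N, while the quoted list is
just data until something interprets it, by which time N is gone.

    (defun add-n-to-all (n items)
      (mapcar #'(lambda (x) (+ x n)) items))   ; fine: a real closure

    (defun add-n-to-all-oops (n items)
      (mapcar '(lambda (x) (+ x n)) items))    ; a consed list; even where
                                               ; this is accepted, N is
                                               ; unbound when it finally runs
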
Tim Moore                     moore@cs.utah.edu {bellcore,hplabs}!utah-cs!moore
"Ah, youth. Ah, statute of limitations."
		-John Waters

chewy@apple.com (Paul Snively) (11/11/89)

In article <21192@uflorida.cis.ufl.EDU> mfi@serc.cis.ufl.edu (Mark 
Interrante) writes:
> I view MACL
> as the prototyping language for Mac applications.

Apple views MACL as an environment for developing shippable applications, 
not merely for prototyping them.

In article <21192@uflorida.cis.ufl.EDU> mfi@serc.cis.ufl.edu (Mark 
Interrante) writes:
> As such I need a fast
> nice object system such as Apple CLOS, not PCL.

Of course we'll have our own CLOS, just not in 1.3. :-)

In article <21192@uflorida.cis.ufl.EDU> mfi@serc.cis.ufl.edu (Mark 
Interrante) writes:
> I need all the great
> examples that MACL has given out converted and consolidated to work
> with the new system.

By "new system," I presume you mean CLOS.  Of course we'll do that too.

In article <21192@uflorida.cis.ufl.EDU> mfi@serc.cis.ufl.edu (Mark 
Interrante) writes:
> I just want high level objects that correspond
> to MAC interface parts.

We've already got that, and will keep it.

In article <21192@uflorida.cis.ufl.EDU> mfi@serc.cis.ufl.edu (Mark 
Interrante) writes:
> In addition, I would like to see improved
> support for printing, and version7 stuff.

What exactly does this mean?  We will definitely support System 7.  What 
do you need relative to printing?

In article <21192@uflorida.cis.ufl.EDU> mfi@serc.cis.ufl.edu (Mark 
Interrante) writes:
> I think it is important that Apple keep MACL progressing in the
> direction of new features like CLOS and CLIM.  CLIM is important.
> Apple is in the interface/easy-to-use camp and CLIM is a UIMS, it seems
> like a natural fit.

We certainly agree about CLOS.  CLIM may or may not be offered as an 
option for the system.  It's not a priority for us, as we already have a 
user environment that we'd like to keep consistent, and which is, in our 
collective opinions, in many ways better than CLIM.

In article <21192@uflorida.cis.ufl.EDU> mfi@serc.cis.ufl.edu (Mark 
Interrante) writes:
> The inspector doesn't allow editing yet

It does if it can find the source code.

In article <21192@uflorida.cis.ufl.EDU> mfi@serc.cis.ufl.edu (Mark 
Interrante) writes:
> FRED hasn't changed

Actually, it has, but I need to know more about what you want from it.

In article <21192@uflorida.cis.ufl.EDU> mfi@serc.cis.ufl.edu (Mark 
Interrante) writes:
> the manual needs revising

Done, as of 1.3.

In article <21192@uflorida.cis.ufl.EDU> mfi@serc.cis.ufl.edu (Mark 
Interrante) writes:
> the interface generator
> tool is embarrassing in comparison to the NeXT.

Also vastly improved as of 1.3.

MACL 1.3 should be at APDA within the next two weeks or so (I just got the 
frozen code a few days ago).

__________________________________________________________________________
Just because I work for Apple Computer, Inc. doesn't mean that they 
believe what I believe or vice-versa.
__________________________________________________________________________
C++ -- The language in which only friends can access your private members.
__________________________________________________________________________

chewy@apple.com (Paul Snively) (11/11/89)

In article <4327@hplabsz.HPL.HP.COM> mayer@hplabsz.HPL.HP.COM (Niels 
Mayer) writes:
> In article <5057@internal.Apple.COM> chewy@apple.com (Paul Snively) writes:
> >Your description of what Macintosh Allegro Common Lisp does is completely
> >accurate, obviously, which is a fancy way of saying that the current
> >version(s) of MACL don't have what amounts to a "smart linker;" that is,
> >unused code from your application is NOT currently stripped out (although
> >strictly development-related things like the compiler itself are).  A
> >future version of MACL WILL do a treewalking dead-code stripper, reducing
> >application sizes tremendously.
> 
> So what does MACL's fancy dead-code stripper do when your program happens
> to call 'eval' or 'apply'?

It depends upon what the arguments to either of these functions are.  For
EVAL, I'd expect it to throw up its hands in despair, like MacScheme does.
Or perhaps it just needs to know if the expression can be "found" in the
current environment (i.e. (EVAL (READ)) wouldn't work).

__________________________________________________________________________
Just because I work for Apple Computer, Inc. doesn't mean that they 
believe what I believe or vice-versa.
__________________________________________________________________________
C++ -- The language in which only friends can access your private members.
__________________________________________________________________________

lou@bearcat.rutgers.edu (Lou Steinberg) (11/11/89)

In article <5031@tekcrl.LABS.TEK.COM> kend@tekchips.LABS.TEK.COM (Ken Dickey) writes:

>   The key to being able to link an application--rather than GC and save
>   an image--is not to call EVAL or the compiler.  Either of these must
>   have a full runtime support since arbitrary code may be introduced.
>   Without them, linking is relatively straight-forward.
> 
> -Ken Dickey			kend@mrloog.WR.TEK.COM

Yes, but....

Even if you don't call EVAL, your runtimes may have surprising calls
to EVAL in them.  E.g., in Common Lisp, if you use READ and don't
remove the #. read macro (which causes read-time evaluation of the
next s-expression read) from your readtable, then the "application
maker" can't remove any part of lisp.
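
As a rough sketch, one way to shut that door is to replace the #.
dispatch function with one that refuses to evaluate (this is only an
illustration, not something from the original post):

    (set-dispatch-macro-character
      #\# #\.
      #'(lambda (stream subchar arg)
          (declare (ignore stream subchar arg))
          (error "Read-time evaluation (#.) is disabled in this application.")))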

Worse, what about functions like APPLY?  If you just give compiled
functions to apply you are OK, but APPLY can also take a LAMBDA
expression, i.e. it needs to be able to call the interpreter,
depending on the type of argument it is given.  It may not be so easy
for a compiler to determine whether a given argument will, at runtime,
be a lambda expression.  And suppose the argument is a symbol as in
(apply 'foo ...).  How easily can the compiler tell that at runtime
foo's definition will be compiled code rather than uncompiled code?

Well, suppose we do without APPLY.  Unpleasant, but maybe worth it?
But then what about all the other functions that take a function as an
argument, like MAPCAR and SORT?

About the only solution I see is to invent some kind of declaration
that lets the programmer inform the compiler / application maker that
it can safely remove EVAL.
-- 
					Lou Steinberg

uucp:   {pretty much any major site}!rutgers!aramis.rutgers.edu!lou 
arpa:   lou@cs.rutgers.edu

chewy@apple.com (Paul Snively) (11/11/89)

In article <Nov.10.13.33.46.1989.3812@bearcat.rutgers.edu> 
lou@bearcat.rutgers.edu (Lou Steinberg) writes:
[All kinds of stuff about functional arguments]

Actually, it's not as bad as it sounds; the whole point behind having a 
treewalker is to be able to determine whether or not the parameters to 
some of these "metafunctions" are completely (deterministically) specified 
within the context of the Lisp environment or not; (EVAL (READ)) is just 
the most obvious example of one that isn't.

I'm not sure that the example of passing a LAMBDA expression to APPLY is 
valid; there's nothing whatsoever to prevent LAMBDA expressions from being 
compiled (I'm pretty sure that LAMBDA expressions in MACL are compiled, at 
any rate).

__________________________________________________________________________
Just because I work for Apple Computer, Inc. doesn't mean that they 
believe what I believe or vice-versa.
__________________________________________________________________________
C++ -- The language in which only friends can access your private members.
__________________________________________________________________________

barmar@leander.think.com (Barry Margolin) (11/11/89)

In article <1989Nov10.154943.18376@hellgate.utah.edu> moore%cdr.utah.edu@cs.utah.edu (Tim Moore) writes:
>The problem is that the argument to APPLY and FUNCALL is a lambda
>expression, which can be an arbitrary list, consed up at runtime.

This particular feature is being taken out of ANSI Common Lisp.  The
definition of the FUNCTION data type is being tightened, and it will refer
to a specific, distinguishable data type.  The FUNCTION special form and
the SYMBOL-FUNCTION function will be guaranteed to return objects of this
type.  Functions that take functional arguments will accept either a
function or a symbol (automatically calling SYMBOL-FUNCTION on the latter);
they will no longer take lambda expressions.  Lambda expressions will be
coercible to functions with (COERCE <lambda exp> 'FUNCTION); this is
defined to be equivalent to (EVAL '(FUNCTION <lambda exp>)).
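
In concrete terms (a small sketch of the tightened rules):

    (sort (list 3 1 2) #'<)   ; a function object: fine
    (sort (list 3 1 2) '<)    ; a symbol: SYMBOL-FUNCTION gets applied
    ;; A lambda expression must be coerced first; note that the COERCE
    ;; is exactly what drags EVAL back in, as described below.
    (funcall (coerce '(lambda (x) (* x x)) 'function) 5)   ; => 25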

So, returning to the issue of the code walker that removes unnecessary
runtime functions, APPLY, FUNCALL, MAP, etc. will not force inclusion of
everything.  EVAL and COMPILE definitely will, and COERCE may if its second
argument is non-constant or the symbol FUNCTION.  READ also may force
inclusion of everything, unless you've disabled "#.".
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

chewy@apple.com (Paul Snively) (11/12/89)

In article <5130@internal.Apple.COM> chewy@apple.com (that's me) wrote:
> CLIM may or may not be offered as an 
> option for the system.  It's not a priority for us, as we already have a 
> user environment that we'd like to keep consistent, and which is, in our 
> collective opinions, in many ways better than CLIM.

Ouch.  The first sentence is true; everything after that, I'm told, is 
quasi-religious nonsense.  Apple is more interested now than we ever have 
been before in connectivity (witness our relatively-new TCP/IP support and 
our already-announced-but-not-yet-shipping X-Windows server for the Mac 
OS).

Mea culpa, mea maxima culpa.  My apologies to CLIM fans everywhere.  By 
the way, in the future, I'll make every attempt to address technical 
concerns regarding MACL here (since that's my job), and every attempt to 
let the MACL product manager address "future directions" issues (since 
that's her job).

__________________________________________________________________________
Just because I work for Apple Computer, Inc. doesn't mean that they 
believe what I believe or vice-versa.
__________________________________________________________________________
C++ -- The language in which only friends can access your private members.
__________________________________________________________________________

asg@pyuxf.UUCP (alan geller) (11/13/89)

As long as this newsgroup is temporarily becoming a forum for requesting
enhancements to Mac Allegro Common Lisp, for the 2.0 release:

1) Would it be possible to use an incremental garbage collector, rather than
	the current stop-and-copy?  Given that MACL 2.0 will usually be
	running under System 7.0, we'll have lots of virtual address space
	for from-space and to-space, maybe even multiple generations.
	For running MACL as a prototyping environment, this isn't a big
	deal, but in a real application it annoys users when their cursor
	changes to 'GC' and their system freezes for 10 seconds.
	Even if there were a 'MACL 2.0 extended' that had an incremental
	garbage collector as an extra-cost option, that would be better than
	nothing.

2) Serious rumor has it that MACL 2.0 will support CLOS directly, rather than
	by using a generic implementation like PCL.  I applaud this, but I
	have a couple of requests along this line:  please make sure that
	print and describe are implemented as generic functions!  For those
	of us who use objects extensively, this is very important.  Since
	the current Object Lisp does this right, I assume that MACL 2.0 will
	also, but I just want to stress its importance.

Alan Geller
Regal Data Systems
...!{rutgers,princeton}!bellcore!pyuxf!asg

miller@CS.ROCHESTER.EDU (Brad Miller) (11/14/89)

[Protect me from mailer bugs!]

    Date: 10 Nov 89 17:45:10 GMT
    From: chewy@apple.com (Paul Snively)

    We certainly agree about CLOS.  CLIM may or may not be offered as an 
    option for the system.  It's not a priority for us, as we already have a 
    user environment that we'd like to keep consistent, and which is, in our 
    collective opinions, in many ways better than CLIM.

I'd like to hear more about this; 

1) I don't understand in what ways it is better than CLIM

2) I thought the point of CLIM was to make UI code portable across
environments; thus stuff I develop (say) on my Symbolics I can get to run
pretty much identically on the Mac. Not supporting CLIM (or some generic
standardized interface manager; you may have specific quibbles with CLIM in
particular) hinders this.

Seems like CLIM is still in the early stages, so the strategy might be to
get what "advantages" you see in your system into CLIM rather than be on
your own....

lou@bearcat.rutgers.edu (Lou Steinberg) (11/14/89)

In article <31645@news.Think.COM> barmar@leander.think.com (Barry Margolin) writes:

> Functions that take functional arguments will accept either a
> function or a symbol (automatically calling SYMBOL-FUNCTION on the latter);
> they will no longer take lambda expressions.

You still have to be sure somehow that the user hasn't consed up a
runtime list and used setf of symbol-function to store it as the
definition of some symbol.

(By the way, as at least one poster has pointed out, in my original
message when I spoke about "lambda expressions" causing problems, what
I meant was "lists consed up at runtime being used as function
definitions".  My sloppy wording appears to have confused several
people, for which I apologize.)
-- 
					Lou Steinberg

uucp:   {pretty much any major site}!rutgers!aramis.rutgers.edu!lou 
arpa:   lou@cs.rutgers.edu

barmar@leander.think.com (Barry Margolin) (11/14/89)

In article <Nov.13.16.59.55.1989.4053@bearcat.rutgers.edu> lou@bearcat.rutgers.edu (Lou Steinberg) writes:
>In article <31645@news.Think.COM> barmar@leander.think.com (Barry Margolin) writes:
>> Functions that take functional arguments will accept either a
>> function or a symbol (automatically calling SYMBOL-FUNCTION on the latter);
>> they will no longer take lambda expressions.
>You still have to be sure somehow that the user hasn't consed up a
>runtime list and used setf of symbol-definition to store it as the
>definition of some symbol.

Nope, that won't be permitted.  When you do 

	(setf (symbol-function <symbol>) <expression>)

the value of <expression> will have to be a function object; a list will
not be acceptable in portable code.  If <expression> is a list, then you'll
have to do

	(setf (symbol-function <symbol>) (coerce <expression> 'function))
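
For instance (SQUARE here is just an arbitrary symbol for illustration):

	(setf (symbol-function 'square) #'(lambda (x) (* x x)))  ; a function: OK

	(setf (symbol-function 'square)
	      (coerce '(lambda (x) (* x x)) 'function))          ; coerced first: OK

	;; (setf (symbol-function 'square) '(lambda (x) (* x x)))
	;; -- a raw list: not portable
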
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

barmar@leander.think.com (Barry Margolin) (11/16/89)

In article <616@pyuxf.UUCP> asg@pyuxf.UUCP (alan geller) writes:
>2) Serious rumor has it that MACL 2.0 will support CLOS directly, rather than
>	by using a generic implementation like PCL.  I applaud this, but I
>	have a couple of requests along this line:  please make sure that
>	print and describe are implemented as generic functions!

The current state of things in X3J13 specifies that DESCRIBE and PRINT call
the generic functions DESCRIBE-OBJECT and PRINT-OBJECT.  So, it's not CLOS
unless they do this.
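
A small sketch of what that buys you (the POINT class is only an example):

	(defclass point () ((x :initarg :x) (y :initarg :y)))

	(defmethod print-object ((p point) stream)
	  (with-slots (x y) p
	    (format stream "#<POINT ~A,~A>" x y)))

	(print (make-instance 'point :x 3 :y 4))   ; prints #<POINT 3,4>
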
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

jac@cl.cam.ac.uk (John Carroll) (11/17/89)

In article <616@pyuxf.UUCP> asg@pyuxf.UUCP (alan geller) writes:
>
>2) Serious rumor has it that MACL 2.0 will support CLOS directly, rather than
>	by using a generic implementation like PCL.  I applaud this, but I
>
>...

Why are people waiting for MACL 2.0 in order to get CLOS on the Mac?  2.0
must be some way off, since 1.3 has only just been released.

PROCYON Common Lisp already has a directly-supported implementation of CLOS
(which is integrated into all the graphics stuff). Procyon's CLOS came out
in the early summer and they claim it to be fast.

(US distributors are Expertelligence  805 9671797
Rest of world are Procyon Research Ltd  +44 223 421221)

John Carroll
jac@cl.cam.ac.uk

malczews@girtab.usc.edu (Frank Malczewski) (11/19/89)

In article <1659@gannet.cl.cam.ac.uk> jac@cl.cam.ac.uk (John Carroll) writes:
>In article <616@pyuxf.UUCP> asg@pyuxf.UUCP (alan geller) writes:
>>
>>2) Serious rumor has it that MACL 2.0 will support CLOS directly, rather than
>>	by using a generic implementation like PCL.  I applaud this, but I
>>
>>...
>
>Why are people waiting for MACL 2.0 in order to get CLOS on the Mac? 2.0
>must be some way off, since 1.3 has only just been released.
>
>PROCYON Common Lisp already has a directly-supported implementation of CLOS
>(which is integrated into all the graphics stuff). Procyon's CLOS came out
>in the early summer and they claim it to be fast.
>
>(US distributors are Expertelligence  805 9671797
>Rest of world are Procyon Research Ltd  +44 223 421221)
>
>John Carroll
>jac@cl.cam.ac.uk



Yes, Procyon CL has just about everything, at a price.  It is quite expensive
in comparison to MACL, so some of us choose to wait a (hopefully not too long)
while for MACL to catch up.

--

  -- Frank Malczewski		(malczews@girtab.usc.edu)

gintera@cpsc.ucalgary.ca (Andrew Ginter) (11/24/89)

In article <616@pyuxf.UUCP>, asg@pyuxf.UUCP (alan geller) writes:
> 1) Would it be possible to use an incremental garbage collector, rather than
> 	the current stop-and-copy?  Given that MACL 2.0 will usually be
> 	running under System 7.0, we'll have lots of virtual address space
> 	for from-space and to-space, maybe even multiple generations.

A real time (parallel or incremental) garbage collector is roughly
twice as costly as a comparable stop-and-collect collector.  The real
time version has to collect more frequently than stop-and-copy because
the real time version has to start its collection well before memory
is exhausted.  A real time "two space copying" collector is even more
expensive because of the cost of checking for a forwarding pointer on
every reference to GC managed memory.  Real time collectors penalize
real performance while improving some aspects of perceived
performance.

So if you're going to add a real time collector, make it an option.

Andrew Ginter, 403-282-2984, gintera@CPSC.UCALGARY.CA, Ginter@UNCAMULT.BITNET

bonham@wglen.UUCP (Mike Bonham) (11/25/89)

In article <2159@cs-spool.calgary.UUCP>, gintera@cpsc.ucalgary.ca (Andrew Ginter) writes:
> 
> A real time (parallel or incremental) garbage collector is roughly
> twice as costly as a comparable stop-and-collect collector.  The real
> time version has to collect more frequently than stop-and-copy because
> the real time version has to start its collection well before memory
> is exhausted.  A real time "two space copying" collector is even more
> expensive because of the cost of checking for a forwarding pointer on
> every reference to GC managed memory.  Real time collectors penalize
> real performance while improving some aspects of perceived
> performance.
> 
> So if you're going to add a real time collector, make it an option.
> 
As Mr. Ginter correctly points out, the total number of CPU cycles expended
to perform incremental garbage collection is substantially greater than for
an equivalent stop-and-collect algorithm.

But as he also mentions (but perhaps doesn't emphasize enough, IMHO), if an
application involves an interactive user interface (as many Lisp programs
do!!!), then incremental GC can greatly improve the *perceived* responsiveness.
This is primarily due to two factors:

Firstly, the system can do garbage collection while waiting for the next
user-supplied command or input event, when CPU cycles would otherwise go
to waste.

Secondly, when the next user-event does come in, incremental garbage collectors
can be suspended immediately, whereas stop-and-collect garbage collectors
must run to completion (or else be aborted and rolled-back, if that's
possible and preferable).

There is no reason one can't have a two-collector system: a background,
incremental GC can clean up whatever it can during user interface rest-stops,
but when heavy computations run completely out of space the heavy-duty,
highly efficient, stop-and-collect GC is called in.
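
A rough sketch of that two-collector arrangement (NEXT-EVENT-AVAILABLE-P,
HANDLE-EVENT, INCREMENTAL-GC-STEP, TRY-ALLOCATE and FULL-GC are all
hypothetical stand-ins for whatever the implementation actually provides):

  (defun event-loop ()
    (loop
      (if (next-event-available-p)
          (handle-event)              ; user input: stay responsive
          (incremental-gc-step))))    ; idle time: do a little GC work

  (defun allocate (n-bytes)
    ;; Only when an allocation fails outright do we pay for the
    ;; heavy-duty stop-and-collect pass.
    (or (try-allocate n-bytes)
        (progn (full-gc)
               (try-allocate n-bytes))))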

There *are* incremental GC methods (eg. linear mark-and-sweep) that don't
impose the continuous run-time penalty of checking for forwarding pointers.
Fragmentation can be dealt with periodically during a stop-and-copy GC.

Properly-done background incremental GC ought to be a win in typical
interactive applications, and no great burden in compute-bound applications.

-- Mike Bonham       CPSC.UCalgary.CA!wglen!bonham
-- 

kend@tekchips.LABS.TEK.COM (Ken Dickey) (11/25/89)

> A real time (parallel or incremental) garbage collector is roughly
> twice as costly as a comparable stop-and-collect collector.  The real
> time version has to collect more frequently than stop-and-copy because
> the real time version has to start its collection well before memory
> is exhausted.  A real time "two space copying" collector is even more
> expensive because of the cost of checking for a forwarding pointer on
> every reference to GC managed memory.  Real time collectors penalize
> real performance while improving some aspects of perceived
> performance.

This is not the case if you use your VM hardware.  See "Real-time
Concurrent Collection on Stock Multiprocessors" by Appel, Ellis, &
Li; Princeton U. tech. report: CS-TR-133-88.  The method used also
allows multiple allocators.


-Ken Dickey		kend@mrloog.WR.TEK.COM

caroma@wheaties.ai.mit.edu (Carl R. Manning) (11/28/89)

In article <4@wglen.UUCP> bonham@wglen.UUCP (Mike Bonham) writes:
>...if an
>application involves an interactive user interface (as many Lisp programs
>do!!!), then incremental GC can greatly improve the *perceived* responsiveness.
>...
>-- Mike Bonham       CPSC.UCalgary.CA!wglen!bonham
>-- 
If you're just dealing with human interfaces rather than real time
requirements, then incremental garbage collection isn't the only way
to improve perceived performance --- instead you can try to schedule the long
scavenges during otherwise unresponsive periods, e.g. during
compute-bound operations or during idle periods.  For more on this idea,
see Section 4: Scavenge Scheduling, (p 29) of: 

   Design of the Opportunistic Garbage Collector
   Paul R. Wilson and Thomas G. Moher
   OOPSLA'89 Proceedings, 1-6 October, 1989
   SIGPLAN Notices vol24 n10, p 23-35

(This paper also includes some other ideas for implementing
generational GC on stock hardware and some performance measurements.)

(Just passing on a reference -- I have no experience using it.)

CarlManning@ai.mit.edu

wilson@carcoar.Stanford.EDU (Paul Wilson) (11/29/89)

In article <5145@wheat-chex.ai.mit.edu> CarlManning@ai.mit.edu writes:
>In article <4@wglen.UUCP> bonham@wglen.UUCP (Mike Bonham) writes:
>If you're just dealing with human interfaces rather than real time
>requirements, then incremental garbage collection isn't the only way
>to improve perceived performance --- instead you can try to schedule the long
>scavenges during otherwise unresponsive periods, e.g. during
>compute-bound operations or during idle periods.  For more on this idea,
>see Section 4: Scavenge Scheduling, (p 29) of: 
>
>   Design of the Opportunistic Garbage Collector
>   Paul R. Wilson and Thomas G. Moher
>   OOPSLA'89 Proceedings, 1-6 October, 1989
>   SIGPLAN Notices vol24 n10, p 23-35
>
>(This paper also includes some other ideas for implementing
>generational GC on stock hardware and some performance measurements.)
>
>(Just passing on a reference -- I have no experience using it.)
>
>CarlManning@ai.mit.edu

  I should probably point out that ours is not the first gc to improve
  perceived responsiveness by opportunistic scheduling.  On the other
  hand, we use opportunism both to improve responsiveness *and* to
  actually increase efficiency.

  Jon L White has reminded me that the Interlisp gc used the strategy
  of preferentially scheduling gc work during long user pauses ("think
  time").

  Also, the Symbolics systems make the background scavenger of their
  incremental gc work harder during pauses than during program execution.
  (Even though the system is incremental, performance can be noticeably
  degraded by the background scavenger, so it's better for it to do
  its job when the user doesn't care.)

  Our gc is different in that it schedules gc's preferentially when
  there is likely to be less work to do. It's a copying collector,
  so the work it does is proportional to the amount of *live* data.

  By scheduling scavenges between major computations, we not only
  hide pauses but *reduce* them, since there are usually fewer
  live objects (holding intermediate results of ongoing computations)
  then.  This didn't happen with the Interlisp gc, because it
  wasn't a copy collector.  It also doesn't happen with Symbolics
  gc because it is the scheduling of *flips* (root set scans) that
  primarily determines the amount of data seen as live by the gc.


  There are really several different issues here.  One is efficiency,
  another is disruptiveness, and another is real-time response.  Only
  a fine-grained incremental gc can guarantee short pauses and
  real-time response.  A non-incremental generational gc can usually
  come close enough for most people's purposes.  (If you're willing
  to do a little tuning, you may be able to guarantee acceptable
  pauses for a  particular application by forcing scavenges of
  the right numbers of generations at the right times.  Our system
  attempts to do this acceptably well automatically, but may fail.)

  Appel, Ellis, and Li's gc avoids the continual checking for
  forwarding pointers that a fine-grained incremental gc requires.
  I wouldn't quite call it real-time, though, because the unit of
  work is fairly large (copying the referents of all of the pointers
  in a page) and these units may come in bursts.  It may not work
  as well for the youngest generation of a generational system,
  because a higher percentage of pages in younger generations is
  likely to be touched in a short period of time.  So you still
  may need tuning or opportunism, and you might want to use
  a hybrid scheme.  (For example, opportunistic generation scavenging
  for the youngest generation, and Appel-Ellis-Li style incremental
  scavenging of older generations.)  For difficult real-time
  applications, you'll still need a fine-grained incremental gc
  with its pointer-checking overhead.


  Another issue our paper dealt with is the number of generations a
  generational gc should have.  We maintain that it's at least 3.
  Our own numbers at this point are still rather preliminary, but
  we've gotten several anecdotal reports lately of real-world
  programs that a two-generation scheme falls down on.  (That is,
  they have too much data with intermediate lifetimes that force
  full gc's too often.)  Having more generations can have drawbacks
  too, but they seem less severe, and can usually be handled with
  a teeny bit of opportunistic scheduling.  (Note: your mileage
  *will* vary.  A two-generation scheme seems to be absolutely
  fine for most people's programs.  It's just some troublemakers
  out there... :-).


  Also, our current view is that a mixed strategy is probably best
  for tracking intergenerational pointers.  For most programs,
  Ungar's remembered set scheme incurs less overhead than our
  "card marking," but is occasionally bad for large, long-lived objects.
  (E.g., you may get a 4-megabyte array in your remembered set and
  scan it entirely at every scavenge!).  Probably the way to go
  is to segregate large objects into a separate "large object
  area" as Tektronix Smalltalk does, and to use card marking in
  this area (only) to avoid scanning the whole objects.
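
  For concreteness, a toy model of the card-marking side of this (the
  flat vector standing in for the heap, and all the sizes, are made up;
  a real collector does this below the language level):

    (defparameter *card-size* 256)
    (defparameter *heap* (make-array 4096 :initial-element nil))
    (defparameter *card-table*
      (make-array (ceiling (length *heap*) *card-size*)
                  :element-type 'bit :initial-element 0))

    (defun store (index value)
      ;; Write into the "heap" and mark the containing card dirty.
      (setf (aref *heap* index) value)
      (setf (aref *card-table* (floor index *card-size*)) 1))

    (defun dirty-cards ()
      ;; The scavenger scans only these cards -- not, say, an entire
      ;; 4-megabyte array that happens to contain one new pointer.
      (let ((dirty '()))
        (dotimes (i (length *card-table*) (nreverse dirty))
          (when (= 1 (aref *card-table* i))
            (push i dirty)))))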

      -- Paul 


Paul R. "Garbage Man" Wilson                         
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@carcoar.stanford.edu
Box 4348   Chicago,IL 60680 

gintera@cpsc.ucalgary.ca (Andrew Ginter) (12/06/89)

In article <5117@tekcrl.LABS.TEK.COM>, Ken Dickey writes:
> > A real time (parallel or incremental) garbage collector is roughly
> > twice as costly as a comparable stop-and-collect collector ...
>
> This is not the case if you use your VM hardware.  See "Real-time
> Concurrent Collection on Stock Multiprocessors" by Appel, Ellis, &
> Li; Princeton U. tech. report: CS-TR-133-88.

Appel, Ellis & Li's technique reduces the cost of checking for
forwarding pointers.  It does nothing to address the performance
penalty associated with the increased frequency of garbage collection
in a real time or incremental system (Philip L. Wadler, CACM, Sept/76).
Wadler concludes that parallel collectors consume twice the resources
of serial collectors, almost all of the time, even when using algorithms
without any forwarding pointers.

Andrew Ginter, 403-282-2984, gintera@CPSC.UCALGARY.CA, Ginter@UNCAMULT.BITNET

wilson@carcoar.Stanford.EDU (Paul Wilson) (12/07/89)

In article <2203@cs-spool.calgary.UUCP> gintera@cpsc.ucalgary.ca (Andrew Ginter) writes:
>In article <5117@tekcrl.LABS.TEK.COM>, Ken Dickey writes:
>> > A real time (parallel or incremental) garbage collector is roughly
>> > twice as costly as a comparable stop-and-collect collector ...
>>
>> This is not the case if you use your VM hardware.  See "Real-time
>> Concurrent Collection on Stock Multiprocessors" by Appel, Ellis, &
>> Li; Princeton U. tech. report: CS-TR-133-88.
>
>Appel, Ellis & Li's technique reduces the cost of checking for
>forwarding pointers.  It does nothing to address the performance
>penalty associated with the increased frequency of garbage collection
>in a real time or incremental system (Philip L. Wadler, CACM, Sept/76).
>Wadler concludes that parallel collectors consume twice the resources
>of serial collectors, almost all of the time, even when using algorithms
>without any forwarding pointers.

  Maybe not.  You need more memory, sure, but it doesn't have to be real
  memory.  The locality characteristics of garbage-collected systems
  are pretty strange, and you can use an incremental garbage collector
  to *improve* locality dynamically.  (See Jon L. White's paper on
  the basic idea in the 1980 Lisp conference proceedings, and Bob
  Courts' incorporation of this principle into a generational gc 
  [CACM, Sept. 88].)

  I'm not saying incremental gc's are cheap, but I'm pretty sure
  nobody's done all of the relevant experiments.


Paul R. Wilson                         
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@carcoar.stanford.edu
Box 4348   Chicago,IL 60680