[comp.lang.lisp] Memory Management in Lisp?

srt@aero.org (Scott "TCB" Turner) (02/16/91)

In building large systems in Lisp, I am consistently torn between (1)
making free use of Lisp's memory allocation and automatic garbage
collection to make my life as a developer easy and (2) handling 
garbage explicitly in hopes of creating a system that doesn't spend
all its effort thrashing garbage.

In T, I was aided somewhat in this conflict by weak reference ("pools")
which I could use to recycle objects freely, knowing that they could
still be GCed if they had no other reference.  Common Lisp lacks weak
reference, so this is no longer possible.

Will weak reference be added to Common Lisp?  Or even better, will some
kind of user memory management be added?

						-- Scott Turner

barmar@think.com (Barry Margolin) (02/16/91)

In article <1991Feb15.191259.20090@aero.org> srt@aero.org (Scott "TCB" Turner) writes:
>Will weak reference be added to Common Lisp?  

Not in the current round.  Only a few Common Lisp implementations currently
have such mechanisms, and there isn't yet a consensus on the right
interface to them.  Maybe a future, follow-on standard (Common Lisp 2000?)
will have them, because many of us acknowledge that they are useful.

>					       Or even better, will some
>kind of user memory management be added?

There have been no proposals made to X3J13 for any such mechanism, and it's
too late now for new features.  I'm not even sure how good a portable
memory management mechanism could be in a Lisp environment.  Could you
describe the kind of interface you're hoping for?  Something reasonably
simple like resources (pools of similar objects, useful when you're
repeatedly allocating and deallocating objects, as you reuse objects rather
than letting them become garbage), or something more complicated like areas
(sections of memory with different garbage collection parameters)?
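
For concreteness, a portable resource facility might look roughly like
this (an untested sketch; MAKE-RESOURCE, ALLOCATE-RESOURCE, and
FREE-RESOURCE are names I just made up, not part of any standard):

(defstruct (resource (:constructor make-resource (constructor)))
  constructor           ; function of no arguments making a fresh object
  (free-objects '()))   ; objects handed back, available for reuse

(defun allocate-resource (resource)
  ;; Reuse a freed object if one is available, else make a new one.
  (if (resource-free-objects resource)
      (pop (resource-free-objects resource))
      (funcall (resource-constructor resource))))

(defun free-resource (resource object)
  ;; Hand OBJECT back for reuse instead of letting it become garbage.
  (push object (resource-free-objects resource))
  object)

E.g., (make-resource #'(lambda () (make-string 1024))) gives you a pool
of reusable buffers that get recycled instead of consed afresh each time.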

Note that Common Lisp currently specifies virtually nothing about memory
management.  CLtL doesn't use the term "garbage collection" at all, and
CLtL2 only mentions it in passing (in the description of the DYNAMIC-EXTENT
declaration).  An implementation without a garbage collector would conform
to the spec; in fact, for several years most MIT Lisp Machine users
disabled the garbage collector because the original implementation was very
buggy -- they simply rebooted when they ran out of memory.

--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

srt@aero.org (Scott "TCB" Turner) (02/19/91)

(Barry Margolin) writes:
>I'm not even sure how good a portable memory management mechanism
>could be in a Lisp environment.  Could you describe the kind of
>interface you're hoping for?

I haven't really given it much thought.  But it has always seemed
strange to me that Lisp in general gives little control over memory
management to the user, when it is so obviously critical to system
performance.

I've been thinking about the subject since my original post, and one
facility I think might be interesting is a garbage collection advisor.
Currently gc collects an object if it has no references.  The idea
here is to let gc collect an object if the (user-supplied) advisor
predicate returns true.
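
For concreteness, the interface might be something like this (every
name here is invented, of course):

;; The GC would apply the advisor to each object it would otherwise
;; retain; a true result means "collect it anyway".
(set-gc-advisor
  #'(lambda (object)
      (and (typep object 'cached-result)      ; made-up type
           (cheap-to-recompute-p object))))   ; made-up predicate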

						-- Scott Turner

barmar@think.com (Barry Margolin) (02/19/91)

In article <1991Feb18.191549.7575@aero.org> srt@aero.org (Scott "TCB" Turner) writes:
>I haven't really given it much thought.  But it has always seemed
>strange to me that Lisp in general gives little control over memory
>management to the user, when it is so obviously critical to system
>performance.

Traditionally, one of the most important differences between Lisp and most
other languages has been the fact that memory management is automatic.
Even EVAL isn't as fundamental to Lisp as I once thought, since Scheme
doesn't have it (although many *implementations* of Scheme add it as an
extension), but it does have automatic memory management.

>I've been thinking about the subject since my original post, and one
>facility I think might be interesting is a garbage collection advisor.
>Currently gc collects an object if it has no references.  The idea
>here is to let gc collect an object if the (user-supplied) advisor
>predicate returns true.

An interesting idea, but potentially very dangerous.  An object may have
references that the application isn't aware of.  For instance, the user
could have assigned an object to a global variable while in the debugger.
Or in a Dynamic Windows or CLIM environment, just printing out the object
creates a new reference to the object from the window's output history
structure.

A similar caveat could be made about objects allocated on the stack (as
ANSI CL will permit, using the DYNAMIC-EXTENT declaration).  In that
case, though, the routine that saves away the value could check whether it
is stack-allocated, and move it to the heap first (that's what Dynamic
Windows does).  In your example, however, the GC advisor runs *after* the
reference has been made.
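
(For those who haven't seen the declaration, a trivial sketch -- PROCESS
stands in for code that must not retain the list:)

(defun frob (x y)
  (let ((temp (list x y)))
    (declare (dynamic-extent temp))   ; TEMP may be stack-allocated
    ;; PROCESS must not save TEMP anywhere that outlives this call --
    ;; e.g., not in a window's output history.
    (process temp)))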
--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

bill@ibmpcug.co.uk (Bill Birch) (02/20/91)

> An interesting idea, but potentially very dangerous.  An object may have
> references that the application isn't aware of.  For instance, the user
> could have assigned an object to a global variable while in the debugger.
> Or in a Dynamic Windows or CLIM environment, just printing out the object
> creates a new reference to the object from the window's output history
> structure.
> 
This is a function of the way garbage collection is arranged.  A mark-
sweep system has problems with the above.  However, in a reference-
counting LISP system, the total number of references to any object is
known by definition.  Thus it is possible to allow the application
to garbage collect explicitly.

For example, I have an interpreter
that allows the programmer to read the reference count of an object.
Also, forcing collection of an unused tree hanging off a symbol can
be done simply by: (SETQ FOOBAR NIL)

Since the reference count on
the contents of FOOBAR will have dropped to zero, the memory
is automatically reclaimed.
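
To illustrate (REFCOUNT is what my interpreter happens to call the
primitive for reading the count; other systems would differ):

(SETQ FOOBAR (LIST 1 2 3))
(REFCOUNT FOOBAR)       ; => 1, only FOOBAR holds the list
(SETQ BAR FOOBAR)
(REFCOUNT FOOBAR)       ; => 2
(SETQ FOOBAR NIL)       ; count drops to 1; nothing reclaimed yet
(SETQ BAR NIL)          ; count drops to 0; cells reclaimed at once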

Bill
-- 
Automatic Disclaimer:
The views expressed above are those of the author alone and may not
represent the views of the IBM PC User Group.
-- 

barmar@think.com (Barry Margolin) (02/21/91)

In article <1991Feb20.150947.2039@ibmpcug.co.uk> bill@ibmpcug.co.uk (Bill Birch) writes:
>> An interesting idea, but potentially very dangerous.  An object may have
>> references that the application isn't aware of.
>This is a function of the way garbage collection is arranged. A mark-
>sweep system has problems with the above.However, in a reference-
>counting LISP system, the total number of references to any object is
>known by definition. Thus it is possible to allow the application
>to garbage collect explicitly. 

If the application checks the reference counts, it is duplicating the
function of the regular garbage collector, so what's the benefit?  I was
responding to a proposal for a mechanism where the application could say to
the system, "ignore whatever you think the state of this object is -- I
know a priori that it is garbage."  In a reference count system, I assume
this would be a function that sets the reference count of the object to
zero, and decrements the reference counts of the objects that it
references.
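
In pseudo-Lisp such a function might look like this (REFCOUNT,
REFERENCED-OBJECTS, and RECLAIM-STORAGE are all invented names):

(defun force-free (object)
  ;; Pretend OBJECT is garbage regardless of its actual count.
  (dolist (child (referenced-objects object))
    (decf (refcount child))
    (when (zerop (refcount child))
      (force-free child)))        ; unreferenced children go too
  (setf (refcount object) 0)
  (reclaim-storage object))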

I suppose the application could check whether the object's reference count
is higher than it expects, and assume that this indicates that there are
"extra" references, in which case it shouldn't forcibly GC it.  But I
believe that an application that knows the expected reference counts
exactly probably also knows where the references come from, so it would be
just as easy to set those references to NIL, which will cause the reference
counts to decrement.
--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

jeff@aiai.ed.ac.uk (Jeff Dalton) (02/25/91)

In article <PCG.91Feb23162955@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:

>This is a very untraditional concern. It used to be fairly true, as one
>quote puts it, that "Lisp programmers know the value of everything but
>the cost of nothing" :-).

That must be why they never bothered to write compilers :-).

>barmar> Traditionally, one of the most important differences between
>barmar> Lisp and most other languages has been the fact that memory
>barmar> management is automatic.
>
>Unfortunately this has traditionally encouraged sloppiness.  For example
>many functions, where even without this simple technique garbage
>generation should be zero, are sloppily coded. Part of it may be desire
>to avoid using 'set' or 'rplac' variants, part of it is just
>carelessness.

Well, of course many functions are sloppily coded.  This is a problem
in every programming language.  But I don't count a lack of attention
to object lifetimes as part of the sloppiness except in certain cases.
More often automatic storage management makes it easier to write code
that is clearer and in a sense more abstract.

>This is simply because Lisp programmers have always been offered,
>because they were doing "research", the largest machines around and
>resource consumption has never been a problem, until they started having
>to do delivery. 

I think this is true to a large extent, although resource consumption
_has_ often been recognized as a problem.

>                 But unfortunately just like the nerds in Unixland don't
>know anything about locality of reference, the hackers in Lispland don't
>even think about garbage generation rates. I am sure that you, an old
>Multics fan, can weep with me for hours about both attitudes.

A number of Lisp hackers _did_ think about garbage generation rates
when it was appropriate to do so.

>barmar> Even EVAL isn't as fundamental to Lisp as I once thought, since
>barmar> Scheme doesn't have it [ ... ]
>
>The entire Scheme report is really about defining the semantics of
>'eval'. 

I hope you're not supposing that Barmar didn't know this.

>It does not have a way to invoke recursively the 'eval', but
>that is probably best left as an implementation issue.

There are programs that can use it, and they cannot be written in
portable Scheme.  There are good reasons for leaving EVAL out of
Scheme, but they are not so good that all Lisps should do the same.
I don't see how any of this makes it an implementation issue.

>Also, Scheme is not a Lisp -- it is an Algorithmic Language :-).

Single inheritance thinking rides again.  [Common Lisp is a Lisp but
supposedly not an "Algorithmic Language".  Scheme is an Algorithmic
Language.  Since Algorithmic Language is not a subcategory of Lisp,
and Lisp is not a subcategory of Algorithmic Language (because of
such Lisps as Common Lisp), it must be that Scheme is not a Lisp.]

-- jd

jinx@zurich.ai.mit.edu (Guillermo J. Rozas) (02/27/91)

    To rely on garbage collection, which is a worst case mechanism to use to
    reclaim storage when nothing is known a priori about value lifetime, for
    known lifetime cases is clearly wasteful. An analogy is using dynamic
    type dispatching when the type of a value is well known by design. Maybe
    we need sophisticated, checkable lifetime declarations (better than
    DYNAMIC-EXTENT), not just sophisticated, checkable, type declarations.


Hmm.  So you would argue for programming with overlays rather than
using a virtual memory system?  Or having programmers think about disk
sectors, head motion, and queueing operations on the disk rather than
using the OS-supplied facilities that hide such details.

Of course, there are applications where performance is so critical
that you cannot use virtual memory, or ignore details of disk access,
etc., but the vast majority can.  Programmer productivity is greatly
enhanced and the likelihood of bugs is greatly decreased by using the
system-supplied facilities, albeit at a cost in performance.

I think automatic storage management has the same characteristics as
these OS services, and should be avoided only under duress.  That does
not mean, however, that other facilities should not be made available
as well.

alderson@leland.stanford.edu (Rich Alderson) (03/01/91)

In article <1991Feb21.102925.25226@Think.COM>, barmar@think (Barry Margolin) writes:
>If the application checks the reference counts, it is duplicating the function
>of the regular garbage collector, so what's the benefit?  I was responding to
>a proposal for a mechanism where the application could say to the system,
>"ignore whatever you think the state of this object is -- I know a priori that
>it is garbage."  In a reference count system, I assume this would be a
>function that sets the reference count of the object to zero, and decrements
>the reference counts of the objects that it references.

Interesting.  I read the suggestion the other way:  That GC might think an
object was garbage, but should leave the object alone unless the advisor said
GC could reap it.  That was in tune with the idea of explicitly re-using
objects rather than letting them be GCed.

Rich Alderson
alderson@leland.stanford.edu

oz@yunexus.yorku.ca (Ozan Yigit) (03/02/91)

In article <JINX.91Feb26214634@chamarti.ai.mit.edu> jinx@zurich.ai.mit.edu writes:
     [quoting Piercarlo Grandi]
>
>    To rely on garbage collection, which is a worst case mechanism to use to
>    reclaim storage when nothing is known a priori about value lifetime, for
>    known lifetime cases is clearly wasteful. An analogy is using dynamic
>    type dispatching when the type of a value is well known by design. Maybe
>    we need sophisticated, checkable lifetime declarations (better than
>    DYNAMIC-EXTENT), not just sophisticated, checkable, type declarations.
>
>
>Hmm.  So you would argue for programming with overlays rather than
>using a virtual memory system?  Or having programmers think about disk
>sectors, head motion, and queueing operations on the disk rather than
>using the OS-supplied facilities that hide such details.

This must be the usual gang-up-on-Piercarlo week ;-)

I think your interpretation makes Piercarlo's views appear much more
radical than they actually are. Heck, you know the type of analysis that
goes into Scheme compilers [for example] to avoid heap allocation
[McDermott, Dybvig etc.] better than I do. Aren't memory allocation
strategies based on lifetimes [Hanson] and generational GC mechanisms in
essence using exactly the idea of exploiting some knowledge about the
extent of things? [Do weak references count as a similar mechanism?]

So, what is so offensive about his suggestions?

oz
---
In seeking the unattainable, simplicity  |  Internet: oz@nexus.yorku.ca
only gets in the way. -- Alan J. Perlis  |  Uucp: utai/utzoo!yunexus!oz

jinx@zurich.ai.mit.edu (Guillermo J. Rozas) (03/03/91)

In article <21797@yunexus.YorkU.CA> oz@yunexus.yorku.ca (Ozan Yigit) writes:

   I think your interpretation makes Piercarlo's views appear much more
   radical than they actually are.  [...]

   So, what is so offensive about his suggestions?

I certainly want Lisp systems to provide quality memory management
just like I want my OS to provide good virtual memory and disk IO.

That does not imply, however, that most (or any) users should be doing
memory management themselves any more than it implies that users
should be bypassing the virtual memory system or the system's supplied
IO facilities.

I agree that Lisp memory management needs to improve, but I doubt that
the way to do this is to throw it out the window and revert to C-style
explicit memory management.  Explicit memory management is very
error-prone, and (in my case) reduces my productivity significantly.
I am much less likely to trust a program that manages memory
explicitly than one that does not.

Having said this, I will repeat what has already been said.  It is
trivial to do ``malloc/free'' memory management in Lisp.  No one stops
you from doing it.  It is also trivial to make it extent dependent:

(define-macro (with-dynamic-pair name x y . body)
  ;; Bind NAME to a pair from the free list, evaluate BODY, then
  ;; return the pair to the free list.
  (let ((var (gensym)))
    `(let ((,name (pair/malloc ,x ,y)))
       (let ((,var (begin ,@body)))
	 (pair/free ,name)
	 ,var))))

(define pair/malloc)
(define pair/free)
		
(let ((free-list '()))
  (set! pair/malloc
	(lambda (x y)
	  (if (null? free-list)
	      (cons x y)			; free list empty: really cons
	      (let ((pair (car free-list)))	; else reuse the head pair
		(set! free-list (cdr free-list))
		(set-car! pair x)
		(set-cdr! pair y)
		pair))))
  (set! pair/free
	(lambda (pair)
	  (if (not (pair? pair))
	      (error "pair/free: Not a pair" pair)
	      (begin
		(set-cdr! pair free-list)	; link onto the free list
		(set-car! pair #f)		; drop the old car reference
		(set! free-list pair))))))

Note that if weak pairs are used to hold the free list, then this
method mixes well with garbage collection.

pcg@cs.aber.ac.uk (Piercarlo Antonio Grandi) (03/05/91)

[ apologies if this appears more than once; previous attempts at posting
may not have been successful because of a full partition ]

I had written:

pcg> To rely on garbage collection, which is a worst case mechanism to use to
pcg> reclaim storage when nothing is known a priori about value lifetime, for
pcg> known lifetime cases is clearly wasteful. An analogy is using dynamic
pcg> type dispatching when the type of a value is well known by design. Maybe
pcg> we need sophisticated, checkable lifetime declarations (better than
pcg> DYNAMIC-EXTENT), not just sophisticated, checkable, type declarations.

On 27 Feb 91 02:46:34 GMT, jinx@zurich.ai.mit.edu (Guillermo J. Rozas)
commented:

jinx> Hmm.  So you would argue for programming with overlays rather than
jinx> using a virtual memory system?

Not at all; a well-designed VM system is more flexible than overlays.

But surely I think that over-reliance on the default demand policies of
a VM subsystem is catastrophic for performance, as amply demonstrated by
any major application you can find around nowadays. When SunOS 3 became
SunOS 4 a colossal performance hit was experienced because the default
demand loading policy of VM is simply inappropriate to file access. GNU
Emacs is another major and catastrophic example.

jinx> Or having programmers think about disk sectors, head motion, and
jinx> queueing operations on the disk rather than using the OS-supplied
jinx> facilities that hide such details.

Yes, most definitely yes. As long as the latency or bandwidth of disk
and RAM are separated by some orders of magnitude, most programmers have
to structure their programs carefully around this simple fact.

I have this idea that a very large, very large part of Computer Science
is simply the consequence of dealing with this gap. Certainly most of OS
and DBMS design is centered around it. Minimizing IO operations and
memory usage is a large and important worry of any programmer's work.

The OS-supplied facilities hide the ugly details of dealing with such
facts; but the unpleasant facts remain: application growth keeps
pace with memory growth, and disks are so slow, and bite hard.

jinx> Of course, there are applications where performance is so critical
jinx> that you cannot use virtual memory, or ignore details of disk
jinx> access, etc., but the vast majority can.

Here we are really on different planets. You sound like a vendor, or an
agent of the Imperial MITI DRAM Service.

Nearly any "serious" application cannot be wholly memory resident
and cannot be run with frequent collections; nearly any serious
application writer has to be damn careful about locality of reference,
and has to be damn careful about garbage generation rates. Relying on
default replacement and reclamation policies simply doesn't cut it.

The definition of "serious" is more or less the same as "mainstream". If
one is unwilling to let Lisp/Scheme out in the mainstream then
everything is simpler. But when you start considering them, for example,
as languages in which to do databases, numerical applications, compilers,
operating systems, text processors and the like, caring about locality
of reference and garbage generation rates pays off a lot.

In short, do we want Lisp/Scheme to remain a ghetto language for
doing (even large scale!) AI demos, or do we want to use Lisp/Scheme for
programming? :-)

If the latter is desired, engineering considerations become important.

As to me, I have been considering writing an OS kernel in Scheme, and so
on. Current Scheme implementations are in the Lisp mould, unfortunately,
but I can easily imagine Scheme being usable for all the above
applications (something that is hard to imagine, even if it is feasible,
for Common Lisp), even on a PC/AT class machine.

jinx> Programmer productivity is greatly enhanced and the likelihood of
jinx> bugs is greatly decreased by using the system-supplied facilities,
jinx> albeit at a cost in performance.

I remember the old saw that says "an engineer can do for a dime what any
fool can do for a dollar". If you have no resource constraints, more
power to less prepared people, or more power to people that are more
prepared in other fields.

Admittedly as long as one has to write programs that are a few thousand
lines long and run on immense personal workstations, there are no
effective resource constraints. GNU Emacs is popular because its users
all have 4-8MB each in their workstations. XyWrite is another story...

In another newsgroup there was a remark that the exceptional success of
Xenix/286 (yes, the 16 bit version) is due to the fact that a lot of
VARs and their customers have discovered that a $1,000 286 gizmo can
support 12 terminals doing business applications simply because it is
the leanest and most efficient of its breed. Orders of magnitude away
from Lisp, but amusing to note nonetheless.

Now that Lisp images are dozens of megabytes long and people
aspire to deliver them as products running on inexpensive machines,
things are no longer that simple.  Some Lisp programmers have had, much
to their annoyance, to start reading books about IO operation
minimization written by database people. Garbage generation
considerations have had to be discovered again. A promising start has
been made towards putting Lisp programs out of their workspace ghetto
and making them modular standalone applications that can interact with
things like databases and other tools.
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@aber.ac.uk

barmar@think.com (Barry Margolin) (03/05/91)

In article <PCG.91Mar4211533@aberdb.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Antonio Grandi) writes:
>I have this idea that a very large, very large part of Computer Science
>is simply the consequence of dealing with this gap. Certainly most of OS
>and DBMS design is centered around it. Minimizing IO operations and
>memory usage is a large and important worry of any programmer's work.

And if we didn't have OSes hiding these details then a large part of
Computer Science would be programming I/O and memory management routines.
And we'd still have to worry about minimizing I/O operations, because they
are still orders of magnitude slower than in-memory operations.

If the programs don't get written, it doesn't matter how little memory they
run in.

--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

pcg@cs.aber.ac.uk (Piercarlo Antonio Grandi) (03/07/91)

On 5 Mar 91 04:03:48 GMT, barmar@think.com (Barry Margolin) said:

barmar> In article <PCG.91Mar4211533@aberdb.cs.aber.ac.uk>
barmar> pcg@cs.aber.ac.uk (Piercarlo Antonio Grandi) writes:

pcg> I have this idea that a very large, very large part of Computer Science
pcg> is simply the consequence of dealing with this gap. Certainly most of OS
pcg> and DBMS design is centered around it. Minimizing IO operations and
pcg> memory usage is a large and important worry of any programmer's work.

barmar> And if we didn't have OSes hiding these details then a large
barmar> part of Computer Science would be programming I/O and memory
barmar> management routines.

Just a moment, this is not a fair comment; I don't regret that OSes
and other software layers give programmers a higher-level view of certain
things, because what is being abstracted over is details.

In the view of some this is indeed the same thing as abstracting over
essential things like the performance profile of the available assets,
but surely not to me.

barmar> And we'd still have to worry about minimizing I/O operations,
barmar> because they are still orders of magnitude slower than in-memory
barmar> operations.

No, no. With more abstract interfaces (like VM and memory mapped files)
we can *concentrate* just on minimizing IO operations. Unless we are
Lisp programmers, that is :-).

I really don't see why so many people would die tomorrow for reducing by
10% the time complexity of their code, but are indifferent to reducing
the space or IO complexity by an order of magnitude.

If your attitude were right, then why bother with any sort algorithm
other than bubble sort? It gets the job done, and if it is too slow, add
more CPUs (actually a quadratic number of CPUs :->).
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@aber.ac.uk

barmar@think.com (Barry Margolin) (03/08/91)

In article <PCG.91Mar6215111@aberdb.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Antonio Grandi) writes:
>If your attitude were right, then why bother with any sort algorithm
>other than bubble sort? It gets the job done, and if it is too slow, add
>more CPUs (actually a quadratic number of CPUs :->).

[Look at what company I work for -- we *do* add more CPUs :->]

It's been a long time since I've had to write a sort subroutine, but I
think 90% of the ones I've written have been bubble sorts.  I depend on the
environment providing a sorting function that has reasonable performance.

Similarly, I've never written a routine that optimizes disk r/w head
motion; if the OS doesn't optimize it, it doesn't get done.

If it's easy to optimize something in my code and it doesn't obscure the
structure, I'll do it.  For instance, I use NREVERSE and NCONC when it's
clear that the old structure of the list can be reused (generally when it
is a newly-consed intermediate value).  I'll use Symbolics's STACK-LET (in
ANSI Common Lisp this will be the DYNAMIC-EXTENT declaration) for local
temporaries in an inner loop.  Other than simple stuff like this, I don't
believe in heavy source optimizing except for major bottlenecks.
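
For example (trivial and made up):

(defun collect-squares (list)
  ;; RESULT is a newly-consed intermediate value, so destructively
  ;; reversing it with NREVERSE is safe: nothing else holds its cells.
  (let ((result '()))
    (dolist (x list (nreverse result))
      (push (* x x) result))))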

--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (03/11/91)

I note that T has "pools", and that if you have "weak references" it is
easy to implement them in user code, and if you haven't, it's not _that_
hard to tell a garbage collector how to flush them.

The basic idea of a pool (from memory, so the interface is _wrong_):
	(define my-pool (create-pool ALLOCATOR))
	;; ALLOCATOR is a function.  my-pool is a "weak set" of objects.
	(pool-allocate my-pool)
	;; if my-pool is not empty, remove and return any one of the
	;; objects in it.  if my-pool is empty, call (ALLOCATOR) and
	;; return its result.
	(pool-release my-pool object)
	;; adds the object to my-pool, so that pool-allocate may return it
When a garbage collection happens, every pool is emptied, but an object in
a pool will not be reclaimed if there are any "hard" references to it.

By using pools, you can recycle objects faster than the garbage collector;
the price is that you may accidentally free something that is still in use.

I further note that Lisp Machines have had "areas" for a long time, so that
providing some kind of control over locality is not entirely out of the
question.  I suppose the real question is whether anyone succeeded in
getting useful speedups by using it.

I note that the compilers on the Burroughs B6700 made it easy for a
programmer to control which Algol procedures, Fortran program units,
or COBOL paragraphs were placed together in what segments, but I never
knew anyone to bother much, even though the concept of grouping was clear
and the means simple and clearly documented; it was rare for people to
know which program units belonged together.

Should good programmers spend a lot of time thinking about storage
allocation and placement, or should they spend their time thinking
about how to reduce the amount of storage they need so that memory
management ceases to be a bottle-neck?

> I really don't see why so many people would die tomorrow for reducing by
> 10% the time complexity of their code, but are indifferent to reducing
> the space or IO complexity by an order of magnitude.

My attitude is that reducing the amount of space used *IS* how I make
my programs run faster, but that the way to do that is not to give
myself more scope for mistakes by explicitly managing the intermediate
objects, but instead to design my algorithms so that they create as
few intermediate objects as possible in the first place.
-- 
The purpose of advertising is to destroy the freedom of the market.

srt@aero.org (Scott "TCB" Turner) (03/12/91)

(Richard A. O'Keefe) writes:
>I note that T has "pools", and that if you have "weak references" it is
>easy to implement them in user code, and if you haven't, it's not _that_
>hard to tell a garbage collector how to flush them.

The circular discussion is an infamous Usenet phenomenon.  (The "Memory 
Management" discussion started when I mentioned T's weak reference and
asked if anything similar would ever be in CL.)

Perhaps you could post some code showing how to implement weak reference
or pools in Common Lisp?

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (03/18/91)

In article <1991Mar11.173156.24976@aero.org>, srt@aero.org (Scott "TCB" Turner) writes:
> (Richard A. O'Keefe) writes:
> >I note that T has "pools", and that if you have "weak references" it is
> >easy to implement them in user code, and if you haven't, it's not _that_
> >hard to tell a garbage collector how to flush them.
> 
> The circular discussion is an infamous Usenet phenomenon.  (The "Memory 
> Management" discussion started when I mentioned T's weak reference and
> asked if anything similar would ever be in CL.)
> 
> Perhaps you could post some code showing how to implement weak reference
> or pools in Common Lisp?

Common Lisp, by design, does not include all the low level operations
one would need.  Given any Common Lisp *implementation*, I doubt that
it would be hard to do.  Have I any evidence for this claim?

It took me all of one hour to add weak references to Elk,
and a further half hour to implement pools on top of weak references.
This was the very first time I had ever added a data type to Elk.

Alas, my operations are not quite the same as the ones in T.  If anyone
is really interested, I am willing to post my two new files
	elk/lib/weak.c (143 lines, about 40 are comments)
	elk/scm/pool   (47 lines, about half of that is comments)
	[no changes are required to any existing file]

Now, Elk is unusually easy to augment.  (Was that "extension language
kit" or "extensible language kit" (:-)?)  But T doesn't need a lot of
code to implement weak references either.  The secret in Elk is a
function that looks like

	void Visit_Weak_Ref(hp, f)
	    Object *hp;
	    int (*f)();
	    {
		struct S_Weak *p = WEAK(*hp);
		p->curval = p->defalt;
		(*f)(&(p->defalt));
	    }

When the Elk garbage collector is traversing the non-garbage, user-
defined types are traversed by a user-defined function.  When you
create a new type, you specify the function.  This is the only part
of my implementation of weak references which differs significantly
from pairs.  (The printing function is different too, of course.)
-- 
Seen from an MVS perspective, UNIX and MS-DOS are hard to tell apart.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (03/19/91)

In article <4993@goanna.cs.rmit.oz.au>,
I wrote, in defence of a claim that pools were easy to add,
that it had taken me very little time to write the code to add
weak references (not weak pointers!) and pools to Elk, and I
offered to post the code.  I have received several E-mail requests,
and the code _is_ small, so here goes.  The code _has_ been tested,
and appears to work.

#!/bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #!/bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#	scm/pool
#	lib/weak.c
cat >scm/pool <<'------ EOF ------'
;;; -*-Scheme-*-
;;; 
;;; Scheme provides automatic garbage collection.  However, sometimes
;;; you know early that an object of a particular type will not be
;;; used again, so you would like to make it available for re-use.
;;; 
;;; This file provides three functions:
;;;	(make-pool allocator) => pool
;;;	(allocate pool) => object
;;;	(release pool object) => unspecified
;;; The idea is that a pool consists of a list of available objects and
;;; a function (the allocator) for allocating and initialising new ones.
;;; When you try to allocate an object from the pool, if there are any
;;; available objects it will return one of them.  If there aren't any,
;;; it will call the allocator to make a new one.
;;; When you have finished with an object, you can add it to the pool
;;; by calling release.
;;; When a garbage collection occurs, every pool is forcibly emptied.
;;; If there are other references to an object in a pool, it will
;;; survive, so this is quite safe.
;;; Using this package can save a fair bit of garbage collection.
;;; You will never get your hands on invalid pointers.  On the other
;;; hand, you had better be *sure* that you have finished with an
;;; object before putting it back in a pool.

;;; The representation of a pool is a pair
;;;	(<allocation function> . <weak reference to list of objects>)

(define (make-pool allocator)
    (cons allocator (cons-weak-ref '() '()) ))

(define (pool? object)
    (and (pair? object)
	 (procedure? (car object))
	 (weak-ref? (cdr object))
	 (null? (weak-default (cdr object)) )) )

(define (allocate pool)
    (let ((available (weak-contents (cdr pool))))
	(if (null? available) ((car pool))
	    (begin (weak-set-contents! (cdr pool) (cdr available))
		   (car available)) )))

(define (release pool object)
    (weak-set-contents! (cdr pool)
	(cons object (weak-contents (cdr pool)) )))

------ EOF ------
ls -l scm/pool
cat >lib/weak.c <<'------ EOF ------'
#include <scheme.h>

/*  weak.c defines a type "weak-reference" with operations
    (cons-weak-ref [default [initial]])
	-- if initial is omitted, it is the same as default
	-- if default is omitted, it is #F
    (weak-ref? object)
    (weak-contents weak-ref)
	-- returns the current value of the weak-ref
    (weak-default weak-ref)
	-- returns the default value of the object
    (weak-set-contents! weak-ref value)
	-- updates the current value of the object
    (weak-set-default! weak-ref value)
	-- updates the default value of the object
    A weak reference is just like a pair except that when a garbage
    collection occurs, the current value is replaced by the default
    value.  The point of this is to let you define "pools".
*/

static int T_Weak;

#define WEAK(x)   ((struct S_Weak *)POINTER(x))

struct S_Weak {
    Object defalt;
    Object curval;
};

static Object P_Weak_Cons(argc, argv)
    int argc;
    Object *argv;
    {
	Object defalt = argc < 1 ? False : argv[0];
	Object curval = argc < 2 ? defalt : argv[1];
	register char *p;
	Object h;
	GC_Node2;

	GC_Link2(defalt, curval);
	p = Get_Bytes(sizeof (struct S_Weak));
	SET(h, T_Weak, (struct S_Weak *)p);
	WEAK(h)->defalt = defalt;
	WEAK(h)->curval = curval;
	GC_Unlink;
	return h;
    }

static Object P_Weakp(x)
    Object x;
    {
	return TYPE(x) == T_Weak ? True : False;
    }

static Object P_Weak_Contents(h)
    Object h;
    {
	Check_Type(h, T_Weak);
	return WEAK(h)->curval;
    }

static Object P_Weak_Default(h)
    Object h;
    {
	Check_Type(h, T_Weak);
	return WEAK(h)->defalt;
    }

static Object P_Weak_Set_Cont(h, val)
    Object h, val;
    {
	Check_Type(h, T_Weak);
	WEAK(h)->curval = val;
	return h;
    }

static Object P_Weak_Set_Dflt(h, val)
    Object h, val;
    {
	Check_Type(h, T_Weak);
	WEAK(h)->defalt = val;
	return h;
    }


static int Weak_Eqv(a, b)
    Object a, b;
    {
	return EQ(a, b);
    }

static int Weak_Equal(a, b)
    Object a, b;
    {
	return	Equal(WEAK(a)->defalt, WEAK(b)->defalt)  &&
		Equal(WEAK(a)->curval, WEAK(b)->curval);
    }

static Weak_Print(h, port, raw, depth, length)
    Object h, port;
    int raw, depth, length;
    {
	Printf(port, "#[weak-ref %u: ", POINTER(h));
	Print_Object(WEAK(h)->defalt, port, raw, depth-1, length);
	Printf(port, "]");
    }

static Weak_Visit(hp, f)
    Object *hp;
    int (*f)();
    {
	struct S_Weak *p = WEAK(*hp);
	p->curval = p->defalt;
	(*f)(&(p->defalt));
    }

init_lib_weak()
    {
	T_Weak = Define_Type(0, "weak-ref", NOFUNC, sizeof (struct S_Weak),
			     Weak_Eqv, Weak_Equal, Weak_Print, Weak_Visit);
	Define_Primitive(P_Weak_Cons,	  "cons-weak-ref",      0, 2, VARARGS);
	Define_Primitive(P_Weakp,	  "weak-ref?",	        1, 1, EVAL);
	Define_Primitive(P_Weak_Contents, "weak-contents",      1, 1, EVAL);
	Define_Primitive(P_Weak_Default,  "weak-default",       1, 1, EVAL);
	Define_Primitive(P_Weak_Set_Cont, "weak-set-contents!", 2, 2, EVAL);
	Define_Primitive(P_Weak_Set_Dflt, "weak-set-default!",  2, 2, EVAL);
    }

------ EOF ------
ls -l lib/weak.c
exit
-- 
Seen from an MVS perspective, UNIX and MS-DOS are hard to tell apart.

net@opal.cs.tu-berlin.de (Oliver Laumann) (03/19/91)

In article <4993@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
> Now, Elk is unusually easy to augment.  (Was that "extension language
> kit" or "extensible language kit" (:-)?)

When I had to invent a name for the beast, I also considered "eel"
(extensible extension language) :-)