[comp.lang.clos] destructors in CLOS?

creon@nas.nasa.gov (Creon C. Levit) (12/19/90)

I would like to define a method that gets called when a CLOS instance
gets garbage collected, sort of like a destructor method in c++.

Can I do this in CLOS?  How?



--
Creon Levit

mail stop T045-1
NASA Ames Research Center
Moffett Field, California 94035

(415)-604-4403

creon@nas.nasa.gov  (Internet)

gumby@Cygnus.COM (David Vinayak Wallace) (12/20/90)

   Date: 18 Dec 90 20:17:09 GMT
   From: creon@nas.nasa.gov (Creon C. Levit)

   I would like to define a method that gets called when a CLOS instance
   gets garbage collected, sort of like a destructor method in c++.

Creon,

What do you want to happen?  For instance, an object may never BE
collected, since the GC may never run.

g

barmar@think.com (Barry Margolin) (12/20/90)

In article <GUMBY.90Dec19092440@Cygnus.COM> gumby@Cygnus.COM (David Vinayak Wallace) writes:
>   From: creon@nas.nasa.gov (Creon C. Levit)
>
>   I would like to define a method that gets called when a CLOS instance
>   gets garbage collected, sort of like a destructor method in c++.

>What do you want to happen?  For instance, an object may never BE
>collected, since the GC may never run.

That's a good point.  I expect that an example of what he's doing is
objects containing references to streams.  If the only reference to a
stream is in a particular instance, and the instance gets GCed, you would
like to know to close the stream.

The design of many modern GCs doesn't make such a feature easy to
implement.  Many systems use some form of copying GC, so the only objects
it sees are the live ones; the garbage gets left in oldspace, which is
then reclaimed in one fell swoop.  Searching through oldspace for all the
garbage instances would significantly slow down the GC; this would end up
as a hybrid of copying and mark-sweep (copy-and-sweep?).
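
To make the cost argument concrete, here is a toy sketch in Python (not any real Lisp collector) of a copying pass. The heap layout, the names, and the `registered` list are all assumptions for illustration. The pass only ever visits objects reachable from the roots, so finding dead instances means either sweeping oldspace or, as assumed here, checking a separate list of objects that asked for finalization.

```python
def copy_collect(heap, roots):
    """Toy copying pass: return the set of names reachable from roots.

    heap maps an object name to the list of names it references.
    Only reachable objects are visited; garbage is never touched.
    """
    tospace = []   # objects "copied" so far, in copy order
    seen = set()
    for name in roots:
        if name not in seen:
            seen.add(name)
            tospace.append(name)
    scan = 0       # Cheney-style scan pointer over the copied objects
    while scan < len(tospace):
        for child in heap[tospace[scan]]:
            if child not in seen:
                seen.add(child)
                tospace.append(child)
        scan += 1
    return seen


# Hypothetical heap: "holder" owns "stream", but nothing reaches "holder".
heap = {
    "a": ["b"],
    "b": ["a"],
    "holder": ["stream"],
    "stream": [],
}
registered = ["holder", "a"]   # assumed list of objects wanting finalization

live = copy_collect(heap, roots=["a"])
# Cost here is proportional to the registry, not to the size of oldspace.
to_finalize = [name for name in registered if name not in live]
```

In this sketch the collector never looks at "holder" or "stream" at all; only the check against `registered` reveals that "holder" died.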

In my experience, most objects that need this kind of special treatment are
amenable to solutions using explicit destruction (e.g. the CLOSE function)
and/or automatic destruction when leaving a dynamic environment (e.g.  the
WITH-OPEN-FILE macro).  Not surprisingly, these are the same operations
that invoke C++ destructors: using the "delete" operator on a pointer
returned by "new", and leaving a block that declared an automatic class
object.  In addition, Lisp generally allows you to invoke an explicit
destructor on an object allocated by a dynamic macro, which I'm not sure is
valid in C++.
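
As a rough illustration of those two styles, here is a Python sketch (Python's `with` statement plays the role that WITH-OPEN-FILE plays in Lisp). The `LoggedStream` class is hypothetical. It shows explicit destruction through `close`, automatic destruction on leaving the block, and an explicit close inside the block that the automatic one tolerates.

```python
import io


class LoggedStream:
    """Hypothetical wrapper that owns a stream and must close it."""

    def __init__(self, stream):
        self.stream = stream
        self.closed = False

    def close(self):
        # Explicit destruction, analogous to calling CLOSE.
        # Idempotent, so a second call is harmless.
        if not self.closed:
            self.stream.close()
            self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs when the block is left, normally or through an exception.
        self.close()
        return False  # do not swallow exceptions


with LoggedStream(io.StringIO("data")) as s:
    first = s.stream.read()
    s.close()   # explicit close inside the dynamic extent
# Leaving the block calls close() again, which does nothing this time.
```

Making `close` idempotent is the design choice that lets the explicit and the automatic paths coexist.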

--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

jonl@LUCID.COM (Jon L White) (12/20/90)

re: I would like to define a method that gets called when a CLOS instance
    gets garbage collected, sort of like a destructor method in c++.

    Can I do this in CLOS?  How?


This question has been asked several times in the past year, the most
recent two instances of which are Eero Simoncelli's query to this very
list last summer, and Roman Cunis's question raised at the CLOS workshop
during ECOOP/OOPSLA '90 last October.

Here is Eero's query, and the only reply to it that I ever saw:

    Date: Wed, 18 Jul 90 16:46:49 PDT
    From: eero@newton.arc.nasa.gov (Eero Simoncelli)
    To: CommonLoops.PA@Xerox.COM
    Subject: Any plans for a finalize-instance method?

    Are there any plans to incorporate a "finalize-instance" method
    (complementing the initialize-instance method) in the CLOS spec?  This
    method would be called after the instance was orphaned (i.e.  all
    references destroyed), but before it was garbage-collected.
    Obviously, this requires an implementation-dependent hook into the
    garbage-collection process.  This would allow cleanup of allocated
    foreign or system structures, static data resources, etc.

    Eero.

    Date: Wed, 18 Jul 90 20:01:39 PDT
    From: smh@franz.com (Steve Haflich)
    To: eero@newton.arc.nasa.gov
    . . .
    Franz's 4.0 release will provide object finalization, as do some other
    implementations, but I doubt X3J13 would agree to require it in the
    language spec on two grounds: First, . . .


At the CLOS workshop, I mentioned that Lucid has developed a finalization
technique independent of CLOS and that it would likely be in the next
release (I have had experimental versions running for some time now, but
I'm not authorized to make product announcements; the current released
product, Lucid 4.0, reached design closure last Spring, and doesn't have
this code in it.)  Chris Richardson, from Franz Inc., also mentioned that
Franz had been working on a finalization technique that would be released
in their upcoming 4.0 version, which he said would be due out in January
or February 1991.

However, you need to consider what "finalization" is likely to mean in
implementations that are not doing true "garbage" collection.  At the
very least, it will not likely be as simple as merely placing a method on
FINALIZE-INSTANCE -- probably it will require some individual registration
of the objects to be finalized, which could be done on a per-class basis
by an initialization(!) method.   In practice, it may feel as simple as a
defmethod, but there are a couple of differences that can be observed by
judicious test cases.  This is why I said that "finalization" will be
independent of CLOS.  But of course CLOS objects will be finalizable just
like any other stored-memory object will be; Steve Haflich, in his reply
to Eero above, also stressed this point.
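
A sketch of what such registration can look like, written in Python because its standard `weakref.finalize` happens to have this shape; it is not the Lucid or Franz interface, which the thread does not show. The `Window` and `Point` classes and the `release` function are hypothetical. Registration is per object and is done by the initializer, and the finalization facility itself knows nothing about the class system.

```python
import gc
import weakref

log = []  # records which resources have been released


def release(resource_name):
    # The finalizer receives only what it needs. It must not reference
    # the instance, or the instance could never become unreachable.
    log.append(resource_name)


class Window:
    """Hypothetical class owning an external resource."""

    def __init__(self, name):
        self.name = name
        # Individual registration, performed by the initializer.
        self._finalizer = weakref.finalize(self, release, name)


class Point:
    """Plain class: nothing registered, so nothing is finalized."""

    def __init__(self, x):
        self.x = x


w = Window("win-1")
p = Point(3)
del w, p
gc.collect()  # encourage collection; timing is implementation-dependent
```

One observable difference from a plain method: an instance created without running the initializer would never be registered, and so would never be finalized.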

Also, you should be aware of the slight difference between "destructor"
and "finalization"; the former is the code that actually "disassembles"
some no-longer useful object, whereas the latter is the process of
identifying which objects the "Garbage Collection" process, or the
"Garbage Abandonment" process, should tickle when such an object
is discovered to be abandoned.  In the current C++ world, there is no
automatic memory management, so the end-user code itself must invoke
the destructor.  Furthermore, the "destructor" code might typically try
to recycle the space consumed by the object being "destroyed" (such as
by placing it back in a pool of available such objects); but in Lisp,
of course, you typically wouldn't try to thread the piece of garbage
back into a live pool, but would rather let the next cycle of GC pick
it up (on the other hand, you might do it anyhow!)

It is worth noting that some of the participants in the Garbage
Collection Workshop at ECOOP/OOPSLA '90 (myself included) are currently
actively discussing how garbage collection *might* be added to C++.
There is definitely some very serious concern by the more C-oriented
folks in that group that "finalization" (as opposed to destructors) can
never be a precise process.  First, in the worst case scenario, the
C++ world might not be able to tolerate "complete GC" -- that is, they
may be reduced to coexisting with so-called "conservative" garbage
collectors, whose contract is merely that they will (1) try hard to
collect some garbage, (2) and never collect any living structures,
(3) but leave open some reasonable paths in which particular pieces of
garbage will never be collected, even though the user's program has
dropped all pointers to them.  So it is possible that the requirement
for precise finalization may rule out some otherwise important GC
techniques.

This concern about the ability to prove that something is in fact
garbage can even spill over into the Lisp world.  For example, an
optimizing Lisp compiler will probably use temporary registers in
such a way that a pointer may appear to be "alive" even after the
user's code has just smashed out the last copy of it in his own
data structures.  This seems to be a fact of "life in the big city",
both in the Lisp world and in the C world.  It may not be possible
to reclaim a dead object until you exit from the last function
frame that was referencing it.
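
The effect can be imitated in Python, with the caveat that this is only an analogy: the lingering reference below is an explicit local variable rather than a compiler temporary, and the ordering shown assumes CPython's reference counting. The `Node` class, `table`, and `worker` are hypothetical names.

```python
import weakref

events = []


class Node:
    pass


table = {"k": Node()}
weakref.finalize(table["k"], events.append, "finalized")


def worker():
    tmp = table["k"]   # a reference held only by this function frame
    del table["k"]     # the user's data structure no longer points at it
    events.append("dropped from table")
    return None        # tmp goes away only when the frame exits


worker()
```

The object was gone from the user's own data structure before "dropped from table" was recorded, yet its finalizer ran only after `worker` returned.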

Finally, the question arises of just how urgent the demand to
"finalize" a given object must be.  How many seconds after dropping
the last pointer to the object are you willing to wait until the
destructor code is executed?  In Interlisp-D (and presumably in the
Common Lisp product that came out of it in later years, now marketed
by a company called Venue), this was essentially not a problem, because
it used a reference-counting GC; and usually within a very short time
of the reference count being decremented to zero, there would be an
incremental collection that would run the associated destructor code.
That is, you wouldn't have to wait until consing activity crossed
some threshold and triggered a GC, or until you explicitly called
GC from user code; the reference-counting and incremental triggering
was sufficient.
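
CPython also counts references, so it can stand in for a small demonstration of this promptness. This is an illustration only, not a model of Interlisp-D. The `Res` class is hypothetical, and the second half shows a case that is an assumption beyond the post: an object that refers to itself never reaches a count of zero, so it has to wait for an explicit collection.

```python
import gc
import weakref

order = []


class Res:
    pass


gc.disable()   # keep the cycle collector from running on its own

a = Res()
weakref.finalize(a, order.append, "acyclic")
a = None       # count drops to zero: the finalizer runs right here
order.append("after dropping a")

b = Res()
b.me = b       # self-reference: the count never reaches zero
weakref.finalize(b, order.append, "cyclic")
b = None
order.append("after dropping b")   # finalizer has not run yet

gc.collect()   # an explicit tracing pass finds the cycle
gc.enable()
```

The recorded order shows the first finalizer running before the next statement, and the second one running only once the collector was called.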

[Yea, Interlisp-D had a form of finalization back in the early 1980's,
and I even put it into PDP10 MacLisp in the early 1970's for just the
reasons that Eero mentioned above.  But what neither system offered
was a mechanism for the end-user to hook his own datatypes up for
finalization.  These earlier versions were specifically hooked only
to certain system datatypes that connected to "system" data-structures.
Oh yes, I recall the lacuna in the earlier versions of Interlisp-D
where the incremental GC didn't scavenge everything available; we
would have to do something like (RPTQ 20 (GC)) in order to assure
that the last bit of garbage got "licked up".  But I'm fairly certain
that that gap was later plugged.]


-- JonL --

creon@nas.nasa.gov (Creon C. Levit) (12/21/90)

Here is one of the responses to my query, mailed directly to me and
not to comp.lang.clos, that may be of interest to the group.  I also
mention the application that some others and I are working on, and how I
feel it could benefit from destructor methods, or "finalize-instance"
methods.

Arun Welch wrote:

   Right now, no, at least not portably. At this point you can't even
   call a function on any object before it gets GC'd, let alone
   instances. Franz has proposed finalize-instance to handle such a
   case, and JonL and I talked about it at AAAI, but we had a hard
   time coming up with any examples for needing it that weren't
   contrived, so if you've got an honest reason for it, we'd love to
   hear what it is. In general, we were worried about the extra work
   that the GC would have to do, and what sort of performance hit it
   would incur. 


An example, not contrived, though perhaps implementable well enough using the
techniques barmar mentioned, involves using CLOS on a Silicon Graphics
IRIS (380 VGX).  The operating system on the IRIS is a version of UNIX
called IRIX that has several features important for scientific
visualization - our problem domain.  The features of relevance are:

1. Mapped files - there is a system call which maps files into your
address space.  No need to read the data - it gets paged in when you
reference it (a speed win).  There is also no need to malloc space for
it, and it does not contribute to the size of your swappable image (a
space win).  Standard stuff, I know.  We make extensive use of mapped
files when visualizing large (read-only) data sets.

You are supposed to call munmap() when you are done with the mapped file,
since the system can only support a finite number of these mapped
files at any time.  

2. Graphics library windows.  These should automatically close soon after
they become unreferenced, returning their descriptors to the system
for reuse, and freeing valuable screen space.

3. Yes, Streams.

Any instance involving 1, 2, or 3 seems a good candidate for a
finalize-instance method.
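
For the mapped-file case, a minimal sketch of the idea in Python, using its standard `mmap` and `weakref` modules in place of CLOS and the IRIX calls. The `MappedDataset` class and the temporary file are hypothetical. The instance offers an explicit `close`, and a finalizer acts as a safety net that unmaps the file if the instance is simply abandoned. The prompt unmapping on `del` assumes CPython.

```python
import mmap
import os
import tempfile
import weakref


class MappedDataset:
    """Hypothetical read-only data set backed by a memory-mapped file."""

    def __init__(self, path):
        with open(path, "rb") as f:
            # The mapping stays valid after the file object is closed.
            self._map = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        # Safety net: unmap if the instance is abandoned without close().
        # The callback is bound to the mapping, not to this instance.
        self._finalizer = weakref.finalize(self, self._map.close)

    def __getitem__(self, index):
        return self._map[index]

    def close(self):
        # Explicit release; calling the finalizer also marks it as done.
        self._finalizer()


# Build a small file to map (stand-in for a large data set).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"grid data")
    path = tmp.name

ds = MappedDataset(path)
header = ds[0:4]
m = ds._map   # keep a handle only to observe the mapping afterwards
del ds        # abandoned without close(); the finalizer unmaps the file
os.remove(path)
```

Code that knows when it is done should still call `close` itself; the finalizer only covers the instances that slip through.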



PS - Is anyone extending CLOS for multiprocessing?  The IRIS has
eight processors and OS support for multiple threads w/shared memory.
It would be wonderful to get at it in a standard way.


 

--
Creon Levit

mail stop T045-1
NASA Ames Research Center
Moffett Field, California 94035

(415)-604-4403

creon@nas.nasa.gov  (Internet)

jonl@LUCID.COM (Jon L White) (12/25/90)

re: This registration is easy to do with make-instance/initialize-instance,
    but any such registration procedure has the problem that
    registration/deregistration pairs are an unending source of errors in
    C programs that use malloc/free.   The problem is similar. The problem
    with  a destructor is that it is unclear when a "standard"
    destructor/finalizer should operate.

One of the reasons I sent such a lengthy reply to the original message
was to alert readers to the danger of thinking that "finalization" and
"destructoring" were one and the same.  Indeed, as you point out, a major
source of insidious errors in C programs can be the premature destruction
of non-"dead" data.   However, the basic problem is NOT the pairing of
"registration/deregistration" actions [your term], but rather the inability
of the C programmer ever to be assured that some data structure beyond his
immediate control isn't referencing the "dead-meat" datum.  The beauty of GC
is that "pairing" can be maintained accurately and automatically.


re: Are explicit make/destroy pairs (with all
    their problems) the best we can do in general?

No.  Explicit, paired make/destroy actions can be done in any primitive
language (even FORTRAN?).  But languages with pre-existing automated memory
management -- such as the Lisp family, Smalltalk, APL, and a handful of
others -- can *offer* one bit of information to the destructor client that
wouldn't otherwise easily be obtainable, namely that no other program data
structure references this object.  That is the essence of "finalization."



-- JonL --