[comp.lang.misc] Garbage collection algorithms

kevin@argosy.UUCP (Kevin S. Van Horn) (01/20/90)

Can someone recommend a book that gives a good treatment of garbage collection
algorithms?  In particular, I am interested in finding a good garbage
collecting algorithm to be implemented on a PC (hence no virtual memory),
which must handle variable-sized cells and do compaction when needed.

------------------------------------------------------------------------------
Kevin S. Van Horn            | The means determine the ends.
kevin@argosy.maspar.com      |

wilson@carcoar.Stanford.EDU (Paul Wilson) (01/22/90)

In article <366@argosy.UUCP> kevin@argosy.maspar.com (Kevin S. Van Horn) writes:
>Can someone recommend a book that gives a good treatment of garbage collection
>algorithms? 

  I don't know of one, and I've looked.  I would be interested in what
  you find.  I'm 2/3 through writing a survey of copying gc's (esp.
  genrational ones), with lots of easy-to-understand pictures.
  But it's been on the back burner for months now and may be for months 
  more.  (If anybody needs a treatment of gc's for a book or tutorial,
  give me a call...)
 
>              In particular, I am interested in finding a good garbage
>collecting algorithm to be implemented on a PC (hence no virtual memory),
>which must handle variable-sized cells and do compaction when needed.

  You might consider a copying generational gc, with the oldest generation
  treated specially, using a sliding copy compaction algorithm.  (See
  Cohen & Nicolau's "A Comparison of Compacting Algorithms for Garbage
  Collection.")  By using a sliding compactor, you only need one space
  (rather than a pair of semispaces) in the oldest generation.  On
  the other hand, it costs you more passes over the data.  It's not
  clear to me whether this is a win in VM systems, but it probably
  is in non-VM systems.

  If you want reasonable interactive response and efficiency,
  you should probably use a generational gc -- the space
  the space cost of the youngest generation is generally small
  compared to the efficiency gain.  Andrew Appel had a paper in
  Software Practice and Experience a few months ago, showing just how 
  simple a simple generational gc can be -- it doesn't have to be a
  big deal to be a win.  If you need more efficiency, take a look at
  Ungar & Chambers' recent papers, and my OOPSLA '89 paper.

  Several gc's descended from Ungar's Generation Scavenging
  collector use normal copying most of the time, but sliding compaction
  for full collections (of a one-space oldest generation).

  At OOPSLA, Pat Caudill (of Instantiations, Inc. in Portland) told me
  about an interesting generational gc for non-VM hardware -- maybe he could
  comment on that. (You out there, Pat?)

  If space isn't horribly tight, you may be able to use a pair of
  spaces in the oldest generation without any trouble, and avoid
  needing a complicated one-space algorithm.  (You might be surprised
  how tight space isn't, with a generational gc.  Most programs in
  Lisp & Smalltalk seem to have little live data at a time, but
  create such a huge amount of short-lived data that they keep a
  non-generational gc very busy anyway.  Most Smalltalk users don't
  seem to stress their virtual memory much anyway.  There are exceptions,
  though -- hence the fancier algorithms.)

  By the way, you might get something out of Jacques Cohen's "Garbage
  Collection of Linked Data Structures," Computing Surveys 13, 3 (Sept.,
  1981).  It's getting out of date, now, though.  There's also a survey
  on generational gc's by McEntee (I think) from the TI Tech Journal,
  from a while back, but it's not very detailed.

  My gc will be available at some point, though right now it's kind of
  ugly and overcomplicated to support my research.  It's written in C
  and shouldn't be hard to port once I've streamlined it.

  Lately several groups have been working on conservative mark-sweep
  gc's that can deal with the kind of pointer ambiguities you get
  in languages like C.  They even have generational versions.  People
  doing this include Hans Boehm at Rice, the Portable Common Runtime
  group at Xerox PARC, and Joel Bartlett at DEC WRL.  (You're
  probably better off with a copying gc, though, unless you need the
  special capabilities.)

>------------------------------------------------------------------------------
>Kevin S. Van Horn            | The means determine the ends.
>kevin@argosy.maspar.com      |

  Paul R. Wilson                         
  Software Systems Laboratory               lab ph.: (312) 996-9216
  U. of Illin. at C. EECS Dept. (M/C 154)   wilson@bert.eecs.uic.edu
  Box 4348   Chicago,IL 60680 
Paul R. Wilson                         
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@carcoar.stanford.edu
Box 4348   Chicago,IL 60680 

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/23/90)

In article <1990Jan22.111348.5585@Neon.Stanford.EDU> wilson@carcoar.Stanford.EDU (Paul Wilson) writes:
>  Lately several groups have been working on conservative mark-sweep
>  gc's that can deal with the kind of pointer ambiguities you get
>  in languages like C.  They even have generational versions.  People

Just a clarification:

My understanding is that such algorithms do not reclaim any
storage that seems to be "pointed at" by any integer or other
datum that happens to look like a pointer.  Similarly, such a
garbage collector cannot move storage contents in memory (e.g., to
avoid fragmentation) because it cannot tell the difference between
genuine pointers to the moved contents (which would have to be
updated) and false pointers (integers or other data that must be
left alone).


	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

wilson@carcoar.Stanford.EDU (Paul Wilson) (01/23/90)

In article <1990Jan22.111348.5585@Neon.Stanford.EDU> wilson@carcoar.Stanford.EDU I (Paul Wilson) write:
>
>  Lately several groups have been working on conservative mark-sweep
>  gc's that can deal with the kind of pointer ambiguities you get
>  in languages like C.  They even have generational versions.  People
>  doing this include Hans Boehm at Rice, the Portable Common Runtime
>  group at Xerox PARC, and Joel Bartlett at DEC WRL.  (You're
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Nope.  I was wrong.  He does "mostly copying garbage collection," and
his tech report on it is next on my reading list.  But his system does
deal with ambiguous pointers, and has recently become generational.

>Paul R. Wilson                         
>Software Systems Laboratory               lab ph.: (312) 996-9216
>U. of Illin. at C. EECS Dept. (M/C 154)   wilson@carcoar.stanford.edu
>Box 4348   Chicago,IL 60680 

Paul R. Wilson                         
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@carcoar.stanford.edu
Box 4348   Chicago,IL 60680