ttwang@polyslo.CalPoly.EDU (Thomas Wang) (09/07/89)
I have decided to implement a garbage collection system for C++. Currently I am trying to pick the garbage collection algorithm to use. Are there better algorithms than 'generation scavenging'? Does anyone have paper references for the 'best' garbage collection algorithm? Please e-mail or post. Thanks. -Thomas Wang ("This is a fantastic comedy that Ataru and his wife Lum, an invader from space, cause excitement involving their neighbors." - from a badly translated Urusei Yatsura poster) ttwang@polyslo.calpoly.edu
wilson@carcoar.Stanford.EDU (Paul Wilson) (09/07/89)
In article <14236@polyslo.CalPoly.EDU> ttwang@polyslo.CalPoly.EDU (Thomas Wang) writes: >I have decided to implement a garbage collection system for C++. Currently >I am trying to pick the garbage collection algorithm to use. Are there >better algorithms than 'generation scavenging'? Does anyone have paper >references for the 'best' garbage collection algorithm? > It's hard to say what's the "best" gc algorithm, since that varies depending on things like what language you're doing it for. C++ sounds like a major headache because it allows pointers to be converted into integers or stepped past the end of an array, and doesn't distinguish between array pointers and array element pointers. Nasty confusing things to do to a reasonable gc. You probably have to go with something like a conservative mark-sweep gc [Look for an article by Hans-Juergen Boehm et al., in Software Practice and Experience within the last year or two, I think.] This is not going to be the most efficient gc, but I don't know if anything else will work for languages that weren't ever meant to be garbage collected. I believe Xerox' new Portable Common Runtime system for their Cedar system uses something like this. But if you have a reasonable language (like Lisp or Smalltalk or Modula-3) -- I'll be delivering a paper at the upcoming OOPSLA that gives my humble view on the best gc algorithm for stock hardware, namely mine. (:-) Admittedly, my gc incorporates a lot of good ideas from Ungar's Generation Scavenging, Moon's Ephemeral GC, Sobalvarro's version of Moon's gc, and Bob Shaw's version of Generation Scavenging. Eventually it will steal from Caudill & Wirfs-Brock's gc (for Tektronix Smalltalk) as well. (As Ungar & Jackson's OOPSLA 88 paper shows, their (C & W-B's) Large Object Space is a good idea, handling large objects specially so they don't actually have to be copied at a scavenge.) I think you should have more than one generation scavenged online (during user sessions), but I show how to reduce the likelihood of noticeable pauses by careful scheduling. This can also actually increase gc efficiency by scavenging when there's not much work to do (little live data to copy). I agree with Moon that things should be advanced quickly from one generation to the next, generally after no more than two scavenges. I agree with Ungar, though, that one scavenge is *not* enough -- some guarantee of age is worthwhile to prevent premature advancment of very young objects, which will probably die almost immediately and pollute the next older generation. I've simplified and modified Bob Shaw's "bucket brigade" scheme for distinguishing objects' approximate ages by putting them in different subdivisions of a generation. This turns out to allow scavenge thresholds *between* 1 and 2 scavenges, which appears to be about right. Remembered set strategies also have their pros and cons. Mine's different from Ungar's and should work better for ill-behaved programs (lots of writes to intermediate- or long-lifetime data), but his is faster for nicely behaved programs. It's not clear which is best overall, especially since other factors enter into it, such as whether you have a big cache. Oh yeah -- if you just happen to have hardware support for fine-grained incremental garbage collection, Bob Courts' gc is nifty, reorganizing data dynamically to improve locality [CACM, Sept. 88, I think]. Also, the Appel-Ellis-Li stock-hardware gc from SIGPLAN '88 is interesting and beautiful and concurrent. (Watch out for the chunkiness and potential burstiness of its incrementalness, though.) -- Paul Paul R. Wilson Software Systems Laboratory lab ph.: (312) 996-9216 U. of Illin. at C. EECS Dept. (M/C 154) wilson@bert.eecs.uic.edu Box 4348 Chicago,IL 60680 wilson@carcoar.stanford.edu Paul R. Wilson Software Systems Laboratory lab ph.: (312) 996-9216 U. of Illin. at C. EECS Dept. (M/C 154) wilson@bert.eecs.uic.edu Box 4348 Chicago,IL 60680