kevin@argosy.UUCP (Kevin S. Van Horn) (01/20/90)
Can someone recommend a book that gives a good treatment of garbage collection algorithms? In particular, I am interested in finding a good garbage collecting algorithm to be implemented on a PC (hence no virtual memory), which must handle variable-sized cells and do compaction when needed. ------------------------------------------------------------------------------ Kevin S. Van Horn | The means determine the ends. kevin@argosy.maspar.com |
wilson@carcoar.Stanford.EDU (Paul Wilson) (01/22/90)
In article <366@argosy.UUCP> kevin@argosy.maspar.com (Kevin S. Van Horn) writes: >Can someone recommend a book that gives a good treatment of garbage collection >algorithms? I don't know of one, and I've looked. I would be interested in what you find. I'm 2/3 through writing a survey of copying gc's (esp. genrational ones), with lots of easy-to-understand pictures. But it's been on the back burner for months now and may be for months more. (If anybody needs a treatment of gc's for a book or tutorial, give me a call...) > In particular, I am interested in finding a good garbage >collecting algorithm to be implemented on a PC (hence no virtual memory), >which must handle variable-sized cells and do compaction when needed. You might consider a copying generational gc, with the oldest generation treated specially, using a sliding copy compaction algorithm. (See Cohen & Nicolau's "A Comparison of Compacting Algorithms for Garbage Collection.") By using a sliding compactor, you only need one space (rather than a pair of semispaces) in the oldest generation. On the other hand, it costs you more passes over the data. It's not clear to me whether this is a win in VM systems, but it probably is in non-VM systems. If you want reasonable interactive response and efficiency, you should probably use a generational gc -- the space the space cost of the youngest generation is generally small compared to the efficiency gain. Andrew Appel had a paper in Software Practice and Experience a few months ago, showing just how simple a simple generational gc can be -- it doesn't have to be a big deal to be a win. If you need more efficiency, take a look at Ungar & Chambers' recent papers, and my OOPSLA '89 paper. Several gc's descended from Ungar's Generation Scavenging collector use normal copying most of the time, but sliding compaction for full collections (of a one-space oldest generation). At OOPSLA, Pat Caudill (of Instantiations, Inc. in Portland) told me about an interesting generational gc for non-VM hardware -- maybe he could comment on that. (You out there, Pat?) If space isn't horribly tight, you may be able to use a pair of spaces in the oldest generation without any trouble, and avoid needing a complicated one-space algorithm. (You might be surprised how tight space isn't, with a generational gc. Most programs in Lisp & Smalltalk seem to have little live data at a time, but create such a huge amount of short-lived data that they keep a non-generational gc very busy anyway. Most Smalltalk users don't seem to stress their virtual memory much anyway. There are exceptions, though -- hence the fancier algorithms.) By the way, you might get something out of Jacques Cohen's "Garbage Collection of Linked Data Structures," Computing Surveys 13, 3 (Sept., 1981). It's getting out of date, now, though. There's also a survey on generational gc's by McEntee (I think) from the TI Tech Journal, from a while back, but it's not very detailed. My gc will be available at some point, though right now it's kind of ugly and overcomplicated to support my research. It's written in C and shouldn't be hard to port once I've streamlined it. Lately several groups have been working on conservative mark-sweep gc's that can deal with the kind of pointer ambiguities you get in languages like C. They even have generational versions. People doing this include Hans Boehm at Rice, the Portable Common Runtime group at Xerox PARC, and Joel Bartlett at DEC WRL. (You're probably better off with a copying gc, though, unless you need the special capabilities.) >------------------------------------------------------------------------------ >Kevin S. Van Horn | The means determine the ends. >kevin@argosy.maspar.com | Paul R. Wilson Software Systems Laboratory lab ph.: (312) 996-9216 U. of Illin. at C. EECS Dept. (M/C 154) wilson@bert.eecs.uic.edu Box 4348 Chicago,IL 60680 Paul R. Wilson Software Systems Laboratory lab ph.: (312) 996-9216 U. of Illin. at C. EECS Dept. (M/C 154) wilson@carcoar.stanford.edu Box 4348 Chicago,IL 60680
bds@mbunix.mitre.org (Smith) (01/23/90)
I have a paper written by Bob Courts of TI called "Improving Locality of Reference in a Garbage-Collecting Memory Management System" that describes how TI implemented gc on the Explorer. I found it to be an interesting paper, and well worth reading. It was given to me by a TI salesman, and I don't know if it was ever published. There is no email address given, but his mail address is simply: Bob Courts Texas Instruments Incorporated Data Systems Group Austin, Texas I'd offer to make copies of the paper, but even though it was given to me with no strings attached, and there are no copywright notices, I'm not sure if making random copies is acceptable. Barry Smith
gadbois@sygmund.cgs.utexas.edu (David Gadbois) (01/23/90)
From: kevin@argosy.UUCP (Kevin S. Van Horn) Date: 20 Jan 90 07:49:37 GMT Can someone recommend a book that gives a good treatment of garbage collection algorithms? In particular, I am interested in finding a good garbage collecting algorithm to be implemented on a PC (hence no virtual memory), which must handle variable-sized cells and do compaction when needed. Here is some of the garbage from my bibliography files. The first paper, while a little out of date, is an excellent general survey of garbage collection techniques. The rest are more theory-oriented but may have some useful info. --David Gadbois @ARTICLE{cohen:garbage, AUTHOR = "Jacques Cohen", TITLE = "Garbage Collection of Linked Data Structures", JOURNAL = "ACM Computing Surveys", YEAR = 1981, VOLUME = 13, NUMBER = 3, PAGES = "342--367", MONTH = "September" } @INPROCEEDINGS{moon:garbage, AUTHOR = "David A. Moon", TITLE = "Garbage collection in a large lisp system", BOOKTITLE = "Proceedings of the 1984 ACM Symposium on Lisp and Functional Programming", PAGES = "235--246", YEAR = 1984, ORGANIZATION = "ACM", MONTH = "August" } @INPROCEEDINGS{rudalics:garbage, AUTHOR = "Martin Rudalics", TITLE = "Distributed Copying Garbage Collection", BOOKTITLE = "Proceedings of the 1986 ACM Symposium on Lisp and Functional Programming", YEAR = 1986, PAGES = "364--372" } @ARTICLE{dijkstra:garbage, AUTHOR = "Edsger W. Dijkstra and Leslie Lamport and A. J. Martin and C. S. Scholten and E. F. M.Steffens", TITLE = "On-the-Fly Garbage Collection: An Exercise in Cooperation", JOURNAL = "Communications of the ACM", YEAR = 1978, VOLUME = 21, NUMBER = 11, PAGES = "966--975", MONTH = "November" } @INPROCEEDINGS{lamport:garbage, AUTHOR = "Leslie Lamport", TITLE = "Garbage Collection With Multiple Processes: An Exercise in Parallelism", BOOKTITLE = "Proceedings of the 1976 International Conference on Parallel Processin", YEAR = 1976, PAGES = "50--54" } @ARTICLE{steele:garbage, AUTHOR = "Guy L. {Steele Jr.}", TITLE = "Multiprocessing Compactifying Garbage Collection", JOURNAL = "Communications of the ACM", YEAR = 1975, VOLUME = 18, NUMBER = 9, PAGES = "495--508", MONTH = "September" } @ARTICLE{courts:garbage, AUTHOR = "Robert Courts", TITLE = "Improving Locality of Reference in a Garbage-Collecting Memory Management System", JOURNAL = "Communications of the ACM", YEAR = 1988, VOLUME = 31, NUMBER = 9, PAGES = "1128--1138", MONTH = "September" } @TECHREPORT{shaw:garbage, AUTHOR = "R. A. Shaw", TITLE = "Improving Garbage Collector Performance in Virtual Memory", INSTITUTION = "Stanford University", YEAR = 1987, NUMBER = "CSL-TR-87-323", MONTH = "March" }
lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/23/90)
In article <1990Jan22.111348.5585@Neon.Stanford.EDU> wilson@carcoar.Stanford.EDU (Paul Wilson) writes: > Lately several groups have been working on conservative mark-sweep > gc's that can deal with the kind of pointer ambiguities you get > in languages like C. They even have generational versions. People Just a clarification: My understanding is that such algorithms do not reclaim any storage that seems to be "pointed at" by any integer or other datum that happens to look like a pointer. Similarly, such a garbage collector cannot move storage contents in memory (e.g., to avoid fragmentation) because it cannot tell the difference between genuine pointers to the moved contents (which would have to be updated) and false pointers (integers or other data that must be left alone). Lawrence G. Mayka AT&T Bell Laboratories lgm@ihlpf.att.com Standard disclaimer.
wilson@carcoar.Stanford.EDU (Paul Wilson) (01/23/90)
In article <1990Jan22.111348.5585@Neon.Stanford.EDU> wilson@carcoar.Stanford.EDU I (Paul Wilson) write: > > Lately several groups have been working on conservative mark-sweep > gc's that can deal with the kind of pointer ambiguities you get > in languages like C. They even have generational versions. People > doing this include Hans Boehm at Rice, the Portable Common Runtime > group at Xerox PARC, and Joel Bartlett at DEC WRL. (You're ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Nope. I was wrong. He does "mostly copying garbage collection," and his tech report on it is next on my reading list. But his system does deal with ambiguous pointers, and has recently become generational. >Paul R. Wilson >Software Systems Laboratory lab ph.: (312) 996-9216 >U. of Illin. at C. EECS Dept. (M/C 154) wilson@carcoar.stanford.edu >Box 4348 Chicago,IL 60680 Paul R. Wilson Software Systems Laboratory lab ph.: (312) 996-9216 U. of Illin. at C. EECS Dept. (M/C 154) wilson@carcoar.stanford.edu Box 4348 Chicago,IL 60680
djohnson@m2.csc.ti.com (Doug Johnson) (02/02/90)
In article <BDS.90Jan22123550@mbunix.mitre.org> bds@mbunix.mitre.org (Smith) writes: > >I have a paper written by Bob Courts of TI called "Improving Locality >of Reference in a Garbage-Collecting Memory Management System" that >describes how TI implemented gc on the Explorer. I found it to be an >interesting paper, and well worth reading. It was given to me by a TI >salesman, and I don't know if it was ever published. It was published in Communications of the ACM, September 1988. While there is some discussion of GC in general, it focuses on GC in a virtual memory environment. I believe the tradeoffs are very different in a physical memory system. Paul Wilson's suggestions are close to what I'd try. -- Doug
wilson@carcoar.Stanford.EDU (Paul Wilson) (02/02/90)
In article <108553@ti-csl.csc.ti.com> djohnson@m2.csc.ti.com (Doug Johnson) writes: >In article <BDS.90Jan22123550@mbunix.mitre.org> bds@mbunix.mitre.org (Smith) writes: >> >>I have a paper written by Bob Courts of TI called "Improving Locality >>of Reference in a Garbage-Collecting Memory Management System" that >>describes how TI implemented gc on the Explorer. I found it to be an >>interesting paper, and well worth reading. It was given to me by a TI >>salesman, and I don't know if it was ever published. > >It was published in Communications of the ACM, September 1988. While >there is some discussion of GC in general, it focuses on GC in a >virtual memory environment. I believe the tradeoffs are very >different in a physical memory system. Paul Wilson's suggestions are >close to what I'd try. > >-- Doug In article <108553@ti-csl.csc.ti.com> djohnson@m2.csc.ti.com (Doug Johnson) writes: >In article <BDS.90Jan22123550@mbunix.mitre.org> bds@mbunix.mitre.org (Smith) writes: >> >>I have a paper written by Bob Courts of TI called "Improving Locality >>of Reference in a Garbage-Collecting Memory Management System" that >>describes how TI implemented gc on the Explorer. > [...] >It was published in Communications of the ACM, September 1988. While >there is some discussion of GC in general, it focuses on GC in a >virtual memory environment. I believe the tradeoffs are very >different in a physical memory system. Paul Wilson's suggestions are >close to what I'd try. >-- Doug If you're interested in neat things to do to improve locality in interesting hierarchical memories (as opposed to a pc), there are actually several similar things worth investigating -- all of them apparently derived independently by various folks. Courts' system is actually strikingly similar to the system proposed in Jon L White's paper for the 1980 Lisp Conference, "Address/Memory Management for a Gigantic LISP Environment". The basic idea of both is to use Baker-style incremental copying to improve locality of reference when garbage collection isn't sufficient to do the job. In the case of White's proposed system, that's because there's no garbage collection at all(!) most of the time. In Courts' system, the same technique is used in a generational garbage collector. This also bears a strong resemblance to an idea proposed in (and simulated) in a paper by Jean-Loup Baer and Gary R. Sager ("Dynamic Improvement of Locality in Virtual Memory Systems", IEEE TSWE, March 1976), way back in the mid-70's. Their idea is that the unit of virtual memory mapping ("minipages") be smaller than the unit of disk transfer, and that minipages be grouped according to dynamic reference patterns within disk pages. This effects a dynamic reorganization of minipages within disk pages. The four least-recently-used minipages are grouped together and ejected on a miss, and when a page is brought in, so are the other 3 minpages on its disk page. The result is to effectively prefetch related minipages on a page miss. Unfortunately, Baer & Sager's technique failed to give an improvement in performance, but I think that's because they didn't give it a fair test. I believe the failure of their system to do the trick was because of the particular characteristics of computers in 1975. If you tried it now, on a modern computer, it would probably work pretty well. It's my belief that one of the important prerequisites for effective prepaging is simply that you have a lot of memory pages. That is, if devoting a page to prefetching sacrifices more than about 1% of your memory, then you lose -- the increased misses caused by the smaller memory will not be compensated by successful prefetches. On the other hand, if you have more than about 100 pages, devoting one to holding prefetches is worthwhile -- it doesn't cost you many misses, and it will prefetch a useful page often enough. So it seems that if you have more than about 100 units of memory (pages in your main memory, or cache blocks in your cache), then prefetching is worth it. If you have significantly fewer pages/blocks, you're better off devoting them all to normal caching. If you have significantly more, you're better off spending a few on prefetching. Simple, no? Am I right? I don't know. I hope to find out soon. But it is very interesting to note that computers these days have many more units (pages/blocks) at a level of storage than they did in 1975. This is necessitated by the much higher ratio of CPU speeds to RAM speeds, and RAM speeds to disk speeds -- computers these days *must* have big memories and caches, and higher hit rates, or they just won't cut it. So the marginal cost of using a block for prefetching is smaller now, while the relative benefit is about the same. This says to me that prefetching is an increasingly good idea, every day. Anyway, to get back to our story, Baer & Sager's technique appears to have failed because they had to effectively reserve 3 "minipages" for prefetching to be able to reorganize 4 minipages at a time within the disk-transfer pages. (They maybe also didn't reorganize them the way they should have, though in fact there may be a better order that's much cheaper to implement.) Since 3 minipages was a couple of percent of even the largest memory they tried, it was a lose. It turns out that people at the U. of Manchester have come up with a simliar system, reorganizing cache blocks within disk pages. This is in a hardware-supported object-oriented memory for Smalltalk. The effects seem similar to those of Courts' reorganization during gc's. (see "Dynamic Grouping in an Object-Oriented Virtual Memory Hierarchy," Proc ECOOP '87, pp. 87-96, and "Realisation of a Dynamically Grouped Object-Oriented Virtual Memory Hierarchy," I.W. Williams, M.I. Wolczko, and T.W. Hopkins, in Proc. of the Workshop on Persistent Object Systems: Their Design, Implementation, and Use, pp 298-308, August 1987. Persistent Programming Research Report, Edinburgh University, Dept. of Computer Science (PPRR-44-87). Naturally, I found out about this because I reinvented the same wheel and eventually somebody told me so :-) -- Paul Paul R. Wilson Software Systems Laboratory lab ph.: (312) 996-9216 U. of Illin. at C. EECS Dept. (M/C 154) wilson@carcoar.stanford.edu Box 4348 Chicago,IL 60680 Paul R. Wilson Software Systems Laboratory lab ph.: (312) 996-9216 U. of Illin. at C. EECS Dept. (M/C 154) wilson@carcoar.stanford.edu Box 4348 Chicago,IL 60680