bet@orion.mc.duke.edu (Bennett Todd) (06/14/89)
In article <77300030@p.cs.uiuc.edu>, johnson@p.cs writes:
> [...]
> However, the main reason for a garbage collector in C++ would be to
> catch "space leaks", so it would not have to be called very often.

I think this observation is crucial to appreciating the gap between those
of you who like garbage-collecting environments and those of us who don't
(what, me biased?:-). Part of the job I take on in program design and
implementation is closely tracking the allocation of crucial resources,
including memory. I view a memory leak as a bug; when I write something big
enough that I am not absolutely positive I didn't miss any potential leaks,
I always drive it for a long while with a steady load of work that should
settle down to a steady state of memory consumption, then sit back and
examine its memory use via top(1) (or ps(1), or whatever tool is the least
unpleasant to use for the job:-).

I'm not sure where the damage was done to my brain; I guess it is probably
because I was taught techniques that enable me to write extremely large,
complex applications while tracking the memory consumption myself, before
I was ever exposed to a garbage-collecting environment. By then I didn't
feel any strong need to abandon these techniques, since they don't seem
cumbersome to me.

In particular I remember fondly a really bizarre package I wrote, more as
an exercise in "can I do this?" than for practical application. As I
recall, I wanted to write the closed form roots of polynomial expression
in C, so I wrote a complex library. I wanted to be able to write composed
expressions, and not have the implicit intermediates get dropped on the
floor, e.g.

	result = plus(complex(1.0,0.0),complex(0.0,1.0));

Necessarily the two calls to the constructor complex() will produce two
intermediates; I decided to mandate that all arithmetic operators would
destroy their arguments. I called this convention "parameter passing by
call and destroy", and it always struck me as pleasantly perverse.
These days, with structs passed by value, assignment properly supported,
and struct return generating suitably allocated anonymous copies, the
above shenanigans aren't needed any more (and C++ offers, at last, the
ability to make a complex type both as convenient to use and as efficient
as if it were defined as an arithmetic primitive in the language -- death
to Fortran!:-). Nonetheless, it wouldn't appeal to me to create a package
that goes about allocating memory and dropping it on the floor, on the
assumption that a garbage collector will pick up the pieces after me.

-Bennett
bet@orion.mc.duke.edu
bet@orion.mc.duke.edu (Bennett Todd) (06/15/89)
In article <14739@duke.cs.duke.edu>, I wrote:
> [...] As I
> recall, I wanted to write the closed form roots of polynomial expression
> in C, [...]

Needless to say, I should have proofread more closely. What I was actually
coding up was the closed form roots of *cubic* polynomials. Sorry if I got
anyone's hopes up:-).

-Bennett
bet@orion.mc.duke.edu
jima@hplsla.HP.COM (Jim Adcock) (06/22/89)
I started playing with Boehm's garbage collector yesterday, and found it
easy [dare I say "trivial"] to get it running with C++. It took maybe 2
hours for me to get up to speed on this with C++. I think many people will
find this an entirely suitable GC for many C++ applications where
traditionally one would want to use GC. Clearly, the 2.0 features ark has
been talking about in terms of "new" and "delete" overloading are what one
wants in order to hook in various memory management schemes.

On my mid-range 68020 system GC took 1-3 seconds on a couple of megabytes
of heap with a few thousand "lost" objects floating around. Your mileage
may vary. Boehm's GC worked well even when I was deliberately very
malicious to it -- filling up my objects with random numbers, many of
which might very well look like pointers.

If you'd like a copy of the couple of little files I needed to write to do
this, please ask. [These files are not "publication quality", but would
provide good hints to get you started.]

Boehm's GC clearly isn't the solution for people with strict real-time
needs. It would be nice if some C++-specific hooks could be placed in the
GC to allow a virtual destructor to be called [where applicable] when such
a C++ object is found. Right now when a "lost" object is found, no
destructor is called [which is really pretty reasonable, since you lost
the object in the first place]. Even if the GC were modified to call the
appropriate destructor, you'd still have no guarantee that destructors are
called in the order you might like, so destructors with side effects might
still be problematic. Still, I think this is about what one can reasonably
expect from a GC.

As it stands, the GC is fine for traditional situations where one loses
track of objects -- lisp-like cells, binary math ops that need to create
temporaries, etc.
"Obviously," Boehm's collector is exactly what one needs to "solve"
[patch] the problem Johnson mentions, where a program runs for a couple of
days, weeks, or months, and "suddenly" springs a leak.

On my system the entire GC archive is about 32K. In a program that just
filled objects with random numbers and then "lost" track of them, memory
allocation, management, and garbage collection took about 10% of the total
execution time, with the GC's marking routine taking most of the time.
parag@hp-ses.SDE.HP.COM (Parag Patel) (06/22/89)
[drifting away here]

> As far as I know, it is possible to provide garbage collection for C,
> and some systems have actually done so. The GC has to be "conservative"
> in that it must regard anything that might be a pointer as a pointer.
> So there might be some garbage it can't collect, but it will never
> collect non-garbage.

This sort of garbage collector has an interesting problem handling C++
classes, because there's no way to guarantee that the destructor for every
class would actually be called. All it could do would be to counteract
memory leaks and free unused memory (still useful tho').

Now, if it could figure out whether or not a random memory object has a
virtual jump table, it could at least handle the classes with virtual
destructors (I think). Then if you want your object GCed, make your
destructor virtual and link with libGC++malloc.a instead! Still not quite
a general-purpose GC, but it could be useful...
jima@hplsla.HP.COM (Jim Adcock) (06/24/89)
The following simple program makes random length trees -- and fools
Boehm's GC. I seem to remember something about this in his README file.
Maybe one had better bone up on his algorithm before getting too deep in
this.

(Change either 2 to a 3 (25% chance versus 50% chance) in randomlinks and
the GC works fine. Program is not "publication quality".)

-----------
#include <stdio.h>
#include <stdlib.h>
#include "gc++.h"

extern "C" { int rand(); };

void* ::operator new(long sz) { return gc_malloc((sz < 8) ? 8 : sz); }
void ::operator delete(void* p) { if (p) gc_free(p); }

class link {
public:
    link* head;
    link* tail;
    link() { head = 0; tail = 0; }
    void randomlinks();
};

long sum;

void link::randomlinks()
{
    if (!(rand() & 2)) {
        head = new link;
        head->randomlinks();
        sum += sizeof(link);
    }
    if (!(rand() & 2)) {
        tail = new link;
        tail->randomlinks();
        sum += sizeof(link);
    }
}

main()
{
    link* p[4096];
    initialize_allocator();
    long cntinc = 4096;
    int q;
    while (1) {
        q = 0x0ff & rand();
        p[q] = new link;
        sum += sizeof(link);
        p[q]->randomlinks();
        if (sum > cntinc) {
            cntinc <<= 1;
            printf("\n *** total 'lost' so far: %ld ***\n\n", sum);
            if (sum > 4000000)
                exit(0);
        }
    }
}
jima@hplsla.HP.COM (Jim Adcock) (06/29/89)
> The following simple program makes random length trees -- and fools
> Boehm's GC. I seem to remember something about this in his README file.
> Maybe one had better bone up on his algorithm before getting too deep
> in this.
>
> (Change either 2 to a 3 (25% chance versus 50% chance) in randomlinks
> and the GC works fine. Program is not "publication quality".)

Oops, sorry. As Boehm pointed out to me, the problem is apparently not
with Boehm's GC, but with my "logic" in analysing the expected behavior of
my "simple" program. I didn't check how much these things were actually
growing.

I thought about a single leaf, and said to myself that with each child
having a 50% probability of not existing, there is a 25% probability of
any given leaf not having any children; therefore the tree "must"
terminate. But when I think of it another way, each leaf has an "expected"
2*0.5 = 1.0 children, (2*0.5)*(2*0.5) = 4*0.25 = 1.0 grandchildren,
(2*0.5)*(2*0.5)*(2*0.5) = 8*0.125 = 1.0 great-grandchildren.... So the
size of a tree is marginally ill-conditioned, and can grow "indefinitely."
Changing a 2 to a 3 makes for an "expected" 0.75 children, 0.5625
grandchildren, etc., so then one doesn't have indeterminate growth of the
trees.

Now, if Boehm's GC *could* handle my program, then he'd *really* have
something! :-)
drich@klaatu.lanl.gov (David O. Rich) (05/08/91)
> For a discussion of the "mostly copying" garbage collection algorithm,
> see "Compacting Garbage Collection with Ambiguous Roots", WRL Research
> Report 88/2, February 1988, and "Mostly-Copying Garbage Collection
> Picks Up Generations and C++", WRL Technical Note TN-12, October 1989.

Are these TRs FTP-able from somewhere? Thanks!

--Dave
bartlett@decwrl.dec.com (Joel Bartlett) (05/08/91)
Both tech reports are available via DECWRL's tech report server. Send a message to WRL-Techreports@decwrl.dec.com with "help" in the subject line and it will return you instructions. jfb