sommar@enea.se (Erland Sommarskog) (03/15/88)
Edmund E Howland (cbseeh@fang.ATT.COM) writes: >Huh? >Since when have these three ever had a need for garbage collection? >I am not too familiar with Ada, but garbage collection exists in >languages for whom dynamic binding is a way of life, not an option. If I define and implement a package for a data type of some sort in Ada I can hide the implementation of the type, just providing a set of operations. At a later state I can change the implementation without affecting the interface. However, I must already in the first step, decide whether objects of this type should be dynamic or not. Not so much for a allocation routine, you quite often need an initiation routine, anyway. But do I need a deallocation routine? If there is garbage collection, I don't have to think of this problem. Yes, I can one there just in case, but then the calling units have to bother about using them, which may require a lot of clean-up code, which might be totally unnecessary. Generally, garbage collection is a must in object-oriented programming. (No, I don't belive Ada is a true OO-language.) -- Erland Sommarskog ENEA Data, Stockholm sommar@enea.UUCP "Si tu crois l'amour tabou... Regarde bien, les yeux d'un fou!!!" -- Ange
g-rh@cca.CCA.COM (Richard Harter) (03/31/88)
In article <628@ra.rice.edu> boehm@titan.rice.edu (Hans Boehm) writes: > My experience is that any attempt at manipulation of interesting linked >structures in a language that doesn't support real automatic storage >management will either fail, or waste large amounts of debugging time. >(My experience includes a (working) 40,000 line compiler, written in C, that >manipulates a reference counted syntax "tree", or more accurately, a >reference counted syntax graph.) Normally, it is extremely difficult >to track down bugs created by premature deallocation. When such bugs are >finally removed, the resulting programs normally include a substantial >number of storage leaks. > Some recent experiments by Mark Weiser and myself with retrofitting a >garbage collector to existing C programs, verify the latter point. >(The garbage collector should never free anything since that was the >programmers responsibility. It does. In other people's code. >Our paper on this should appear in Software P&E shortly.) >Mike Caplinger reported similar results for another application at the last >USENIX conference, I believe. We have resurrected C code with storage >management bugs by linking it against a garbage collector (which in the case >of C doesn't always work either, but it has a better track record than manual >storage management). There is problably something I am missing, but I don't quite see how you can implement garbage collection in C without collateral assumptions -- how is the garbage collector supposed to know that that a block is free? Your main point is certainly true - manual allocation and deallocation with the programmer being responsible for ensuring that all comes out right really doesn't work very well. This was a critical issue for us -- our software has to run on systems which do not have pseudo-infinite memory and it sometimes has to run for very long runs. We can't afford memory leaks (or premature deallocation). What we did was to write our own storage allocation package. Salient features: The allocation routine takes two arguments, a size and an ID. The latter is a unique integer specifying that particular call. (We manage the ID list manually.) The allocator creates a node for each request block. In this node are sundry links, the ID, and (in effect) a date stamp. There is also a hash table that contains the address of every allocated block. All allocator control information is stored in a separate space from allocated memory itself. (This eliminates a large class of mystery bugs associated with overwriting allocator control data.) The deallocation routine takes one argument, the address of the block being returned. The deallocation routine validates the address as being an allocated address. This eliminates another class of bugs caused by passing an improper or stale address to the deallocation routine. There is a facility for printing out a complete map of allocated memory including call ID's and date stamps. We use this, from time to time, to check the code for storage leaks. This scheme is not as good as automatic garbage collection, but we have found it to be satisfactory. -- In the fields of Hell where the grass grows high Are the graves of dreams allowed to die. Richard Harter, SMDS Inc.
boehm@titan.rice.edu (Hans Boehm) (04/01/88)
In reference to g-rh@cca.CCA.COM (Richard Harter)'s question: > There is problably something I am missing, but I don't quite see >how you can implement garbage collection in C without collateral assumptions >-- how is the garbage collector supposed to know that that a block is free? This is usually, but not always, possible. The idea is to use a traditional non-compacting mark-sweep collector. On a UN*X system, we start by scanning the registers, stack, data, and (statically allocated) bss segments for any addresses of valid allocated objects. We subsequently scan allocated objects we find in the same way. For a typical C program, all reachable objects can be found in this manner. (Storing exclusive or'ed pointer values, tagged pointers, and similiar programming tricks, as well as some rarely used (for C) compiler optimization techniques may break this scheme.) In theory, we may end up marking a few unreachable objects as reachable as well, since integers may be misinterpreted as pointers. The collector can be designed so that the only result of this is excess storage retention. (This does require that we don't compact.) In my experience, I have never seen any indication of such unnecessary retention in practice. There is an argument to be made that no garbage collector is completely immune from this phenomenon. The fact that the collector has to perform a fairly complicated (but still O(1)) check for pointer validity does slow it down a little. On a Sun 3/260, with a 2 Meg heap, half of which is in use, we normally see collection times of about 3 seconds. (This assumes longword aligned pointers and small data and static bss areas.) All of this is largely irrelevant if the collector is used as a debugging tool to detect storage leaks. More details can be found in Mark Weiser's and my upcoming Software P&E article entitled "Garbage Collection in an Uncooperative Environment". Hans-J. Boehm boehm@rice.edu
g-rh@cca.CCA.COM (Richard Harter) (04/01/88)
In article <632@ra.rice.edu> boehm@titan.rice.edu (Hans Boehm) writes: >> There is problably something I am missing, but I don't quite see >>how you can implement garbage collection in C without collateral assumptions >>-- how is the garbage collector supposed to know that that a block is free? > This is usually, but not always, possible. The idea is to use a traditional >non-compacting mark-sweep collector. On a UN*X system, we start by scanning >the registers, stack, data, and (statically allocated) bss segments for >any addresses of valid allocated objects. We subsequently scan allocated >objects we find in the same way. For a typical C program, all reachable >objects can be found in this manner. (Storing exclusive or'ed pointer >values, tagged pointers, and similiar programming tricks, as well as some >rarely used (for C) compiler optimization techniques may break this scheme.) I see. My block was in not thinking of the entire address space of the program as accessible. For our purposes it is not (we live in portability land and are not allowed to do things like that :-)). That is one reason we did our own allocator (in C) -- it allows us to track down memory problems portably. I still don't see how a garbage collector helps with premature deallocation, unless you scrap deallocation entirely, and rely entirely on garbage collection. The problem with garbage collectors in the old days was that they did garbage collection when the available space was allocated. This works well enough until the actual amount of space allocated starts to approach the amount of available space, and then the garbage collector starts thrashing. Is this a problem? If not, how do you deal with it? -- In the fields of Hell where the grass grows high Are the graves of dreams allowed to die. Richard Harter, SMDS Inc.
boehm@titan.rice.edu (Hans Boehm) (04/02/88)
Richard Harter writes: >I still don't see how a garbage collector helps with premature deallocation, >unless you scrap deallocation entirely, and rely entirely on garbage >collection. >The problem with garbage collectors in the old days was that they did >garbage collection when the available space was allocated. This works >well enough until the actual amount of space allocated starts to approach >the amount of available space, and then the garbage collector starts >thrashing. Is this a problem? If not, how do you deal with it? Our current version of the collector doesn't help you with premature deallocation. Further instrumenting it might help. For example, you could use the marking algorithm to determine whether "free"d objects really were inaccessible. (This requires a coding style in which unused references are explicitly cleared. Replacing "free" by a macro that clears its argument takes care of a lot of that and, by itself, occasionally helps track down bugs.) I don't have much experience with this though. In non-real-time applications, we tend to use explicit deallocation where it's easy to do, and let the collector take care of the error-prone cases. You are right in that the garbage collector code is not completely portable. It should really be viewed as part of the run-time library, which is rarely completely portable anyway. Actually porting it isn't terribly hard. There is only about a screenful of machine dependent code in the collector. The longest part of that deals with marking from the machine registers. Frequent collections in small address spaces can become a problem. We operate in a virtual memory environment, so we deal with it by expanding the heap size whenever a collection fails to return enough memory. This requires allocating a bit of extra memory. For debugging, this hardly matters. (Memory is supposedly cheap.) In a production environment, especially without virtual memory, it could be a problem. On the other hand, for some existing code, this can be much less significant than memory otherwise lost to leaks. Hans-J. Boehm (boehm@rice.edu)
nevin1@ihlpf.ATT.COM (00704a-Liber) (04/05/88)
[followups to comp.lang.misc] In article <26358@cca.CCA.COM> g-rh@CCA.CCA.COM.UUCP (Richard Harter) writes: > There is problably something I am missing, but I don't quite see >how you can implement garbage collection in C without collateral assumptions >-- how is the garbage collector supposed to know that that a block is free? The only way to do this is to allocate a giant heap with malloc() and do *all* the memory management on your own. It is no different than implementing in C or C++ a language that does automatic garbage collection. The problem I found with languages that do automatic garbage collection is that there is no way for me to modify the implementation of the class (or the type, for those of you unfamiliar with C++ lingo :-)) to make it better suit my needs. For applications where the programmer cost is greater than the run-time cost (such as prototypes, personal tools, one-shot programs), I tend to use languages with built in memory management. For other applications, however, I would rather use a language like C++ which allows me to change the implementation of a class without affecting the rest of my code. -- _ __ NEVIN J. LIBER ..!ihnp4!ihlpf!nevin1 (312) 510-6194 ' ) ) "The secret compartment of my ring I fill / / _ , __o ____ with an Underdog super-energy pill." / (_</_\/ <__/ / <_ These are solely MY opinions, not AT&T's, blah blah blah
ifocs9d@aucs.uucp (Rick Giles) (08/01/90)
I'm investigating memory management and garbage collection schemes for Prograph, an object-oriented, data-flow programming language. The current implementation of Prograph uses reference counting for garbage collection. We're considering the pros and cons of using mark and sweep or generational garbage collection algorithms. If you have some first-hand recommendations or information I'd be pleased to hear from you. I can give further info on Prograph and the memory management issues of the language if that will help. Thanks. Rick Giles Bitnet: FRGILES@Acadia.ca Internet: FRGILES%Acadia.BITNET@CUNYVM.CUNY.EDU UUCP: uunet!dalcs!aucs!ifocs9d