ttwang@polyslo.CalPoly.EDU (Thomas Wang) (09/30/89)
There is a feature or mis-feature in C++ that makes building a copying
garbage collector difficult. Copying garbage collector have two advantages
over non-copying garbage collector: (1) Copying garbage collector compacts
the heap automatically. (2) Frequently used objects can be copied together
for better locality of reference.
Copying garbage collector moves an object from one space to another space.
Moving and copying are different, because when you move an object to a new
location, the object at the old location is discarded.
Say that I am in the middle of a member function of object 'obj'. And the
collector has decided to move 'obj' to a new location. The problem is that
the value of 'this' in the member function will still point to the old
location.
I propose that C++ be improved, so if one writes:
volatile class foo
{ ... };
The definition of au1_this be changed from foo* to foo**.
All the usages of au1_this would be (*au1_this) -> instead of au1_this ->.
This way, when I update the value of the master pointer to the new location,
the value of 'this' will automatically be correct.
-Thomas Wang (Ah so desu ka!)
ttwang@polyslo.calpoly.edu
jima@hplsla.HP.COM (Jim Adcock) (10/03/89)
>/ hplsla:comp.lang.c++ / ttwang@polyslo.CalPoly.EDU (Thomas Wang) / 10:15 pm Sep 29, 1989 / >There is a feature or mis-feature in C++ that makes building a copying >garbage collector difficult. Copying garbage collector have two advantages >over non-copying garbage collector: (1) Copying garbage collector compacts >the heap automatically. (2) Frequently used objects can be copied together >for better locality of reference. > >Copying garbage collector moves an object from one space to another space. >Moving and copying are different, because when you move an object to a new >location, the object at the old location is discarded. > >Say that I am in the middle of a member function of object 'obj'. And the >collector has decided to move 'obj' to a new location. The problem is that >the value of 'this' in the member function will still point to the old >location. > >I propose that C++ be improved, so if one writes: > >volatile class foo >{ ... }; > >The definition of au1_this be changed from foo* to foo**. >All the usages of au1_this would be (*au1_this) -> instead of au1_this ->. > >This way, when I update the value of the master pointer to the new location, >the value of 'this' will automatically be correct. > > -Thomas Wang (Ah so desu ka!) > Essentially, this just strikes me as another call for using handles to objects, rather than just objects themselves. If one believes that garbage collection is a central aspect of OOP, and that copying garbage collectors are the way to go, then one uses handles, and gets a language that looks like one of the many sons and daughters of Smalltalk. If one does not believe that GC is central to OOP, and that one is better off to try to avoid GC as much as possible, and that when one does GC one need not do copy collection -- then one does not use handles, and can make good use of the efficiency of C++ -- not requiring multiple indirection to do simple things. I don't believe copying and compacting are necessary in C++ where GC is typically a secondary concern. Look again at Boehm's GC system and/or buddy systems.
ttwang@polyslo.CalPoly.EDU (Thomas Wang) (10/03/89)
jima@hplsla.HP.COM (Jim Adcock) writes: >>/ hplsla:comp.lang.c++ / ttwang@polyslo.CalPoly.EDU (Thomas Wang) / 10:15 pm Sep 29, 1989 / >>There is a feature or mis-feature in C++ that makes building a copying >>garbage collector difficult. Copying garbage collector have two advantages >>over non-copying garbage collector: (1) Copying garbage collector compacts >>the heap automatically. (2) Frequently used objects can be copied together >>for better locality of reference. >Essentially, this just strikes me as another call for using handles to objects, >rather than just objects themselves. If one believes that garbage collection >is a central aspect of OOP, and that copying garbage collectors are the >way to go, then one uses handles, and gets a language that looks like one >of the many sons and daughters of Smalltalk. The addition of 'volatile classes' would not cause C++ to become like Smalltalk, since the user can use ordinary classes where it's appropriate. >If one does not believe that GC is central to OOP, and that one is better >off to try to avoid GC as much as possible, and that when one does GC >one need not do copy collection -- then one does not use handles, and can make >good use of the efficiency of C++ -- not requiring multiple indirection to >do simple things. It's just one extra indirection, which comes to about a few CPU cycles. If C++ is just used as a better C, then GC is not required. But if one want to use C++ to build classes which can be re-used by different modules, then GC and exception handling is required. For example: one normally passes a pointer to insert an object into a list. When the list is deleted the list class itself does not know if it's safe to destroy the pointers it contains. This is a quite common problem. Beside copying garbage collection will let the program run faster. Reduced paging activity will easily offset the extra pointer indirection. Heap fragmentation should not be treated lightly either. Let the system contains N bytes of free space, and S small objects in memory. For a totally fragmented system, the largest object size that can be allocated is (N - S * small_object_size) / (S+1) The largest object size that can be allocated in a heap compacted system is (N - S * small_object_size) >I don't believe copying and compacting are necessary in C++ where GC is >typically a secondary concern. Look again at Boehm's GC system and/or >buddy systems. Like I said, copying garbage collection will be an improvment. In the main while, non-copying garbage collection will have to do. -Thomas Wang (Ranma no baka! - Ranma 1/2 ) ttwang@polyslo.calpoly.edu
jima@hplsla.HP.COM (Jim Adcock) (10/04/89)
> Heap fragmentation should not be treated lightly either. Let the system > contains N bytes of free space, and S small objects in memory. For a totally > fragmented system, the largest object size that can be allocated is > (N - S * small_object_size) / (S+1) > The largest object size that can be allocated in a heap compacted system is > (N - S * small_object_size) This was gone over before too. The goal of garbage collection on a modern machine with virtual memory is not really to keep heap within a fixed sized arena, but rather to slow the growth of heap used. As Boehm shows, GCs can't recover all memory lost in any case. If one does not force an artificial upper bound on heap, one finds that fragmented systems tend use about 50% of the memory "occupied", and waste the other half. I'm not sure this is really all that bad.
jima@hplsla.HP.COM (Jim Adcock) (10/04/89)
//Give an example of the code a compiler would generate for the following? //[valid under 2.0 except for the volatile keyword] #include <stdio.h> class A { public: virtual void doSomething(); }; void A::doSomething() {printf("I'm not volatile, don't give me an extra indirection\n");} volatile class B : public A { public: virtual void doSomething(); }; void B::doSomething() {printf("I'm volatile, give me an extra indirection\n");} extern "C" {long lrand48();}; main() { A a; B b; A* p; for (int i=0; i<20; ++i) { p = (lrand48() & 1) ? &a : &b; p->doSomething(); } }
ttwang@polyslo.CalPoly.EDU (Thomas Wang) (10/04/89)
I have come up with what I think is a better alternative than 'volatile classes'. Instead of using master pointers, each member function is required to declare a local variable of 'watcher_t'. The 'watcher_t' class keep track of the address of 'this' on the stack. When the objects are moved, a post watcher list traversal will make all the values of 'this' point to the right location. This means that I can use modern GC methods with real copying. #ifndef WATCHER_T #define WATCHER_T // File: watcher_t.h // Author: Thomas Wang // Purpose: watch the value of 'this' on the stack & fix them when necessary // Date: Oct 3, 1989 // Usage: put watcher_t watcher(&this); at start of every member function class watcher_t { friend class gc_root_t; int* this_location; watcher_t* prev_link; watcher_t* next_link; void update_this(); // make 'this' point to the right location public: watcher_t(void*); // constructor telling the address of 'this' for gc objects ~watcher_t() { prev_link->next_link = next_link; next_link->prev_link = prev_link; } }; #endif // File: watcher_t.cc // Author: Thomas Wang // Purpose: watch the value of 'this' on the stack & fix them when necessary // Date: Oct 3, 1989 // Usage: put watcher_t watcher(&this); at start of every member function #include "watcher_t.h" // make 'this' point to the right location void watcher_t::update_this() { static int* forward_ptr; forward_ptr = this_location - 1; if (*forward_ptr) *this_location = *forward_ptr; // fix the value of 'this' } // constructor telling the address of 'this' for gc objects watcher_t::watcher_t(void* val) { this_location = (int*) val; the_root.insert_watcher(this); } -Thomas Wang ("This is a fantastic comedy that Ataru and his wife Lum, an invader from space, cause excitement involving their neighbors." - from a badly translated Urusei Yatsura poster) ttwang@polyslo.calpoly.edu
schwartz@dinl.uucp (Michael Schwartz) (10/05/89)
I may be missing something in this discussion, but isn't this exactly the problem that Glockenspeil's (public domain) Containers abstraction is supposed to solve? One sets the file allocater to use whatever method one wishes (default is the stack (not movable), and doesn't worry about it after that. It is the 'lock object's job to lock the object down to a known address when accessing it, otherwise the object can float freely. Correctly, according to their documentation, this takes a long while to get used to (about 5 solid nights of work for me). The documentation is a bit lean. However, it does work nicely. Please don't flame me if I missed the point of discussion. -- ----------------------- schwartz%looney@mmc.com "Expect everything ... mschwartz@mmc.com and the unexpected never happens." ncar!dinl!schwartz --the phantom tollbooth DISCLAIMER: The opinions expressesed are not necessarily those of my employer or myself.
jima@hplsla.HP.COM (Jim Adcock) (10/06/89)
> schwartz: > I may be missing something in this discussion, but isn't this exactly > the problem that Glockenspeil's (public domain) Containers abstraction > is supposed to solve? > > One sets the file allocater to use whatever method one wishes (default > is the stack (not movable), and doesn't worry about it after that. It > is the 'lock object's job to lock the object down to a known address > when accessing it, otherwise the object can float freely. > > Correctly, according to their documentation, this takes a long while to > get used to (about 5 solid nights of work for me). The documentation is > a bit lean. However, it does work nicely. > > Please don't flame me if I missed the point of discussion. Well, I'm not sure either, but I think there's several different points floating around here. My claims are: rather than making 'this' even more general [adding a level of indirection] 'this' should be made even less general. IE 'this' should not be assigned to, and should not change. --If we had it to do over I think I'd like 'this' to be a reference and not a pointer at all. Also, I claim you can 'easily' make a language that consistently uses handles, [like Smalltalk] or consistently doesn't [like C++] but its hard to see how one can mix approaches within one inheritence hierarchy [or DAG] without great difficulties or restrictions. Also, in C++ it would seem to be easy to make a base class 'moveable_objects' that would help support the moving needed by compaction, but then clearly there's going to be restrictions on what one can do with those objects. For instance, you'd probably want to prevent people from taking the address of a 'moveable_object.' Taking this approach would 'solve' the GC via copying problem for objects of classes in this hierarchy. You couldn't use objects not derived from 'moveable_objects without reintroducing GC problems. And saying the 'moveable_objects' base class 'solves' the problem is just saying for a particular range of problems the speed/space tradeoffs implied by the restrictions imposed by 'moveable_objects' is acceptible. Its probably very easy to find counterexample classes that will 'break' any particular memory management approach. Still, the ability to define a base class 'moveable_objects' in unmodified C++ shows the language is up to the task of meeting the needs of these people. The preceading paragraph just really restates that memory management is best handled as a class related issue -- not as a language issue. Unfortunately, its tough to find ways to use differing memory management schemes and then be able to 'paste' together classes using disparent schemes via multiple-inheritence.