[comp.sys.mac.programmer] How to do garbage collection in C

mcdonald@fornax.UUCP (Ken Mcdonald) (12/02/89)

I'm programming an applica[tion which has need of a built-in garbage
collector.  Specifically, I need a garbage collector that can allocate
arbitrarily-sized blocks, resize blocks, and do automatic reclamation of
unused blocks--I think that mark-scan reclamation is best for my purposed,
although if anyone knows of anything better, I'd appreciate hearing about
it.  (Note that I'm using the term "garbage collector" rather sloppily,
to include the dynamic memory allocator as well.)

Conceptually this is easy, I just use the Mac memory manager to allocate
a large block, and then do all of my own allocation in it.  The problem I'm
facing right now is in trying to program the reclamation algorithm in C--in
order to mark the unused blocks, I need to crawl along the stack, following
out any pointers that happen to have been pushed onto the stack.  This brings
up several problems, namely:  How to explicitly access C's stack; how to
differentiate pointers from other data; how to determine when the end of the
stack has been reached.  I'm a novice C programmer, and would appreciate
tips on these from people more skilled than myself.

On the other hand, if anyone happens to have a garbage collector just
lying around that would do everything I need, I wouldn't mind a copy of
that!
Thanks for all the help, netland . . .
Ken McDonald
{mcdonald@lccr.cs.sfu.ca}

oster@dewey.soe.berkeley.edu (David Phillip Oster) (12/03/89)

In article <153@fornax.UUCP> mcdonald@fornax.UUCP (Ken Mcdonald) writes:
>This brings up several problems, namely:  
>How to explicitly access C's stack;
This is easy: just do &localVariable

>how to differentiate pointers from other data; 
Don't bother to try. If a bit pattern represents a pointer into your
private heap, assume that it really is one.  (I.e., it represents an
address between the start and end of your private heap. It may have the
wrong alignment because it is looking _inside_ one of your objects,
instead of pointing to the beginning of it, so you'll need to search your
heap to see if it actually points to a real object.)

The number of false matches will be negligable, and you'll simplify all 
your algorithms.  Since you can't tell if it is really a pointer or just a
coincidence, you won't be able to move the object pointed to, because you
won't be able to change the pointer, in case the match was just a
coincidence.

Alternatively, instead of passing pointer as arguments, or storing them in
local variables, create a structure:

struct {
	long	flag1;
	Ptr	myPtr;
	long	flag2;
};
and initialize the flag longs to some distinctive pattern like 0xF0F0F0F0.
Where you would normally pass or store a pointer, pass one of these, and
always uses these structures for local variables.
Then you can search the stack for sequences of 96bytes that start and end
with four bytes of flag. You could still get coincidence matches, but now
the chances are much less likely, particularly if you clear the flag words
in the locals before you return, so they aren't just hanging around,
garbage on the stack.

If you want to garbage collect as an interrupt task (a complex business.)
don't forget the 68000 registers may point into your heap.

Remember, the Mac pushes mixes of shorts and longs on the stack, so you'll
need to consider each 16-bit chunk first as the high part of the 32-bit
word it starts, second as the low part of the 32-bit chunk it ends.


>how to determine when the end of the stack has been reached.  
At the beginning of the program do:
Ptr stackBase;
main(){
	int dontCare;
	stackBase = (Ptr) &dontCare;
}

This guarantees that the portion of the stack currently in use is between
&localVariable and stackBase. Remember that the stack grows toward low
memory.  stackBase is a bigger (unsigned) number that &localVariable.

Have you considered giving up and recasting your problem to use reference
counts?

I used to be a lisp programmer, and my C programs show that fact. I've
been programming the Mac for five years and written dozens of programs for
it, and I've never needed a garbage collector. The Mac memory manager has
always suited me just fine.  I did build a little package of debugging
routines that let me log each NewHandle with what NewHandle returned and
who called it, and let me delete from the log when DisposHandle() happens
and print a report of which routine is responsible for which undisposed
handle. By being careful to place and activate these routines, I catch my
storage leakage problems. If you deallocate all your temporaries, you
don't need a garbage collector, but much more important than that: Storage
leakage bugs are almost always symptoms of _other_ bugs. If a garbage
collector hid my storage leakage bugs, I'd produce lower quality code
because I would be less likely to find those other bugs.

> The mac is a detour in the inevitable march of mediocre computers.
> drs@bnlux0.bnl.gov (David R. Stampf)
--- David Phillip Oster          -master of the ad hoc odd hack. 
Arpa: oster@dewey.soe.berkeley.edu 
Uucp: {uwvax,decvax}!ucbvax!oster%dewey.soe.berkeley.edu 

ge@kunivv1.sci.kun.nl (Ge' Weijers) (12/04/89)

About your GC problem. There is still a way to do it if you want, although
Your C source code needs to be adapted for this method.
What you do is to allocate a root stack, that is used by the garbage collector
to find references on. If you call the garbage collector (or a routine that
might call the garbage collector) you have to save and restore root pointers
to this stack. An example:

any_routine(HeapPointer p1)
{
	HeapPointer p2 = .....; /* a C pointer into the GC heap */

	SavePointer(p1);
	SavePointer(p2);

	RoutineCallingGC();

	/* p1 and p2 might point to garbage. restore them */

	p2 = GetPointer();
	p1 = GetPointer();
}

This is not the ultimate in efficiency, but if you implement SavePointer and
GetPointer using macros it might have a reasonable performance. Also check
for stack overruns. It's the only way I know of to implement a compacting GC
for C. Sigh.

Ge' Weijers
Ge' Weijers                                    Internet/UUCP: ge@cs.kun.nl
Faculty of Mathematics and Computer Science,   (uunet.uu.net!cs.kun.nl!ge)
University of Nijmegen, Toernooiveld 1         
6525 ED Nijmegen, the Netherlands              tel. +3180612483 (UTC-2)