[comp.lang.c] Heap Fragmentation

ttwang@polyslo.CalPoly.EDU (Thomas Wang) (09/27/89)

Is heap fragmentation in C a problem or non-problem?

I ask because I am in the middle of writing a garbage collector for C++.
If heap fragmentation is not a problem, then I need not move heap objects
around to compact the heap.

Are there any estimates of the space wasted, or of the performance decrease,
due to heap fragmentation?

 -Thomas Wang ("This is a fantastic comedy that Ataru and his wife Lum, an
                invader from space, cause excitement involving their neighbors."
                  - from a badly translated Urusei Yatsura poster)

                                                     ttwang@polyslo.calpoly.edu

mikes@Apple.COM (Mike Shannon) (09/28/89)

In article <1989Sep27.045926.12634@polyslo.CalPoly.EDU> ttwang@polyslo.CalPoly.EDU (Thomas Wang) writes:
>Is heap fragmentation in C a problem or non-problem?

Memory management is not part of the C language; it is part of the library
support in the underlying system.  The last I heard, there is an effective
scheme for re-using de-allocated memory which uses hashing.  It's my
understanding that de-allocated memory chunks are never handed back to the
operating system, but are kept in an array of queues where the chunk size
doubles from one queue to the next.  So when you do a malloc(xxx), the first
queue whose blocks are at least xxx bytes long is searched, and a block is
broken in two and you get part of it.  This means you avoid the overhead of
a system call.
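
A minimal sketch of such a scheme (my own illustration of the array-of-queues
design described above, not any particular vendor's code; pow2_alloc(),
pow2_free(), and the bucket sizes are hypothetical):

	#include <stdlib.h>

	#define NBUCKETS 30

	struct chunk { struct chunk *next; };

	static struct chunk *bucket[NBUCKETS];	/* bucket[i] queues 2^i-byte chunks */

	static int class_of(size_t n)	/* smallest i such that 2^i >= n */
	{
		int i = 4;		/* 16-byte minimum chunk */

		while (((size_t)1 << i) < n)
			i++;
		return i;
	}

	void *pow2_alloc(size_t n)
	{
		int i = class_of(n), j;
		struct chunk *c;

		for (j = i; j < NBUCKETS && bucket[j] == NULL; j++)
			;
		if (j == NBUCKETS)		/* nothing cached is big enough */
			return malloc((size_t)1 << i);	/* stand-in for the sbrk() path */
		c = bucket[j];
		bucket[j] = c->next;
		while (j > i) {			/* break a block in two, keep one half */
			struct chunk *half;

			j--;
			half = (struct chunk *)((char *)c + ((size_t)1 << j));
			half->next = bucket[j];
			bucket[j] = half;
		}
		return c;
	}

	void pow2_free(void *p, size_t n)	/* chunks are queued, never handed back */
	{
		int i = class_of(n);
		struct chunk *c = p;

		c->next = bucket[i];
		bucket[i] = c;
	}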

	I personally have never had a problem with heap fragmentation under
UNIX.
	For performance reasons, if I am going to be allocating and
de-allocating memory of a single fixed size, I often maintain my own 'free'
list and do my own allocation, checking that free list before falling back
to malloc().
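
Here is a minimal sketch of that discipline (NODE and the function names are
hypothetical):

	#include <stdlib.h>

	typedef struct node {
		struct node *next;	/* links free nodes together */
		/* ... payload fields ... */
	} NODE;

	static NODE *freelist;

	NODE *node_alloc(void)
	{
		NODE *n = freelist;

		if (n != NULL) {	/* check the free list first */
			freelist = n->next;
			return n;
		}
		return (NODE *)malloc(sizeof(NODE));	/* fall back to malloc() */
	}

	void node_free(NODE *n)
	{
		n->next = freelist;	/* keep the node for reuse */
		freelist = n;
	}
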
	In summary, I would say heap fragmentation under UNIX is not a
problem.
-- 
			Michael Shannon {apple!mikes, mikes@apple.com}

gwyn@smoke.BRL.MIL (Doug Gwyn) (09/28/89)

In article <35076@apple.Apple.COM> mikes@Apple.COM (Mike Shannon) writes:
>In article <1989Sep27.045926.12634@polyslo.CalPoly.EDU> ttwang@polyslo.CalPoly.EDU (Thomas Wang) writes:
>>Is heap fragmentation in C a problem or non-problem?
>Memory management is not part of the C language, it is part of the library
>support in the underlying system.

The modern point of view is that there is no crisp distinction between
the language and the run-time support.  In a hosted implementation, the
forthcoming Standard requires that malloc()/realloc()/free() be supplied.
They could be implemented as language keywords (actually, as macros
defined in terms of reserved-name keywords) as well as library routines.

>It's my understanding that de-allocated memory chunks are never handed
>back to the operating system, ...

In a UNIX context, traditionally applications have mixed sbrk() calls
with (perhaps implicit) use of malloc(), so it would be unsafe for
a UNIX malloc()/free() implementation to shrink the heap in general.
In other contexts, for example in the Apple IIGS/Macintosh desktop
environment, it is desirable for memory no longer in use by active
processes to be immediately returned to the system resource pool.

>In summary, I would say heap fragmentation under UNIX is not a problem.

It can be, but unless you're pushing the limits on maximum virtual
data space size it would take a rather unusual memory allocator usage
pattern.

Garbage-collecting equivalents of malloc()/free() have been devised
from time to time.  Rob Pike's Blit terminal (and descendants)
firmware relies very heavily on one, and Rob's "sam" text editor
uses a similar scheme.  If memory is a critical resource (as it is
for the terminals), a garbage-collecting allocator may be worthwhile.

mike@thor.acc.stolaf.edu (Mike Haertel) (09/28/89)

In article <11161@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <35076@apple.Apple.COM> mikes@Apple.COM (Mike Shannon) writes:
>>It's my understanding that de-allocated memory chunks are never handed
>>back to the operating system, ...
>
>In a UNIX context, traditionally applications have mixed sbrk() calls
>with (perhaps implicit) use of malloc(), so it would be unsafe for
>a UNIX malloc()/free() implementation to shrink the heap in general.

Not true.  GNU malloc(), of which I am the author, can shrink the heap.
(This is a new and different malloc(), distinct from the ancient Caltech
malloc() we have previously distributed.)  Callers can freely mix their own
sbrk() calls with malloc().  The algorithm is:

	/* shrink only if nobody else has moved the break since malloc last did */
	if (sbrk(0) == mallocs_notion_of_top_of_heap)
		sbrk(negative_amount);

>>In summary, I would say heap fragmentation under UNIX is not a problem.
>
>It can be, but unless you're pushing the limits on maximum virtual
>data space size it would take a rather unusual memory allocator usage
>pattern.

If you want to be absolutely sure of avoiding heap fragmentation, don't
trust any allocator (even mine :-); instead, write a dedicated allocator as
a front end to malloc() that malloc()s huge blocks, parcels them out as
needed, and later free()s them in a batch when appropriate.  The GNU
obstack macros implement one such allocation discipline; a sketch follows.
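
A minimal sketch of such a front end (my own illustration, not the obstack
macros themselves; the names and the 64K chunk size are arbitrary):

	#include <stdlib.h>

	#define ARENA_CHUNK 65536

	struct arena {
		struct arena *prev;	/* chain of huge blocks, freed in a batch */
		size_t used;
		char mem[ARENA_CHUNK];
	};

	static struct arena *top;

	void *arena_alloc(size_t n)
	{
		n = (n + 7) & ~(size_t)7;	/* keep parcels aligned */
		if (n > ARENA_CHUNK)
			return NULL;		/* oversize requests not handled here */
		if (top == NULL || top->used + n > ARENA_CHUNK) {
			struct arena *a = (struct arena *)malloc(sizeof *a);

			if (a == NULL)
				return NULL;
			a->prev = top;		/* grab another huge block */
			a->used = 0;
			top = a;
		}
		top->used += n;
		return top->mem + top->used - n;
	}

	void arena_free_all(void)	/* free() everything in one batch */
	{
		while (top != NULL) {
			struct arena *prev = top->prev;

			free(top);
			top = prev;
		}
	}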

>Garbage-collecting equivalents of malloc()/free() have been devised
>from time to time.  Rob Pike's Blit terminal (and descendants)
>firmware relies very heavily on one, and Rob's "sam" text editor
>uses a similar scheme.  If memory is a critical resource (as it is
>for the terminals), a garbage-collecting allocator may be worthwhile.

Yes.  If you know the types and roots of all your data structures,
this is straightforward.  Even if you don't, you can write a
(nonportable) "conservative" garbage collector that scans all
variables, heap, stack, and registers looking for possible references.
Empirically, such conservative collectors have been observed to
do a reasonably good job of reclaiming dead storage.
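
The heart of the conservative scan might look something like this (a rough,
and deliberately nonportable, sketch; heap_lo, heap_hi, and mark_block() are
hypothetical):

	extern char *heap_lo, *heap_hi;		/* bounds of the malloc arena */
	extern void mark_block(char *p);	/* mark the block containing p, if any */

	void scan_region(char **lo, char **hi)
	{
		char **p;

		for (p = lo; p < hi; p++) {
			/* Any word whose value falls inside the heap is
			   conservatively assumed to be a live reference. */
			if (*p >= heap_lo && *p < heap_hi)
				mark_block(*p);
		}
	}
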
-- 
Mike Haertel <mike@stolaf.edu>
``There's nothing remarkable about it.  All one has to do is hit the right
  keys at the right time and the instrument plays itself.'' -- J. S. Bach

gwyn@smoke.BRL.MIL (Doug Gwyn) (09/30/89)

In article <6643@thor.acc.stolaf.edu> mike@thor.stolaf.edu () writes:
>In article <11161@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>>... it would be unsafe for
>>a UNIX malloc()/free() implementation to shrink the heap in general.
>Not true.  GNU malloc(), of which I am the author, can shrink the heap.
>...  The algorithm is:
>	if (sbrk(0) == mallocs_notion_of_top_of_heap)
>		sbrk(negative_amount);

Well, I did say it's not safe to shrink it in general, not "in general,
it's not safe to shrink it".  In specific cases it can be safe.  Note,
however, that it's tricky to get right.  Consider the scenario:
	malloc() => sbrk(+)
	sbrk(+)
	malloc() => sbrk(+)
	free() => sbrk(-) by above algorithm
At this point malloc()/free() need to be able to determine where the
last malloc() arena heap-top is, to know that it's not at the current
sbrk(0).  You may have taken care of this extra bookkeeping requirement,
but it's easy to get wrong.  A couple of years ago I put quite a bit of
time into fixing all the bugs in the SVR2 malloc() (both of them);
I think it did manage to stay synchronized with sbrk(), although it
never tried to shrink the program break.
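
One way to keep that bookkeeping straight (a minimal sketch of my own, not
GNU malloc's actual code; heap_top and the wrapper names are hypothetical)
is to route every break adjustment malloc makes through helpers that
remember where malloc last left the break:

	#include <unistd.h>

	static char *heap_top;		/* the break as malloc last left it */

	static void *arena_grow(long n)
	{
		void *p = sbrk(n);

		if (p != (void *)-1)
			heap_top = (char *)p + n;	/* remember our own break */
		return p;
	}

	static void arena_shrink(long n)
	{
		/* Shrink only if no one has moved the break behind our back:
		   the current break must be exactly where malloc left it. */
		if (sbrk(0) == (void *)heap_top && sbrk(-n) != (void *)-1)
			heap_top -= n;
	}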

tps@chem.ucsd.edu (Tom Stockfisch) (09/30/89)

In article <11161@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:

>>In article <1989Sep27.045926.12634@polyslo.CalPoly.EDU> ttwang@polyslo.CalPoly.EDU (Thomas Wang) writes:
>>>Is heap fragmentation in C a problem or non-problem?

>>In summary, I would say heap fragmentation under UNIX is not a problem.

>It can be, but unless you're pushing the limits on maximum virtual
>data space size it would take a rather unusual memory allocator usage
>pattern.

I would disagree, at least with the Berkeley malloc package.  It can easily
waste a factor of 4 or 5.  The following excerpt is certainly not
unusual, yet malloc gets 45984 bytes from the operating system to
satisfy an 8192-byte need:


	#include <stdlib.h>

	int main(void)
	{
		int	size =	1;
		char	*p =	malloc(size);

		while (size <= 8192)
		{
			p =	realloc( p, size );
			size *=	2;
		}
		return 0;
	}

I find that my programs that need a significant amount of memory
generally consume 5 to 8 times as much memory as they
theoretically need.

The Berkeley malloc has, essentially, a free list for each request size,
after adding 8 bytes of overhead and rounding up to the next power of
two, with an 8-byte minimum.
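
The rounding rule, as I understand it, amounts to this (a sketch;
bucket_size() is a name I made up):

	#include <stddef.h>

	size_t bucket_size(size_t request)	/* bytes actually consumed */
	{
		size_t b = 8;			/* 8-byte minimum */

		while (b < request + 8)		/* 8 bytes of overhead, then */
			b *= 2;			/* round up to a power of two */
		return b;
	}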

The "worst case" behavior for this,
assuming an actual need of MAX, requires roughly

        MAX*(log (MAX) - 2)
                2

For instance, a program that needed only 8 meg might have malloc
asking for 320 meg from the operating system!!
-- 

|| Tom Stockfisch, UCSD Chemistry	tps@chem.ucsd.edu