[net.lang.c] Alignment of malloc return value

sam@delftcc.UUCP (Sam Kendall) (04/29/86)

In article <219@cad.UUCP>, faustus@cad.UUCP writes:
> In article <141@delftcc.UUCP>, sam@delftcc.UUCP (Sam Kendall) writes:
> > Currently, `malloc(1)' must return a maximally
> > aligned pointer!  This prevents an implementation which does something
> > space-efficient with small allocations.
> 
> "Maximally aligned" generally means aligned to a multiple of the wordsize of
> the machine, so you don't lose much space (malloc usually allocates an
> extra word before the space for a pointer anyway...)  I guess that
> you wouldn't lose much if you allowed malloc(1) to return an unaligned
> byte, but it isn't worth the trouble...

Doug Gwyn <gwyn@brl.arpa> responded similarly to faustus@cad.UUCP, so
I'll elaborate on what I meant.

A common allocation algorithm, not used in standard UNIX malloc(3) but
perhaps available in SVR2's malloc(3X) or 4.2 BSD's malloc(3), is to
preallocate a contiguous set of pages for each blocksize below some
small threshold.  Thus one set of pages would be allocated for 1-byte
blocks, another set of pages for 2-byte blocks, and so on.  (Above the
threshold, some other algorithm is used.)  You can tell by the address of
an area how large it is, simply by checking which set of pages it is
in.  You can keep track of which blocks are allocated by using a bitmap.
If you are allocating many small blocks, this algorithm surely is "worth
the trouble", since it is extremely fast and wastes no space.  It is
particularly good on a paging OS.
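
To make the scheme concrete, here is a minimal sketch in C.  It is not
taken from any real malloc; the names small_alloc, small_size and
small_free, the threshold MAX_SMALL and the pool size POOL_BYTES are all
hypothetical, and a static array stands in for pages obtained from the
system.  The free-block search is a plain linear scan of the bitmap,
which a serious implementation would replace with something faster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameters, chosen only for illustration. */
#define MAX_SMALL   8       /* sizes 1..MAX_SMALL are served from pools */
#define POOL_BYTES  4096    /* bytes set aside for each block size      */

/* One contiguous arena: the pool for size s starts at (s-1)*POOL_BYTES. */
static unsigned char arena[MAX_SMALL * POOL_BYTES];

/* One "allocated" bit per block, sized for the worst case (1-byte blocks). */
static unsigned char bitmap[MAX_SMALL][POOL_BYTES / 8];

/* Return a block of exactly `size` bytes, or NULL if the request is not
 * small or the pool is full (the caller would then use another algorithm). */
void *small_alloc(size_t size)
{
    if (size == 0 || size > MAX_SMALL)
        return NULL;
    size_t nblocks = POOL_BYTES / size;
    unsigned char *bits = bitmap[size - 1];
    for (size_t i = 0; i < nblocks; i++) {
        if (!(bits[i / 8] & (1u << (i % 8)))) {
            bits[i / 8] |= (unsigned char)(1u << (i % 8));
            return arena + (size - 1) * POOL_BYTES + i * size;
        }
    }
    return NULL;
}

/* Recover a block's size from its address alone; 0 if not from the arena. */
size_t small_size(const void *p)
{
    uintptr_t a = (uintptr_t)p, base = (uintptr_t)arena;
    if (a < base || a >= base + sizeof arena)
        return 0;
    return (size_t)(a - base) / POOL_BYTES + 1;
}

/* Free a block: no header is consulted, only the address and the bitmap. */
void small_free(void *p)
{
    size_t size = small_size(p);
    assert(size != 0);                       /* must come from the arena */
    size_t off = (size_t)((uintptr_t)p - (uintptr_t)arena) % POOL_BYTES;
    assert(off % size == 0);                 /* must be a block boundary */
    size_t i = off / size;
    assert(bitmap[size - 1][i / 8] & (1u << (i % 8)));  /* must be in use */
    bitmap[size - 1][i / 8] &= (unsigned char)~(1u << (i % 8));
}
```

Two successive calls to small_alloc(1) return adjacent bytes, so at
least one of them has an odd address.  That is exactly the behavior the
current man page wording rules out.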

The current man pages for malloc(3) and malloc(3X) prohibit such an
algorithm, because they say that

        Each of the allocation routines returns a pointer to space
        suitably aligned (after possible pointer coercion) for storage
        of any type of object.

Read literally, "any type of object" includes even one larger than the
allocated space.  V7 malloc satisfies this, but the algorithm described
above does not.  I'm just
suggesting that the man page be corrected to say, "...  small enough to
fit in the allocated space", or something like that.  I wasn't very
clear about this suggestion in my previous posting; thanks to alice!ark
(Andrew Koenig) for clarifying this.
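
One way to see what the relaxed wording would demand: in C, an object's
size is a multiple of its alignment, so an object that fits in n bytes
cannot need alignment stricter than n.  The sketch below computes that
bound.  The function name align_needed is hypothetical, and the code
assumes a C11 compiler (for max_align_t and _Alignof) and that
alignments are powers of two; it considers only fundamental alignments.

```c
#include <assert.h>
#include <stddef.h>

/* Strictest alignment that any object of at most `size` bytes can need:
 * the largest power of two that is <= size, capped at the platform's
 * maximum fundamental alignment.  Returns 1 for size 0 or 1. */
size_t align_needed(size_t size)
{
    size_t cap = _Alignof(max_align_t);
    assert((cap & (cap - 1)) == 0);     /* assumed to be a power of two */
    size_t a = 1;
    while (a * 2 <= size && a * 2 <= cap)
        a *= 2;
    return a;
}
```

By this reckoning align_needed(1) is 1, so malloc(1) could return any
address at all, while blocks of 2 or 3 bytes would still need even
addresses.  Large requests come out at the cap, which is what the
current wording demands of every request.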

There is a more general point lurking here: a man page should describe a
function's _interface_.  The man page may note additional constraints
imposed by the current _implementation_, but notes on the implementation
should be separated from the description of the interface.  The C
language sometimes encourages people to confuse the abstraction
(interface) with the implementation, and I think I see this confusion
reflected in the way UNIX is documented.  AT&T has taken some steps to
correct this confusion, probably as a side effect of writing the System
V Interface Definition.  Berkeley, on the other hand, gets worse with
each release.

So my complaint about malloc's man pages on various UNIX systems is
really that they describe the current implementations too specifically
(and perhaps incorrectly).  The DESCRIPTION should say less, leaving
room for other implementations.  Notes about the current implementation
should be in a separate NOTES section of the man page.

----
Sam Kendall			{ ihnp4 | seismo!cmcl2 }!delftcc!sam
Delft Consulting Corp.		ARPA: delftcc!sam@NYU.ARPA