[comp.lang.c] malloc impossible?

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/12/89)

In article <15406@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>Actually, I was thinking about machines in which different types
>of objects must come from different address spaces (e.g., all
>`pointer' objects should be in the range 0x70000000..0x7c000000).

I still don't understand your example.  If C is to be implemented
on any architecture, the implementor has to arrange for pointers
to data objects to have the right properties, independently of
whether malloc looks something like:
	#define POOLMAX 10000
	static char pool[POOLMAX];
	static char *next = pool;
	void *malloc(size_t n) {
		register char *p;
		if (next - pool > POOLMAX - n)
			return NULL;
		p = next;
		next += n;	/* assume byte alignment is ok */
		return (void *)p;
	}
(Note -- I realize this is not a GOOD way to implement malloc, but
it is a POSSIBLE way, suitable for purposes of this discussion.
Naturally, one should implement malloc() better than this if at all
possible.)

Such an implementation of malloc() must work if the C language works.
That's why I claim that malloc() is implementable on all architectures
for which an implementation of C is feasible.

chris@mimsy.UUCP (Chris Torek) (01/12/89)

>In article <15406@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>Actually, I was thinking about machines in which different types
>>of objects must come from different address spaces (e.g., all
>>`pointer' objects should be in the range 0x70000000..0x7c000000).

In article <9342@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
[malloc example deleted]
>Such an implementation of malloc() must work if the C language works.

Sure---by definition of `the C language'.

>That's why I claim that malloc() is implementable on all architectures
>for which an implementation of C is feasible.

But I claim that, if the requirement for malloc() were dropped, one
could implement C-minus-malloc on a machine that (say) required all
floating point numbers to live in the floating-point address space,
by having two stacks (one in the integer space and one in the fp space)
and two data segments (likewise).  The point is that it also needs
two heaps, and the single argument one provides to malloc() does not
provide the information necessary to decide which heap to use:

	void *
	weird_machine_malloc(size_t size, enum Space_selector space) {
		static char intspacepool[INTSIZE];
		static float floatspacepool[FLOATSIZE];
		static size_t intfree = INTSIZE;
		static size_t floatfree = FLOATSIZE;

		if (space == Space_float) {
			if (size % (sizeof(float)-1) || size > floatfree)
				return NULL;	/* bad size */
			... allocate from floatspacepool ...
		} else {
			... allocate from intspacepool ...
		}
	}

One can imagine a machine with different spaces for each different
intrinstic data type, where malloc() has to select from one of 8
or 16 possibilities.  Of course, since malloc() gets only one argument,
it has no way to decide which address space to use.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/13/89)

In article <15427@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>But I claim that, if the requirement for malloc() were dropped, one
>could implement C-minus-malloc on a machine that (say) required all
>floating point numbers to live in the floating-point address space,
>by having two stacks (one in the integer space and one in the fp space)
>and two data segments (likewise).

Assuming that integers could not also live in floating-point space,
C would not be implementable on such a hypothetical architecture.
Consider
	union	{
		double	d;
		int	i;
	}	u;
	u.d = 123.0;
	printf("%d\n", u.i);
This is required to work, although the specific value printed of course
depends on details of numeric representation on the specific system.
All data object types in C must be able to live in the same kind of space.
Whatever kind of space that is, is what one would parcel out via malloc().
Therefore malloc() is implementable if C is (at the very least, along the
lines I indicated in my previous example implementation).

jas@ernie.Berkeley.EDU (Jim Shankland) (01/13/89)

In article <9351@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>Assuming that integers could not also live in floating-point space,
>C would not be implementable on such a hypothetical architecture.
>Consider
>	union	{
>		double	d;
>		int	i;
>	}	u;
>	u.d = 123.0;
>	printf("%d\n", u.i);
>This is required to work, although the specific value printed of course
>depends on details of numeric representation on the specific system.
>All data object types in C must be able to live in the same kind of space.

Well, this is getting pretty esoteric, I suppose, but if I were implementing
(C - malloc) on such a machine, couldn't I reserve space for this union
in both fp space and integer space?  The value ouput by the printf above
would, of course, be entirely unaffected by the preceding assignment; but
as you say, the value printed is system-specific, anyway.  In other words,
are unions *required* to overlay their members onto the same physical
storage?

Jim Shankland
jas@ernie.berkeley.edu

"I've been walking in a river all my life, and now my feet are wet"

dik@cwi.nl (Dik T. Winter) (01/13/89)

Considering a machine with separate int and float space:

Doug Gwyn (VLD/VMB) <gwyn> writes:
 >	union	{
 >		double	d;
 >		int	i;
 >	}	u;
 >	u.d = 123.0;
 >	printf("%d\n", u.i);
 >This is required to work, although the specific value printed of course
 >depends on details of numeric representation on the specific system.
 >All data object types in C must be able to live in the same kind of space.
 > 
Jim Shankland answers:
 > Well, this is getting pretty esoteric, I suppose, but if I were implementing
 > (C - malloc) on such a machine, couldn't I reserve space for this union
 > in both fp space and integer space?  The value ouput by the printf above
 > would, of course, be entirely unaffected by the preceding assignment; but
 > as you say, the value printed is system-specific, anyway.  In other words,
 > are unions *required* to overlay their members onto the same physical
 > storage?
 > 
Still more esoteric:

More is required depending on ANSI C requirements.  For instance should
	*((double *) (&u))
be identical to u.d?
And what about
	*((double *) (int *) (&u))
I think the latter is not required; but if it is the system will have
problems.  The system will also have problems if you cast through void*;
which is required to work I think.

An easy solution is to allocate space (through malloc or through local
variables) in both fp and integer space, and take the proper space
depending on context.  But this implies that malloc is implementable
(at a cost).
-- 
dik t. winter, cwi, amsterdam, nederland
INTERNET   : dik@cwi.nl
BITNET/EARN: dik@mcvax

chris@mimsy.UUCP (Chris Torek) (01/13/89)

>In article <15427@mimsy.UUCP> I wrote:
>>... I claim that, if the requirement for malloc() were dropped, one
>>could implement C-minus-malloc on [more kinds of machines than now possible].

In article <9351@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
>Assuming that integers could not also live in floating-point space,
>C would not be implementable on such a hypothetical architecture.
>Consider
>	union	{
>		double	d;
>		int	i;
>	}	u;
>	u.d = 123.0;
>	printf("%d\n", u.i);
>This is required to work, although the specific value printed of course
>depends on details of numeric representation on the specific system.

And the code generated would be `allocate one object in each space,
the double from F and the int from I' (that is, float and int spaces)
rather than `allocate one object'.  References to the union would
write both spaces; pointers to unions would be represented as pairs.
If we had a pointer to that union, the code generated for

	p->i = 123;

might be something like (in vaxish assembly code)

	movq	4(ifp),r0r1	| pick up both halves of `p'
		| r0 holds &p->i; r1 holds &p->d
	movl	$123,(r0:i)	| write 123 in integer region
	movd	$0d123.0,(r1:f)	| write 123.0 in float region

(The value written to the float region can almost be arbitrary, but
it seems `cute' to assume that a compiler for this peculiar machine
should convert the value being stored so as to wind up with the same
value in each field.  It makes the machine appear to use the same
bit patterns for integers and floating point...!  [Well, almost.])

Note that this implies that `void *' (and therefore `char *')
pointers would have to be 8 bytes wide, since union pointers
are actually pairs.

>All data object types in C must be able to live in the same kind of space.

Not in C-minus-malloc (and perhaps a few other functions I may have
forgotten).  I am fairly sure that the `core' part of the language
can get away without this.  varargs/stdarg does look hard.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

english@panarea.usc.edu (Joe English) (01/13/89)

In article <15427@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>In article <15406@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>>Actually, I was thinking about machines in which different types
>>>of objects must come from different address spaces (e.g., all
>>>`pointer' objects should be in the range 0x70000000..0x7c000000).
>
>In article <9342@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>>Such an implementation of malloc() must work if the C language works.
>>That's why I claim that malloc() is implementable on all architectures
>>for which an implementation of C is feasible.
>
>But I claim that, if the requirement for malloc() were dropped, one
>could implement C-minus-malloc on a machine that (say) required all
>floating point numbers to live in the floating-point address space,
>by having two stacks (one in the integer space and one in the fp space)
>and two data segments (likewise).  [....]

Then how would you do this:

    struct impossible {
        int   i;
        float f;
        double *dp;
    } imp;

Even if the compiler were to allocate the three components of imp in
the approprate address spaces, what is &imp?  Does it point to imp.i,
imp.f, or imp.dp? Or a vector of pointers (in the pointer address
space, of course) which point to imp's components?  

Also, from H&S (1984) section 5.7.3, "C compilers are constrained to
assign components increasing memory addresses in a strict
left-to-right order, with the first component starting at the
beginning address of the structure itself."  (What's the dpANS say on
this?)  This constraint would have to go.  Even so, that leaves a *lot*
of padding between different-typed fields :-)

It looks like C would be at the very least difficult, if not outright
impossible, to implement on this machine with or without malloc().
The "struct" semantics (and probably many others) would have to change
radically.  "C-minus-malloc" wouldn't be much good unless it also
provided a "new" operator -- I don't know about you, but *I* don't
want to have to write my own malloc() routine along with every
struct!


      /|/| Joe English
-----< | |
  O   \|\| english%lipari@oberon.usc.edu

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/13/89)

In article <27499@ucbvax.BERKELEY.EDU> jas@ernie.Berkeley.EDU (Jim Shankland) writes:
>are unions *required* to overlay their members onto the same physical
>storage?

In effect, yes.

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/13/89)

In article <15445@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>And the code generated would be `allocate one object in each space,
>the double from F and the int from I' (that is, float and int spaces)
>rather than `allocate one object'.  References to the union would
>write both spaces; pointers to unions would be represented as pairs.

Then, due to other constraints on copyability of entire objects
via punned pointers etc., you would have to "shadow" the whole
dual-data space (except where a really clever compiler could
determine that it could get away with not doing so).  malloc()
would still be implementable.

Fortunately I don't know of any architectures quite like that.
Tagged architectures come close, but methods of faking them out
have been developed.

friedl@vsi.COM (Stephen J. Friedl) (01/13/89)

> One can imagine a machine with different spaces for each different
> intrinstic data type, where malloc() has to select from one of 8
> or 16 possibilities.  Of course, since malloc() gets only one argument,
> it has no way to decide which address space to use.

Wow. Wouldn't structs be *really* interesting on a machine like this?

     Steve

-- 
Stephen J. Friedl        3B2-kind-of-guy            friedl@vsi.com
V-Systems, Inc.        I speak for me only      attmail!vsi!friedl
Santa Ana, CA  USA       +1 714 545 6442    {backbones}!vsi!friedl
---------Nancy Reagan on Hawaiian musicians: "Just say Ho"--------

chris@mimsy.UUCP (Chris Torek) (01/13/89)

[same old perverse machine with integer and fp spaces]

In article <14658@oberon.USC.EDU> english@panarea.usc.edu (Joe English) writes:
>Then how would you do this:
>
>    struct impossible {
>        int   i;
>        float f;
>        double *dp;
>    } imp;
>
>Even if the compiler were to allocate the three components of imp in
>the approprate address spaces, what is &imp?  Does it point to imp.i,
>imp.f, or imp.dp? Or a vector of pointers (in the pointer address
>space, of course) which point to imp's components?  

It would have to point to a vector of pointers; the actual contents
of the structure would really be

	int	*i;
	float	*f;
	double	*dp;

(it is only necessary to add a reference level to simple objects).

Grotesque?  You bet!

>... "C-minus-malloc" wouldn't be much good unless it also provided a
>"new" operator -- I don't know about you, but *I* don't want to have to
>write my own malloc() routine along with every struct!

I agree.  But I think that a standards committee might shy away
from adding malloc to a C-like language that did not already have
malloc (in fact, they would almost certainly add a `new' instead).

Standards committtees tend to be overly conservative.  (Although they
can be radical sometimes, as witness `noalias'.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

tainter@ihlpb.ATT.COM (Tainter) (01/15/89)

In article <9351@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>Assuming that integers could not also live in floating-point space,
>C would not be implementable on such a hypothetical architecture.

Sure it can.  Just not efficiently.

>Consider
>	union	{
>		double	d;
>		int	i;
>	}	u;
>	u.d = 123.0;
>	printf("%d\n", u.i);
>This is required to work, although the specific value printed of course
>depends on details of numeric representation on the specific system.
>All data object types in C must be able to live in the same kind of space.
>Whatever kind of space that is, is what one would parcel out via malloc().
>Therefore malloc() is implementable if C is (at the very least, along the
>lines I indicated in my previous example implementation).

This isn't even required to print anything useful.

To support this stuff you just give everything an indirection.

I haven't thought it completely through but I think this can be fleshed out
and even support malloc().


--johnathan.a.tainter

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/16/89)

In article <1010@vsi.COM> friedl@vsi.COM (Stephen J. Friedl) writes:
>> One can imagine a machine with different spaces for each different
>> intrinstic data type, where malloc() has to select from one of 8
>> or 16 possibilities.  Of course, since malloc() gets only one argument,
>> it has no way to decide which address space to use.
>Wow. Wouldn't structs be *really* interesting on a machine like this?

Don't worry, unless all those spaces are "shadowed", C cannot be
implemented on it.

X3J11 was very careful to ensure that memcpy() could be implemented
portably in ANSI C in the obvious way.

The talk about different kinds of data spaces is thus a red herring.

john@frog.UUCP (John Woods) (01/18/89)

In article <9384@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
I> In article <1010@vsi.COM> friedl@vsi.COM (Stephen J. Friedl) writes:
n> >> One can imagine a machine with different spaces for each different
t> >> intrinstic data type, where malloc() has to select from one of 8
e> >> or 16 possibilities.  Of course, since malloc() gets only one argument,
r> >> it has no way to decide which address space to use.
p> >Wow. Wouldn't structs be *really* interesting on a machine like this?
r> 
e> Don't worry, unless all those spaces are "shadowed", C cannot be
t> implemented on it.

Actually, the real solution to such a problem is to write a PDP-11 simulator
in the native machine code, and use a PDP-11 compiler for the newly-produced
virtual machine.  The ANSI spec doesn't DEMAND that floating-point operations
use any available floating point hardware, does it?

If an architecture is going to be that bizarre, C is not necessarily going to
be the language of choice for it.
-- 
John Woods, Charles River Data Systems, Framingham MA, (508) 626-1101
...!decvax!frog!john, john@frog.UUCP, ...!mit-eddie!jfw, jfw@eddie.mit.edu

Go be a `traves wasswort.		- Doug Gwyn

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/19/89)

In article <1363@X.UUCP> john@frog.UUCP (John Woods) writes:
>The ANSI spec doesn't DEMAND that floating-point operations
>use any available floating point hardware, does it?

No, but I bet the customers do.

mouse@mcgill-vision.UUCP (der Mouse) (01/28/89)

In article <9351@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
> In article <15427@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>> But I claim that, if the requirement for malloc() were dropped, one
>> could implement C-minus-malloc on a machine that (say) [had distinct
>> integer and float data and stack spaces]
> Assuming that integers could not also live in floating-point space, C
> would not be implementable on such a hypothetical architecture.
> Consider:   union { double d; int i; } u.d=123; printf("%d\n",u.i);
[example compressed -dM]
> This is required to work, although the specific value printed of
> course depends on details of numeric representation [...].

As I recall hashing out in email with someone-or-other a while ago,
there appears to be no requirement that u.d and u.i share any storage
at all.  That is, the printed value need not bear any relation at all
to the bit pattern used to represent the double-floating value 123.

Is it really required to even print anything?  K&RV2 says the results
"are implementation-dependent if something is stored as one type and
extracted as another".  Does the Proposal say anything essentially
different?  Sounds as though "Segmentation fault - core dumped" is a
technically legal response.  (More likely, something like "CChecker:
warning: wrong type retrieved from union at line 5" :-)

Someone else mentioned memcpy() and related routines as being another
reason C wouldn't be implementable on such an architecture.  Is this
really true?  The arguments will be cast to void *, but I can't find
any requirement that the resulting pointers can be (re-cast and)
dereferenced meaningfully as any type but the original type.  Indeed, I
find no requirement that the following print 0 rather than 1:

main() { int i; float f; printf("%d\n",((void *)&i)==((void *)&f)); }

(My reference is K&RV2; until machine-readable copies of the draft or
proposal or standard or whateveritisthisweek become available, it will
remain so.  I mean, really, presumably this standard is supposed to be
used, so what do they do but refuse to use the best available means of
inews: flames inappropriate for this group - posting truncated

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/28/89)

In article <1404@mcgill-vision.UUCP> mouse@mcgill-vision.UUCP (der Mouse) writes:
>Indeed, I find no requirement that ...

Well, naturally if you don't LOOK in the proposed Standard you're
not going to find what it requires!