[comp.lang.c] structure element offsets

bader@spice.cs.cmu.edu (Miles Bader) (11/24/86)

Is there any way of finding the offset of a structure element from
the beginning of a structure in a portable AND efficient way?  I have
a structure that looks like this:

struct fentry{
    FPOOL   *pool;
    union{
	FENTRY	*next;
	char    buf[1]; /* extended as needed ... */
    }	    data;
};

And I want to find the base of the structure given the address of buf.
My first attempt went like:

static struct fentry calc;
#define FALLOC_OFFSET    ((long)calc.data.buf-(long)&calc)

This worked, and with pcc on a Sun, compiled into the correct
constant, 4.  However on an IBM RT (under both pcc and another
non-pcc compiler), it did the calculation at run time (despite the
fact that the two quantities being subtracted were the same address
with offsets differing by 4).

I have given up and defined this as the constant 4 (which it is on
any machine around here), but am just curious if there's a better
way.

				    -Miles

hahn@fred (Jonathan Hahn) (11/26/86)

In article <1096@spice.cs.cmu.edu> bader@spice.cs.cmu.edu (Miles Bader) writes:
>Is there any way of finding the offset of a structure element from
>the beginning of a structure in a portable AND efficient way?

Try:

#define OFFSET(elem, type)	(&(((type *)0)->elem))

This utilizes a pointer of address 0, for which the address of the
element reference yields the offset of the element.

-jonathan hahn
hahn@ames-nas.arpa

stuart@bms-at.UUCP (Stuart D. Gathman) (11/26/86)

In article <1096@spice.cs.cmu.edu>, bader@spice.cs.cmu.edu (Miles Bader) writes:
> Is there any way of finding the offset of a structure element from
> the beginning of a structure in a portable AND efficient way?  I have

--------spos.h---------------------------
/*
  structure info macros
  spos	- member offset
  sposa	- array member offset (since &array often illegal)
  slen	- member size
  smp	- convert pointer to member to pointer to enclosing structure
  smpa	- smp for array member

  NOTE - these macros assume sizeof(char) == 1.  If the nerds win and
  we have to do it, this file needs to be fixed.

  This file can also be tweaked for stupid compilers that like to evaluate
  constants at run time.
*/
#define spos(s,m)	((char *)&((struct s *)0)->m - (char *)0)
#define sposa(s,m)	((char *)((struct s *)0)->m - (char *)0)
#define slen(s,m)	(sizeof((struct s *)0)->m)
#define smp(s,m,p)	((struct s *)(p-&((struct s *)0)->m))
#define smpa(s,m,p)	((struct s *)(p-((struct s *)0)->m))
-- 
Stuart D. Gathman	<..!seismo!dgis!bms-at!stuart>

daveb@rtech.UUCP (11/26/86)

In article <768@nike.UUCP> hahn@ames-nas.arpa (Jonathan Hahn) writes:
>In article <1096@spice.cs.cmu.edu> bader@spice.cs.cmu.edu (Miles Bader) asks:
>>Is there any way of finding the offset of a structure element from
>>the beginning of a structure in a portable AND efficient way?
>
>Try:
>
>#define OFFSET(elem, type)	(&(((type *)0)->elem))
>
>This utilizes a pointer of address 0, for which the address of the
>element reference yields the offset of the element.

Efficient, yes.  Portable, no. There are numerous compilers that choke
on these expressions.  One can argue that those aren't "real C
compilers", but the real question is, "Do you really need to do this?" 
Sometimes compiler holes are God's way of saying you're doing something
inadvisable.  You will end up with tricky, obfuscated, type-incorrect code,
with gobs of casting back and forth with ptrs to manipulate the
elements.  Having dealt with this sort of code, I suggest you seek
another approach to the problem -- "that way lies madness".

-dB
-- 
{amdahl, sun, mtxinu, cbosgd}!rtech!daveb

rbutterworth@watmath.UUCP (11/27/86)

In article <768@nike.UUCP>, hahn@fred (Jonathan Hahn) writes:
> > Is there any way of finding the offset of a structure element from
> > the beginning of a structure in a portable AND efficient way?
> Try:
> #define OFFSET(elem, type)    (&(((type *)0)->elem))
> This utilizes a pointer of address 0, for which the address of the
> element reference yields the offset of the element.

typedef struct { int f1; int f2; } Str;

I tried OFFSET(Str,f2) on my machine and got 262,144 (=01000000 =2^18).
That's a pretty big offset considering it only has to pass
over one int.
I won't mention what Lint had to say about it.

This macro produces a POINTER.
I believe that what was requested was an INTEGER byte offset.

What you have to do is take the pointer produced above,
cast it into some generic type pointer (e.g. (char*)),
and subtract from it another generic pointer to the beginning
of the structure.  The difference of two pointers of the same
type is then an integral type, and in this case produces a
somewhat more realistic value of 4.

  (   ((char*)(&(((Str*)0)->f2)))
    - ((char*)((Str*)0))
  )

dave@viper.UUCP (David Messer) (11/29/86)

In article <3622@watmath.UUCP> rbutterworth@watmath.UUCP (Ray Butterworth) writes:
 >In article <768@nike.UUCP>, hahn@fred (Jonathan Hahn) writes:
 >> > Is there any way of finding the offset of a structure element from
 >> > the beginning of a structure in a portable AND efficient way?
 >> Try:
 >> #define OFFSET(elem, type)    (&(((type *)0)->elem))
 >> This utilizes a pointer of address 0, for which the address of the
 >> element reference yields the offset of the element.
 >
 >typedef struct { int f1; int f2; } Str;
 >
 >I tried OFFSET(Str,f2) on my machine and got 262,144 (=01000000 =2^18).
 >That's a pretty big offset considering it only has to pass
 >over one int.
 >I won't mention what Lint had to say about it.
 
It could be because you used OFFSET(Str,f2) instead of the correct
OFFSET(f2,Str).  You got the parameters reversed.

A simpler definition of the OFFSET macro is the following:

	#define OFFSET(mos)  ((long)(&(((char *)0)->mos)))

This will produce a proper offset on almost all machines.  (But
not all, some machines have different formats to pointers to
different types.  Also, this macro assumes that (long)((char *)0) == 0L.)
-- 
Disclaimer:                       | David Messer 
I'm always right and I never lie. | Software Consultant 
My company knows this and agrees  | UUCP:  ihnp4!quest!viper!dave 
with everything I say.            |        ihnp4!meccts!viper!dave

throopw@dg_rtp.UUCP (Wayne Throop) (12/01/86)

> stuart@bms-at.UUCP (Stuart D. Gathman)
>> bader@spice.cs.cmu.edu (Miles Bader)

>> Is there any way of finding the offset of a structure element from
>> the beginning of a structure in a portable AND efficient way?
> #define spos(s,m)	((char *)&((struct s *)0)->m - (char *)0)
> #define sposa(s,m)	((char *)((struct s *)0)->m - (char *)0)
> #define slen(s,m)	(sizeof((struct s *)0)->m)
> #define smp(s,m,p)	((struct s *)(p-&((struct s *)0)->m))
> #define smpa(s,m,p)	((struct s *)(p-((struct s *)0)->m))

Stuart's solutions perhaps come closest to satisfying the original
question, but a couple of points remain.  First, there is no guarantee
that pointer arithmetic or offset calculations will work for the null
pointer.  Second, the notion of "offset" is ill defined in the original
question.  Stuart's solution provides the offset in char-sized units,
and this is probably what Miles meant, but it is well to remember that
the notion offset-of-struct-member-in-char-sized-chunks is probably
not something that "ought" to be floating around in code meant to be
portable or maintainable.

The way the question was originally put (requiring a portable solution),
the only way to do it (assuming that offset-in-sizeof-sized-chunks is
wanted) is to create an instance of the structure at a non-nil address
(eg, create an external struct of the required type) and do the offset
calculation as Stuart does above, but with the actual structure.
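
The approach described above, doing the subtraction on a real object
instead of a null pointer, might be sketched as follows.  The names
str_instance, STR_F2_OFFSET and str_f2_offset are hypothetical, and Str
is borrowed from Ray Butterworth's earlier example.  Whether the
expression is folded to a constant at compile time is up to the
compiler, which is the problem Miles reported in the first place.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int f1; int f2; } Str;

/* Hypothetical dummy object that exists only so its addresses can be taken. */
static Str str_instance;

/* Both pointers refer to the same real object, so the subtraction
   does not depend on how the null pointer is represented. */
#define STR_F2_OFFSET \
    ((long)((char *)&str_instance.f2 - (char *)&str_instance))

/* Offset of f2 within Str, in char-sized units. */
long str_f2_offset(void)
{
    return STR_F2_OFFSET;
}
```

The cost is one unused object per structure type, plus a possible
run-time subtraction on compilers that do not fold the expression.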

--
Sometimes I think the only universal in the computing field is the
fetch-execute cycle.
                                --- Alan J. Perlis
-- 
Wayne Throop      <the-known-world>!mcnc!rti-sel!dg_rtp!throopw

rbutterworth@watmath.UUCP (Ray Butterworth) (12/01/86)

In article <386@viper.UUCP>, dave@viper.UUCP (David Messer) writes:
> It could be because you used OFFSET(Str,f2) instead of the correct
> OFFSET(f2,Str).  You got the parameters reversed.
I must have quoted it wrong in the news article.  Sorry.
The other order seems to fit more naturally with the C language.

> A simpler definition of the OFFSET macro is the following:
> 	#define OFFSET(mos)  ((long)(&(((char *)0)->mos)))
> This will produce a proper offset on almost all machines.  (But
> not all, some machines have different formats to pointers to
> different types.  Also, this macro assumes that (long)((char *)0) == 0L.)

"on almost all machines" means it is totally wrong on some machines.
What is the point of making something simpler if it's going to be
wrong?
How about "#define OFFSET(mos) 4"?  That's really simple and will
also be correct sometimes.
What is the point of bothering to assume that (long)(char*)0 is 0?
Why not simply subtract whatever (char*)0 really is from the other
pointer and always get the correct answer on all machines (as I
originally suggested)?

On any one machine, the simplest and most complicated forms of this
macro will generate exactly the same code (assuming both are correct).
What is it you think you are gaining by "simplifying" it?

Besides, what makes you think you can get away with ((char*)0)->mos?
Compilers which accept this (e.g. BSD) give this warning:
"xxx.c", line 15: warning: struct/union or struct/union pointer required
And if two structures should happen to have members with the same name:
"xxx.c", line 15: nonunique name demands struct/union or struct/union pointer

> I'm always right and I never lie. | Software Consultant 
> My company knows this and agrees  | UUCP:  ihnp4!quest!viper!dave 
> with everything I say.            |        ihnp4!meccts!viper!dave

With advice like that, I pity your clients.
If it is a company of more than one, I hope the owners know
what excellent advertising you are providing for them.

henry@utzoo.UUCP (Henry Spencer) (12/04/86)

> ...In other words a cast such as (type1 *)(type2 *)x
> will not always give a meaningful answer...

True.

> According to K&R all
> that is required is that (type *)(long *)x == x.

False.  Please cite chapter and verse.  I believe what you are thinking
of is (type *)(char *)x == x, which is (by K&R) valid.  Your example falls
down if, for example, "type" is "char" and the format of "long *" is not
precise enough to point to an individual char within a long.

Or perhaps you were thinking of (type *)(long)x == x ?  That would make me
nervous but it is technically valid.

> ... for good or ill,
> C has been defined such that all members-of-structures share the
> same name-space...

Not modern C, which puts each structure's members in a separate name space.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

karl@haddock.UUCP (Karl Heuer) (12/05/86)

In article <720@dg_rtp.UUCP> throopw@dg_rtp.UUCP (Wayne Throop) writes:
>>bader@spice.cs.cmu.edu (Miles Bader) writes:
>>> Is there any way of finding the offset of a structure element from
>>> the beginning of a structure in a portable AND efficient way?

>stuart@bms-at.UUCP (Stuart D. Gathman) writes:
>> #define spos(s,m)	((char *)&((struct s *)0)->m - (char *)0)
>
>... A couple of points remain.  First, there is no guarantee that pointer
>arithmetic or offset calculations will work for the null pointer.

Note also that the result might not be a legal constant.

However, ANSI seems to have its collective heart set on making offsetof()
part of the standard, which means that on machines where the above construct
doesn't work, *some* hook must be allowed -- even if the implementor has to
make "offsetof" a builtin operator instead of a macro.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
(Btw, "sposa(s,m)" is redundant since "spos(s,m[0])" will do, and the macro
would be more useful if the word "struct" were omitted.)
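
For compilers that provide the offsetof() macro mentioned above (it
lives in <stddef.h> in ANSI C), Miles's original problem might be
handled along these lines.  This is only a sketch: the opaque
struct fpool, the typedefs and the helper entry_from_buf are
assumptions made so that the example is complete.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

typedef struct fpool  FPOOL;    /* hypothetical opaque pool type */
typedef struct fentry FENTRY;

struct fentry {
    FPOOL *pool;
    union {
        FENTRY *next;
        char    buf[1];         /* extended as needed */
    } data;
};

/* Every member of a union starts at the union's own address,
   so the offset of data is also the offset of data.buf. */
#define FALLOC_OFFSET offsetof(struct fentry, data)

/* Recover the enclosing entry from the address of its buf member.
   buf must really point at the buf member of a struct fentry. */
static FENTRY *entry_from_buf(char *buf)
{
    return (FENTRY *)(buf - FALLOC_OFFSET);
}
```

Since offsetof() is supplied by the implementation, the question of
whether the null-pointer trick works on a given machine becomes the
compiler vendor's problem rather than the programmer's.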

dave@viper.UUCP (David Messer) (12/05/86)

In article <3695@watmath.UUCP> rbutterworth@watmath.UUCP (Ray Butterworth) writes:
 >In article <386@viper.UUCP>, dave@viper.UUCP (David Messer) writes:
 >> A simpler definition of the OFFSET macro is the following:
 >> 	#define OFFSET(mos)  ((long)(&(((char *)0)->mos)))
 >> This will produce a proper offset on almost all machines.  (But
 >> not all, some machines have different formats to pointers to
 >> different types.  Also, this macro assumes that (long)((char *)0) == 0L.)
 >
 >"on almost all machines" means it is totally wrong on some machines.
 >What is the point of making something simpler if it's going to be
 >wrong?
 >How about "#define OFFSET(mos) 4"?  That's really simple and will
 >also be correct sometimes.
 >What is the point of bothering to assume that (long)(char*)0 is 0?
 >Why not simply subtract whatever (char*)0 really is from the other
 >pointer and always get the correct answer on all machines (as I
 >originally suggested)?
 >
 >On any one machine, the simplest and most complicated forms of this
 >macro will generate exactly the same code (assuming both are correct).
 >What is it you think you are gaining by "simplifying" it?
 >

Because your solution doesn't work all the time either.  There are
some machines in which pointers to different types are unrelated
in format.  In other words a cast such as (type1 *)(type2 *)x
will not always give a meaningful answer.  According to K&R all
that is required is that (type *)(long *)x == x.

Since one cannot make a macro that will work on an arbitrary
machine, other considerations apply when writing code such as
this.  For instance you assume that (x - (char *)0) will compile
to the same code as (x).  I have known some brain-damaged compilers
to actually generate code for the cast and the subtract.  There are
no machines to my knowledge where (long)(char *)0 != 0L and
other problems (such as casting of pointers) don't prevent
the offset macro from working.
 >
 >Besides, what makes you think you can get away with ((char*)0)->mos?
 >Compilers which accept this (e.g. BSD) give this warning:
 >"xxx.c", line 15: warning: struct/union or struct/union pointer required
 >And if two structures should happen to have members with the same name:
 >"xxx.c", line 15: nonunique name demands struct/union or struct/union pointer
I will concede that you have a point here; however, for good or ill,
C has been defined such that all members-of-structures share the
same name-space.  If you have a compiler which gives you these
helpful hints, it is an easy matter to add a structure name
parameter to the macro.
-- 
Disclaimer:                       | David Messer 
I'm always right and I never lie. | Software Consultant 
My company knows this and agrees  | UUCP:  ihnp4!quest!viper!dave 
with everything I say.            |        ihnp4!meccts!viper!dave

rbutterworth@watmath.UUCP (12/05/86)

> According to K&R all
> that is required is that (type *)(long *)x == x.
K&R does not require this.  In fact it is not true on some machines
either with or without the second "*".

> There are
> some machines in which pointers to different types are unrelated
> in format.  In other words a cast such as (type1 *)(type2 *)x
> will not always give a meaningful answer.
True.  But I wasn't talking about (type1*), I was talking about
(char*).  As far as I know, any (type2*) can be cast into a (char*)
and back again without harm on any machine.  (otherwise how does
malloc() work?) And any two (char*) pointers can be subtracted to
produce the number of bytes between them.

> Because your solution doesn't work all the time either.

In casting a pointer to (char*), all the strange formats and sizes
of pointers are handled by the compiler.  The result is a pointer
to the first byte of the data.

In subtracting two (char*) pointers, all the strange formats and
sizes are handled by the compiler.  The result is an integer
(type (long), (unsigned), (int), ?  it doesn't really matter).

Thus ( ((char*)pointer1) - ((char*)pointer2) ) is a completely
portable operation.  All knowledge of sizes and internal formats
is handled by the compiler.  To cover all possibilities, you should
also divide the result by (sizeof(char)).

So if you know of a compiler where my solution doesn't work,
it must violate one of the above two rules.  Which one?
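
The two rules above can be tried out with something like the following
sketch.  The helper byte_distance is hypothetical, and it takes void *
arguments, which assumes an ANSI-style compiler.  The division by
sizeof(char) is left out because ANSI C defines sizeof(char) as 1.
Both pointers have to point into the same object for the subtraction
to be meaningful.

```c
#include <assert.h>
#include <stddef.h>   /* ptrdiff_t */

/* Distance in bytes from 'from' to 'to'.  The casts to char * let
   the compiler deal with whatever pointer formats are involved. */
static ptrdiff_t byte_distance(const void *to, const void *from)
{
    return (const char *)to - (const char *)from;
}
```

For an array int arr[4], byte_distance(&arr[3], &arr[0]) comes out as
3 * sizeof(int) whatever the machine's pointer format is.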

> however, for good or ill,
> C has been defined such that all members-of-structures share the
> same name-space.
This hasn't been true for many years.
Are there still any compilers out there that can't handle this?
struct one {int a; int b;};
struct two {float b; float a;};
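
A small test case for a compiler with per-structure member name spaces
might look like this (the functions sum_one and sum_two are made up
for the illustration).  On such a compiler each use of a or b is
resolved against the structure type on the left of the dot.

```c
#include <assert.h>

struct one { int a; int b; };
struct two { float b; float a; };   /* same member names, other types and order */

/* x.a and x.b refer to the int members of struct one. */
static int sum_one(struct one x)
{
    return x.a + x.b;
}

/* y.a and y.b refer to the float members of struct two. */
static float sum_two(struct two y)
{
    return y.a + y.b;
}
```

A compiler with a single member name space would be expected to reject
the second declaration, since b would need two different offsets.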

dave@viper.UUCP (David Messer) (12/08/86)

In article <7377@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes:
 >> According to K&R all
 >> that is required is that (type *)(long *)x == x.
 >
 >False.  Please cite chapter and verse.  I believe what you are thinking
 >of is (type *)(char *)x == x, which is (by K&R) valid.  Your example falls
 >down if, for example, "type" is "char" and the format of "long *" is not
 >precise enough to point to an individual char within a long.
 >
 >Or perhaps you were thinking of (type *)(long)x == x ?  That would make me
 >nervous but it is technically valid.

You are correct.  The second example is what I meant.  (I.e. a pointer
can be stored in a long variable and converted back without changing
its value.)  For those of you interested, the section of K&R that
I am looking at is 14.4 of Appendix A (C Reference Manual).  It is
interesting in that the only thing guaranteed about casts on pointers
is the conversion to a long and back (actually, conversion to an
"integral" type (int or long) and back, where the choice of int or
long is machine-dependent.) and conversion to a pointer to a smaller
type and back, which also results in the same original pointer value.
The mapping from a pointer to a long (int) is explicitly machine
dependent, although it is supposed to be "unsurprising to those who
know the addressing structure of the machine."  One interesting
thing is, by these rules, the use of malloc(), for anything other
than char arrays, is non-portable.

 >
 >> ... for good or ill,
 >> C has been defined such that all members-of-structures share the
 >> same name-space...
 >
 >Not modern C, which puts each structure's members in a separate name space.

If so, most existing programs which use structures will not compile.
I haven't seen any compilers that insist on members-of-structures
being tied to a specific structure, although I know that the
"standardization" effort has it that way.  (What ever happened to
the days when a standards committee codified existing practises
rather than doing a re-engineering job?)
-- 
Disclaimer:                       | David Messer 
I'm always right and I never lie. | Software Consultant 
My company knows this and agrees  | UUCP:  ihnp4!quest!viper!dave 
with everything I say.            |        ihnp4!meccts!viper!dave

dave@viper.UUCP (David Messer) (12/08/86)

In article <3810@watmath.UUCP> rbutterworth@watmath.UUCP (Ray Butterworth) writes:
 >> however, for good or ill,
 >> C has been defined such that all members-of-structures share the
 >> same name-space.
 >This hasn't been true for many years.
 >Are there still any compilers out there that can't handle this?
 >struct one {int a; int b;};
 >struct two {float b; float a;};

Well, I learn something new every day!  The last time I checked
it (back about 5 years ago on Version 7) the statement I made
was true.  Now it is not.  The MS-DOS C compiler I used in that
time had all MOS' in the same name-space, so I never suspected that
they changed it in the real (UNIX) world.  I thank you for pointing
this out to me, and apologize for my earlier, stupid statements.

I think that I will stop citing Kernighan & Ritchie since the
world seems to have proceeded past them.

The following statement used to be true:
-- 
Disclaimer:                       | David Messer 
I'm always right and I never lie. | Software Consultant 
My company knows this and agrees  | UUCP:  ihnp4!quest!viper!dave 
with everything I say.            |        ihnp4!meccts!viper!dave

throopw@dg_rtp.UUCP (Wayne Throop) (12/08/86)

> dave@viper.UUCP (David Messer)
> for good or ill,
> C has been defined such that all members-of-structures share the
> same name-space.

You're wrong, you know.  Members of structures do NOT share the same
namespace in C, no more than =+ is an operator in C.  Let us chant
together, from Harbison and Steele, page 107:

    The names of structure components are defined in a special
    overloading class associated with the structure type.  That is,
    component names within a single structure must be distinct, but they
    may be the same as component names in other structures and may be
    the same as variable, function, and type names.

Kernighan and Ritchie have similar things to say, but I don't (GASP)
have my K&R to hand just now to find what they are, though I recall
vaguely that they explain the historical crock whereby some compilers
(that aren't C compilers, mind you:-) put the structure members of all
structure types in a single namespace.  This historical oddity should not
be confused with a linguistic feature.

--
A LISP programmer knows the value of everything, but the cost of nothing.
                                --- Alan J. Perlis
-- 
Wayne Throop      <the-known-world>!mcnc!rti-sel!dg_rtp!throopw

ron@brl-sem.ARPA (Ron Natalie <ron>) (12/09/86)

In article <3810@watmath.UUCP>, rbutterworth@watmath.UUCP (Ray Butterworth) writes:
> > There are
> > some machines in which pointers to different types are unrelated
> > in format.  In other words a cast such as (type1 *)(type2 *)x
> > will not always give a meaningful answer.
> True.  But I wasn't talking about (type1*), I was talking about
> (char*).  As far as I know, any (type2*) can be cast into a (char*)
> and back again without harm on any machine.  (otherwise how does
> malloc() work?) And any two (char*) pointers can be subtracted to
> produce the number of bytes between them.
> 
What malloc provides is a (char *) that can be cast into other pointer
types.  There is no guarantee that all legitimate pointer values can
be represented by a (char *) cast of them.  Malloc is one of those very
machine dependent magic routines that is careful to return very "special"
character pointers that can be used as other data types, but to imply that
the compiler always allows character pointers to be used for this mode
is naive.

rbutterworth@watmath.UUCP (12/09/86)

In article <509@brl-sem.ARPA>, ron@brl-sem.ARPA (Ron Natalie <ron>) writes:
> What malloc provides is a (char *) that can be cast into other pointer
> types.  There is no guarantee that all legitimate pointer values can
> be represented by a (char *) cast of them.  Malloc is one of those very
> machine dependent magic routines that is careful to return very "special"
> character pointers that can be used as other data types, but to imply that
> the compiler always allows character pointers to be used for this mode
> is naive.

K&R must be naive then.  In 14.4 of the appendix:
"It is guaranteed that a pointer to an object of a given size may be
converted to a pointer to an object of a smaller size and back again
without change."
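
Applied to malloc(), the guarantee quoted above might be exercised as
in the following sketch.  The struct pair type and both helper
functions are hypothetical.  The sketch assumes an allocator whose
result is suitably aligned for any object type, which is the property
under discussion here.

```c
#include <assert.h>
#include <stdlib.h>   /* malloc, free */

struct pair { long key; double value; };

/* Allocate and fill one struct pair; returns NULL if malloc fails. */
static struct pair *new_pair(long key, double value)
{
    struct pair *p = malloc(sizeof *p);
    if (p != NULL) {
        p->key = key;
        p->value = value;
    }
    return p;
}

/* Convert to a pointer to a smaller object (char) and back again. */
static struct pair *round_trip(struct pair *p)
{
    char *bytes = (char *)p;
    return (struct pair *)bytes;
}
```

The round trip starts from a pointer that was already valid for
struct pair, which is the direction the quoted sentence covers.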

ballou@brahms (Kenneth R. Ballou) (12/10/86)

In article <736@dg_rtp.UUCP> throopw@dg_rtp.UUCP (Wayne Throop) writes:
>> C has been defined such that all members-of-structures share the
>> same name-space.
>You're wrong, you know.  Members of structures do NOT share the same
>namespace in C, no more than =+ is an operator in C.  Let us chant
>together, from Harbison and Steele, page 107:
	[an appropriate incantation has been deleted]
>Kernighan and Ritchie have similar things to say, but I don't (GASP)
>have my K&R to hand just now to find what they are, 
    For shame!  Actually, this is not true.  In K&R C structure members all
belong to the same namespace.  The separation of namespaces for each
structure/union is a (thoroughly welcome) later enhancement.

--------
Kenneth R. Ballou		ARPA: ballou@brahms.berkeley.edu
Department of Mathematics	UUCP: ...!ucbvax!brahms!ballou
University of California
Berkeley, California  94720

henry@utzoo.UUCP (Henry Spencer) (12/10/86)

>  >Not modern C, which puts each structure's members in a separate name space.
> 
> If so, most existing programs which use structures will not compile.

Why not?  Very few of them rely on the common name space.  It's been bad
practice to access one structure with another's member name all along.
We converted from a common-name-space compiler to a separate-name-space
compiler a little while ago.  Very few things broke; the ones which did
were generally doing obscene tricks with unions.  The old V7 kernel was
a problem area for that.

> I haven't seen any compilers that insist on members-of-structures
> being tied to a specific structure...

Then you are using old compilers, or new compilers written from old
definitions of C.  Modern System V compilers, for example, all implement
separate name spaces.  I think the Berklix ones may have been fixed
up to do likewise, although I'm not sure of that.

> ...  (What ever happened to
> the days when a standards committee codified existing practises
> rather than doing a re-engineering job?)

I do have some complaints about X3J11 in this regard, but you are barking
up the wrong tree this time.  There is considerable experience with this
particular practice, even if you aren't familiar with it.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

karl@haddock.UUCP (Karl Heuer) (12/10/86)

In article <427@viper.UUCP> dave@viper.UUCP (David Messer) writes:
>It is
>interesting in that the only thing guaranteed about casts on pointers is the
>conversion to a long and back and conversion to a pointer to a smaller type
>and back, which also results in the same original pointer value.  One
>interesting thing is, by these rules, the use of malloc(), for anything other
>than char arrays, is non-portable.

No, it isn't.  The *implementation* of malloc() is nonportable, but since it
is guaranteed to return a maximally-aligned "char *" (the result of casting
a widetype pointer, which, as you mentioned, may be safely cast back), its
use is portable.  (Well, I guess if you want to nit-pick, the rules you quoted
don't specifically allow (int *)(char *)(double *)double_aligned_intptr; is
that what you mean?)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

brett@wjvax.UUCP (12/11/86)

In article <509@brl-sem.ARPA> ron@brl-sem.ARPA (Ron Natalie <ron>) writes:
>In article <3810@watmath.UUCP>, rbutterworth@watmath.UUCP (Ray Butterworth) writes:
>> > There are
>> > some machines in which pointers to different types are unrelated
>> > in format.  In other words a cast such as (type1 *)(type2 *)x
>> > will not always give a meaningful answer.
>> True.  But I wasn't talking about (type1*), I was talking about
>> (char*).  As far as I know, any (type2*) can be cast into a (char*)
>> and back again without harm on any machine.  (otherwise how does
>> malloc() work?) And any two (char*) pointers can be subtracted to
>> produce the number of bytes between them.
>> 
>What malloc provides is a (char *) that can be cast into other pointer
>types.  There is no guarantee that all legitimate pointer values can
>be represented by a (char *) cast of them.  Malloc is one of those very
>machine dependent magic routines that is careful to return very "special"
>character pointers that can be used as other data types, but to imply that
>the compiler always allows character pointers to be used for this mode
>is naive.

Except for the case of function pointers, I can't think of any reason why
converting type * to char * and back should not be valid.  Char * is
guaranteed to be the least-aligned pointer (since chars have no alignment
restrictions), so as long as we are talking about normal data pointers,
casting to char * and back should cause no loss of information.

What of previous attempts to guarantee the use of void * as such a universally
least-aligned pointer?
-- 
-------------
Brett Galloway
{pesnta,twg,ios,qubix,turtlevax,tymix,vecpyr,certes,isi}!wjvax!brett

meissner@dg_rtp.UUCP (Michael Meissner) (12/13/86)

In article <419@viper.UUCP> dave@viper.UUCP (David Messer) writes:
>
> Because your solution doesn't work all the time either.  There are
> some machines in which pointers to different types are unrelated
> in format.  In other words a cast such as (type1 *)(type2 *)x
> will not always give a meaningful answer.  According to K&R all
> that is required is that (type *)(long *)x == x.

I hate to be picky, but what K&R requires is that:

	(type *)(char *)x == x

work (i.e., it will only work if the pointer type you are converting to is
of less strict (or the same) alignment, and then back again).  ANSI X3J11 goes
further, and mandates that function pointers are not allowed to be converted
to object pointers.  For word-based machines (like the Data General MV-series
computers), this cast may actually change the pointer's representation in
each direction.

	Michael Meissner
	Data General
	...!mcnc!rti-sel!dg_rtp!meissner

mouse@mcgill-vision.UUCP (der Mouse) (12/21/86)

In article <7377@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) writes:
> Or perhaps you were thinking of (type *)(long)x == x ?  That would
> make me nervous but it is technically valid.

Is it?  I seem to remember something in K&R to this effect:

	A pointer may be converted to any integral type large enough to
	hold it.

This does not guarantee that there *is* any integral type large enough
to hold a pointer (any pointer).  On the other hand, they continue

	Whether an int or long is required is machine dependent.

thereby implying that at least one of (int,long) will be sufficient.
But they don't come right out and *say* so, do they?  Do H&S or X3J11
say anything about this?

					der Mouse

USA: {ihnp4,decvax,akgua,utzoo,etc}!utcsri!mcgill-vision!mouse
     think!mosart!mcgill-vision!mouse
Europe: mcvax!decvax!utcsri!mcgill-vision!mouse
ARPAnet: think!mosart!mcgill-vision!mouse@harvard.harvard.edu

henry@utzoo.UUCP (Henry Spencer) (12/23/86)

> > Or perhaps you were thinking of (type *)(long)x == x ?  That would
> > make me nervous but it is technically valid.
> 
> Is it?  [... Maybe long isn't long enough. ...]

This is why it makes me nervous.  I would expect trouble in particular
on machines with long and pointers the same size *except* that char
pointers are longer because the original pointer format didn't provide
addressing to the byte.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry
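
On the question of whether any integral type is large enough to hold a
pointer: later C standards (C99 onward) added an optional type,
uintptr_t in <stdint.h>, for exactly this purpose.  Where it exists, a
void * converted to it and back compares equal to the original.  The
sketch below assumes such a compiler, and the two helper names are
made up.  It says nothing about plain int or long, which are the types
this thread is worried about.

```c
#include <assert.h>
#include <stdint.h>   /* uintptr_t: optional, C99 and later */

/* Convert an object pointer to an integer wide enough to hold it. */
static uintptr_t ptr_to_int(void *p)
{
    return (uintptr_t)p;
}

/* Convert the integer back to a pointer. */
static void *int_to_ptr(uintptr_t u)
{
    return (void *)u;
}
```

On an implementation with no integer type wide enough for a pointer,
uintptr_t is simply not defined and the code fails to compile, which
is arguably better than a long that silently truncates.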