[comp.std.c] casting structs

kdb@chinet.chi.il.us (Karl Botts) (08/06/89)

Why is it utterly illegal to cast a struct (or a union) to anything?
Obviously, this would be a capability susceptible to abuse;
nevertheless, it is not in the "spirit of C" to ban something for that
reason only.  In fact there are cases -- particularly involving bit
structs, but it is possible to hypothesize cases for other structs as well
-- when casting a struct seems perfectly reasonable.  For instance:

typedef struct X_T {
	unsigned a : 1;
	unsigned b : 2;
	unsigned c : 3;
} x_t;

x_t x = { 1, 2, 5};
int i = (int)x;

seems quite sensible to me.  

Of course, you can achieve the desired effect here by replacing the last
line of the above with:

int i = *(int *)&x;

but this strikes me as unnecessary obfuscation (but no additional overhead,
at least on the compilers I have traced through the output of similar code
for -- they can figure out that no real pointer operations are required
here.)
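
A minimal sketch of the same bitwise copy written with memcpy instead of a
pointer cast.  The helper names x_as_int and int_as_x are hypothetical, and
the sketch copies at most sizeof(int) bytes on the assumption that the bit
fields live in the leading bytes of the struct.  The numeric value of the int
still depends entirely on how the compiler lays out the bit fields.

```c
#include <string.h>

typedef struct X_T {
	unsigned a : 1;
	unsigned b : 2;
	unsigned c : 3;
} x_t;

/* Hypothetical helper: copy the leading bytes of an x_t into an int.
 * The numeric result is implementation-defined, because the bit-field
 * layout and the byte order are. */
int x_as_int(x_t x)
{
	int i = 0;
	size_t n = sizeof x < sizeof i ? sizeof x : sizeof i;

	memcpy(&i, &x, n);
	return i;
}

/* Hypothetical inverse: rebuild the struct from the saved int. */
x_t int_as_x(int i)
{
	x_t x;
	size_t n = sizeof x < sizeof i ? sizeof x : sizeof i;

	memset(&x, 0, sizeof x);
	memcpy(&x, &i, n);
	return x;
}
```

Passing { 1, 2, 5 } through x_as_int and then int_as_x should give back the
same three fields, whatever the intermediate int happens to be.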

What would be really nice would be a way to initialize an int, say, by
using a struct.  It is perfectly legal to initialize a value by casting
another type of value into the type desired, as in:

long l = (long)&foo;

where foo can be any lvalue.  So why isn't it legal for a non-scalar type?

The only reason I can come up with is that it won't fit into the syntax --
at least I can't come up with anything that seems sensible and would work.
How would you write what I _really_ want to do in the first example?

int i = (int)x_t x = {1, 2, 5};

Whoops, that's no good (even if the compiler would accept it) -- it defines
a data object "x", just like the first example.  So let's leave x out:

int i = (int)x_t = {1, 2, 5};

Hey, wait a minute, maybe that _does_ work.  Suppose we stipulated that
initializing a type name, as opposed to an object of that type, produces an
rvalue of the specified type and value, rather than an lvalue.  This
wouldn't break any code, because initializing a type name is simply
illegal now.

Hmmm..  The next question is, is the above an LALR parse?  I can't see any
reason why not, off the top of my head.  If the type name was compound, you
might have to put parens around it as in:

int i = (int)(struct X_T) = {1, 2, 5};

I must think about this...

chris@mimsy.UUCP (Chris Torek) (08/07/89)

In article <9184@chinet.chi.il.us> kdb@chinet.chi.il.us (Karl Botts) writes:
>Why is it utterly illegal to cast a struct (or a union) to anything?

The major reason is that casts are, in C, operators that take a single
input value, perform some transformation, and produce a single output
value.  It seems relatively simple to recursively apply such an
operator to an aggregate, so that one could, by casting an aggregate,
apply the same transformation to each element of that aggregate,
producing a new aggregate.  However, you seem to want something that
will apply some transform to an aggregate, yielding a single value:

>... For instance:
>
>typedef struct X_T {
>	unsigned a : 1;
>	unsigned b : 2;
>	unsigned c : 3;
>} x_t;
>
>x_t x = { 1, 2, 5};
>int i = (int)x;
>
>seems quite sensible to me.  

To me, it seems sensible for (int)x to produce as its result an object
of type `array 3 of int' whose value is {1, 2, 5}.  C, however, does
not have any array rvalues now; this sort of change goes much deeper
than it might first appear.

Exactly what value you want from your transform is not clear to me,
nor do I know how one would go about defining this in a reasonably
efficient, yet machine-independent manner.

>Of course, you can achieve the desired effect here by replacing the last
>line of the above with:
>
>int i = *(int *)&x;
>
>but this strikes me as unnecessary obfuscation

The whole thing strikes me as rather obfuscated.  The value of `i'
produced by *(int *)&x is completely different on a VAX than on a
Tahoe, because the two compilers allocate bits in the opposite orders:

	[vax218] cc -o z z.c		# (this window has been around
	[vax219] ./z			# for a while . . .)
	i=45
		.
		.
		.
	[tahoe1] cc -o z z.c
	[tahoe2] ./z
	i=-738197504

Identical source code; I just added a `main' and a `printf' to your
example above.

Incidentally, since compilers are free to allocate bitfields in any
order, and there is no true `natural' order on 68020s, different
compilers for the same machine will yield different results.
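
The two values printed above are consistent with a 32-bit word being filled
from the low-order end on the VAX and from the high-order end on the Tahoe.
A sketch of that arithmetic, assuming a 32-bit int and exactly those two
allocation orders (the function names are hypothetical):

```c
#include <stdint.h>

/* Fields allocated starting at the least significant bit:
 * a in bit 0, b in bits 1-2, c in bits 3-5. */
uint32_t pack_lsb_first(uint32_t a, uint32_t b, uint32_t c)
{
	return (a & 1u) | ((b & 3u) << 1) | ((c & 7u) << 3);
}

/* Fields allocated starting at the most significant bit:
 * a in bit 31, b in bits 30-29, c in bits 28-26. */
uint32_t pack_msb_first(uint32_t a, uint32_t b, uint32_t c)
{
	return ((a & 1u) << 31) | ((b & 3u) << 29) | ((c & 7u) << 26);
}
```

With the fields { 1, 2, 5 }, the first function gives 1 + 4 + 40 = 45, and
the second gives 0xD4000000, which is -738197504 when read as a 32-bit
two's-complement int.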

>(but no additional overhead, at least on the compilers I have traced
>through the output of similar code for -- they can figure out that no
>real pointer operations are required here.)

Indeed; all that is required is a machine-dependent operation, and the
pointer cast makes it clear that something machine-dependent is happening.
(With a few exceptions, all pointer casts involve machine-dependent
transformations.)

>What would be really nice would be a way to initialize an int, say, by
>using a struct.

Why?
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

kdb@chinet.chi.il.us (Karl Botts) (08/08/89)

In article <18921@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>... For instance:
>>
>>typedef struct X_T {
>>	unsigned a : 1;
>>	unsigned b : 2;
>>	unsigned c : 3;
>>} x_t;
>>
>>x_t x = { 1, 2, 5};
>>int i = (int)x;
>>
>>seems quite sensible to me.  
>
>To me, it seems sensible for (int)x to produce as its result an object
>of type `array 3 of int' whose value is {1, 2, 5}.  C, however, does
>not have any array rvalues now; this sort of change goes much deeper
>than it might first appear.
>
>Exactly what value you want from your transform is not clear to me,
>nor do I know how one would go about defining this in a reasonably
>efficient, yet machine-independent manner.

The array business is beyond me -- there are no arrays in sight anywhere
here.  Obviously, what I want is a bit struct as will in fact be generated
by the perfectly legit line:

x_t x = { 1, 2, 5};

and an int which is a bitwise copy of the first sizeof(int) bytes of the
struct.  I made no request for any machine independence -- bit fields are
never machine independent, and while this may diminish their usefulness it
does not eliminate it.

>>What would be really nice would be a way to initialize an int, say, by
>>using a struct.
>
>Why?

In the first place, perhaps the most important principle of "the spirit of
C" is that the question is never "why?", but "why not?"  In the absence of
a compelling reason to the contrary, a programmer should never be
inhibited by the lack of omniscient foresight of the language committee, or
protected from his own folly, by being prohibited from doing anything in C
that can be done in assembler language, including machine specific
assembler language.

In the second place, this question came to my mind not as mere speculation,
but because I had a problem.  I had occasion to initialize a large table
of structs, some of whose fields were ints which might or might not be
broken into bit fields, depending on the values of other fields of the
table.  This was in connection with a parser for a peculiar language.  For
instance, some entries in the table represented function names, and some
represented operators -- the same integer field would contain a different
set of bit fields if the token was a function name than if it were an
operator name.  It would be most convenient, readable and maintainable to
initialize the table all in one place.  I do not care what particular
values the bit fields produce as an int, so long as I can get the bit
fields when I need them.

Now, if I make two bit field structs and make the integer field in the table
struct a union of them, I can only initialize one of them.  I already had a
bunch of bit fiddling macros -- stuff like MID_BITS_SET(i, m, n) and so forth.  So
I built some macros to refer to the bit fields -- I suspect this is the way
it used to be done, until somebody thought of bit fields.  It made the
table big and messy, but it worked -- until I tried to compile it!  It
seems the macros were too big to expand, and that an arbitrary limit of 509
characters on the length of macro expansions is endorsed by the ANSI
standard!
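
A rough sketch of the macro approach described above: shift-and-mask macros
build the int as a constant expression, so the whole table can be initialized
in one place and the fields pulled back out when needed.  The original table
and macros are not shown in the post, so every name, token, and field width
here (tok_t, FUNC_BITS, OP_BITS and the rest) is invented for illustration.

```c
/* Hypothetical token kinds; the post does not name them. */
enum tok_kind { TOK_FUNC, TOK_OP };

/* Function entries: argument count in bits 0-3, variadic flag in bit 4. */
#define FUNC_BITS(nargs, variadic) \
	(((nargs) & 0xF) | (((variadic) & 0x1) << 4))
#define FUNC_NARGS(i)    ((i) & 0xF)
#define FUNC_VARIADIC(i) (((i) >> 4) & 0x1)

/* Operator entries: precedence in bits 0-4, right-assoc flag in bit 5. */
#define OP_BITS(prec, right) \
	(((prec) & 0x1F) | (((right) & 0x1) << 5))
#define OP_PREC(i)  ((i) & 0x1F)
#define OP_RIGHT(i) (((i) >> 5) & 0x1)

typedef struct {
	const char *name;
	enum tok_kind kind;
	int bits;	/* meaning depends on kind */
} tok_t;

/* The same int field holds a different set of packed fields per kind. */
static const tok_t tok_table[] = {
	{ "sin", TOK_FUNC, FUNC_BITS(1, 0) },
	{ "max", TOK_FUNC, FUNC_BITS(2, 1) },
	{ "+",   TOK_OP,   OP_BITS(4, 0) },
	{ "**",  TOK_OP,   OP_BITS(6, 1) },
};
```

Macros this small stay well under any expansion limit; the trouble described
above would arise once each entry packs many more fields.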

So, I wanted to do what I said in my previous message -- initialize the
bitfields in special structs for the purpose, and cast them, all in the
same initializer, into an int (or whatever).  I ask again -- why can't I do
this?
