[net.lang.c] casts and puns

rcd@opus.UUCP (Dick Dunn) (11/07/84)

> ...  Do not be misled by the fact that some casts on some machines
> generate no code.  A cast is a *conversion*, not a type pun...

Unfortunately, not entirely true.  What C calls "casts" covers both
true casts (in, say, the ALGOL 68 sense) and puns, at least in practice.
The operation is a cast when conversion "makes sense" (in semantic terms);
otherwise it's a pun.  For example, if i is an integer, (float)i is clearly
a cast; it generates instructions on most machines and produces a different
internal datum which is the float corresponding most closely to the
integral value of i.  However, (long*)i is a pun; you just have to stretch
too far to make any sensible semantics which makes it a true cast.
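
The distinction drawn above can be sketched in present-day C. This is only
an illustration, not anything from the original posting: the function names
are made up, and it assumes a 32-bit int32_t, a 32-bit float, and IEEE 754
floating point. The first function converts (value kept, bits change); the
second puns (bits kept, value reinterpreted).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Conversion: the numeric value is preserved, the bit pattern changes. */
float convert_to_float(int32_t i)
{
    return (float)i;
}

/* Pun: the bit pattern is preserved, the value is reinterpreted.
   memcpy copies the bits without performing any conversion.
   Assumption: float and int32_t have the same size on this machine. */
float pun_to_float(int32_t i)
{
    float f;
    assert(sizeof f == sizeof i);
    memcpy(&f, &i, sizeof f);
    return f;
}
```

Under those assumptions, convert_to_float(1) is 1.0, while pun_to_float(1)
is a tiny denormal; the bit pattern that puns to 1.0 is 0x3f800000.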

> >example 'struct foo {struct foo *ref;};'...
> struct foo is not a recursive type.  At best you might call it an
> iterative type.  What would you think of struct foo {struct foo bar;}; ?

Please be careful when attacking other folks' terminology.  Structured
types which contain references to themselves are quite commonly called
"recursive".  (See, for example, Hoare's seminal paper "Notes on Data
Structuring".)

> >Also, Pascal, which is doubtlessly a strongly typed language, does
> >permit a type definition like 'type ref= ^ref;' and handles it
> >correctly...

Perhaps true, but marginally relevant to reality, since the only value
assignable to an object of type ref is NIL.
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...Never attribute to malice what can be adequately explained by stupidity.

geoff@desint.UUCP (Geoff Kuenning) (11/12/84)

In article <940@opus.UUCP> rcd@opus.UUCP (Dick Dunn) writes:

>However, (long*)i is a pun; you just have to stretch
>too far to make any sensible semantics which makes it a true cast.

Oh yeah?  How about non-byte-addressable machines?  A C compiler for the DG
Nova, for example, would have to generate a shift-right instruction for that
cast.  Sounds like a conversion to me.  The same goes for (long) i on most
machines;  you have to sign-extend it.

If I may leap to an unjustified generalization on insufficient data, I wonder
if there is such a thing as a type pun at all, at least in a machine-
independent sense.  On a Vax, for example, (int) i is a pun if i is a long,
but on a PDP-11 it is a truncating conversion (cast).  On most machines a
struct foo * is the same as a struct bar *, but the language doesn't require
it (though if they are different sizes I don't envy the compiler writer).
One can conceive of a machine where every structure had to be aligned on a
structure-size boundary;  it is then reasonable to shift pointers right an
appropriate number of bits.  (This makes the ++ operator easy to implement,
and can be pretty cheap if the machine has built-in left-shifting as part of
the indexing process, similar to the Vax's limited indexing capability.)
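
A rough model of the shift described above, written as ordinary C that
simulates a word-addressed machine rather than code for a real Nova. The
names are hypothetical, and it assumes 16-bit words and that an integer
being converted is taken as a byte address.

```c
#include <assert.h>
#include <stdint.h>

#define BYTES_PER_WORD 2u   /* assumption: 16-bit words */

/* Hypothetical model: the integer is a byte address, the word pointer
   holds a word address, so the conversion is a right shift. */
uint16_t word_ptr_from_byte_addr(uint16_t byte_addr)
{
    assert(byte_addr % BYTES_PER_WORD == 0);   /* must be word-aligned */
    return byte_addr >> 1;
}

/* The reverse conversion shifts left. */
uint16_t byte_addr_from_word_ptr(uint16_t word_ptr)
{
    assert(word_ptr < 0x8000u);                /* result must fit in 16 bits */
    return (uint16_t)(word_ptr << 1);
}
```

In this model byte address 10 becomes word pointer 5, so the cast changes
the stored bits even though both values name the same location.
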
-- 

	Geoff Kuenning
	First Systems Corporation
	...!ihnp4!trwrb!desint!geoff

hugh@hcrvax.UUCP (Hugh Redelmeier) (11/13/84)

> > >Also, Pascal, which is doubtlessly a strongly typed language, does
> > >permit a type definition like 'type ref= ^ref;' and handles it
> > >correctly...

> Perhaps true, but marginally relevant to reality, since the only value
> assignable to an object of type ref is NIL.

This turns out not to be the case.  Type ref is a suitable representation
for integers.  If you remember Peano, here are some implementations:

constant zero = nil;
function successor(n: ref): ref; begin new(successor); successor^ := n end;
function predecessor(n: ref): ref; begin predecessor := n^ end
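
C has no direct equivalent of 'type ref = ^ref', but the same idea can be
sketched with the 'struct foo {struct foo *ref;}' type quoted earlier in
the thread. This is an illustrative translation, not the poster's code;
to_unsigned is a hypothetical helper added only to read the result back,
and nothing is ever freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct foo { struct foo *ref; };   /* self-referential type from the thread */

/* Zero is represented by the null pointer. */
struct foo *successor(struct foo *n)
{
    struct foo *s = malloc(sizeof *s);
    if (s == NULL)
        abort();                   /* out of memory: give up in this sketch */
    s->ref = n;
    return s;
}

struct foo *predecessor(struct foo *n)
{
    assert(n != NULL);             /* zero has no predecessor */
    return n->ref;
}

/* Hypothetical helper: count the links back to zero. */
unsigned to_unsigned(const struct foo *n)
{
    unsigned count = 0;
    while (n != NULL) {
        n = n->ref;
        count++;
    }
    return count;
}
```

For example, successor(successor(NULL)) represents 2, and applying
predecessor to it twice gives back the null pointer.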

breuel@harvard.ARPA (Thomas M. Breuel) (11/14/84)

|>However, (long*)i is a pun; you just have to stretch
|>too far to make any sensible semantics which makes it a true cast.
|
|Oh yeah?  How about non-byte-addressable machines?  A C compiler for the DG
|Nova, for example, would have to generate a shift-right instruction for that
|cast.  Sounds like a conversion to me.  The same goes for (long) i on most
|machines;  you have to sign-extend it.
|
|If I may leap to an unjustified generalization on insufficient data, I wonder
|if there is such a thing as a type pun at all, at least in a machine-
|independent sense.  On a Vax, for example, (int) i is a pun if i is a long,
|but on a PDP-11 it is a truncating conversion (cast).  On most machines a
|struct foo * is the same as a struct bar *, but the language doesn't require
|it (though if they are different sizes I don't envy the compiler writer).
|One can conceive of a machine where every structure had to be aligned on a
|structure-size boundary;  it is then reasonable to shift pointers right an
|appropriate number of bits.  (This makes the ++ operator easy to implement,
|and can be pretty cheap if the machine has built-in left-shifting as part of
|the indexing process, similar to the Vax's limited indexing capability.)
|-- 
|
|	Geoff Kuenning
|	First Systems Corporation
|	...!ihnp4!trwrb!desint!geoff
|

Nice if you can make sense of pointer casts. In general that is not
possible, however. Assume, for example, that you are working on a 36-bit
machine and that you are storing four 8-bit characters per word. A
pointer to a character would then consist of the address of the word
plus two bits telling which character in the word is the one you are
pointing at. What sense would you make of assigning a 36-bit word to the
contents of a character pointer into the middle of another 36-bit word?
Would you fill up the unused top 4 bits?

A related problem occurs on 68000's, where word operations have to be
word aligned in memory. If you have a pointer p to a character,
*(short *)p = 0; will give you an addressing error half of the time.
Should the compiler have word-aligned the pointer in the cast, or should
you let the addressing error happen?
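
One way to sidestep the question in present-day C is sketched below. It is
an illustration only: the function names are invented, and the alignment
test assumes that converting a pointer to uintptr_t yields a byte address
and that a short needs at most sizeof(short) alignment, neither of which
the language promises.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if p looks suitably aligned for a short.
   Assumption: (uintptr_t)p is a byte address on this machine. */
int aligned_for_short(const void *p)
{
    return (uintptr_t)p % sizeof(short) == 0;
}

/* Store a short through a char pointer without assuming alignment. */
void store_short(char *p, short value)
{
    memcpy(p, &value, sizeof value);
}

/* Load a short through a char pointer without assuming alignment. */
short load_short(const char *p)
{
    short value;
    memcpy(&value, p, sizeof value);
    return value;
}
```

Copying byte by byte costs more than a single word store, but it never
dereferences a misaligned short pointer.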

I don't think that pointer casts/puns with non-word pointers (where
word refers to the smallest addressable unit) work very well, or that
you can make any machine-independent sense of them.

						Thomas.
						(breuel@harvard)

kpmartin@watmath.UUCP (Kevin Martin) (11/17/84)

In article <210@desint.UUCP> geoff@desint.UUCP (Geoff Kuenning) writes:
>In article <940@opus.UUCP> rcd@opus.UUCP (Dick Dunn) writes:
>
>>However, (long*)i is a pun; you just have to stretch
>>too far to make any sensible semantics which makes it a true cast.
>
>Oh yeah?  How about non-byte-addressable machines?  A C compiler for the DG
>Nova, for example, would have to generate a shift-right instruction for that
>cast.  Sounds like a conversion to me.

   (Throw glass of cold water in person's face, grab them by the shoulders
and shake them, slap them in the face a few times, look them squarely in
the face, and, for the n+1st time, say...)

THERE IS NO RULE THAT SAYS THAT (long *)i MUST POINT TO THE i'TH long
OR TO THE i'TH BYTE
OR TO THE i'TH ANYTHING.

It just happens to do this on pdp11's, vaxes, 68000's, 8086's, etc.

On a NOVA, (int *)1 could just as well point at word 1, and (char *)1
as the lower byte of word 1, and (char *)0100001 at the upper byte of
word 1, etc. In which case the cast really is a pun.

If a C compiler for the NOVA does a shift for int->pointer conversions,
it is only through the kindness of heart (and/or lack of brains) on the part
of the compiler developers. It is tough enough making sure zero turns into
the machine's null pointer without having to make every integer point to
the corresponding byte too.
                   Kevin Martin, UofW Software Development Group.

geoff@desint.UUCP (Geoff Kuenning) (11/20/84)

In article <9890@watmath.UUCP> kpmartin@watmath.UUCP (Kevin Martin) writes:

>In article <210@desint.UUCP> geoff@desint.UUCP (Geoff Kuenning) writes:
>>
>>Oh yeah?  How about non-byte-addressable machines?  A C compiler for the DG
>>Nova, for example, would have to generate a shift-right instruction for that
>>cast.  Sounds like a conversion to me.
>
>   (Throw glass of cold water in person's face, grab them by the shoulders
>and shake them, slap them in the face a few times, look them squarely in
>the face, and, for the n+1st time, say...)
>
>On a NOVA, (int *)1 could just as well point at word 1, and (char *)1
>as the lower byte of word 1, and (char *)0100001 at the upper byte of
>word 1, etc. In which case the cast really is a pun.
>
>If a C compiler for the NOVA does a shift for int->pointer conversions,
>it is only through the kindness of heart (and/or lack of brains) on the part
>of the compiler developers. It is tough enough making sure zero turns into
>the machine's null pointer without having to make every integer point to
>the corresponding byte too.

Before you go around slapping people and throwing cold water in their face,
Kevin, you should be sure what you are doing.  Otherwise you are going to
offend people.  In this case, you have exhibited a fair amount of ignorance
regarding the Nova, ignored statements in the latter part of the article,
and accused a number of compiler writers whom you have never met of a lack
of brains.  For your information, there are two ways one can do byte
pointers on the Nova.  One is to store an index of the i'th byte from
location zero.  The other is to use bit 15 as the byte pointer.  But, with
the Nova instruction set, testing bit 15 requires a shift.  And bit 15 must
be cleared before you access the word you want, because it is an indirect
bit.  If you store a "byte pointer", you can do the test, adjustment, and
clearing of bit 15 in a single instruction.  If you store the odd-byte bit
in the high bit, not only is your heavily-used byte-get routine significantly
slower, but you also have a really nasty problem with incrementing the things.

Obviously, word pointers should be stored as word pointers for efficiency
reasons.  Thus, you have a machine where the optimal pointer representation
differs depending on the type.
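
The two byte-pointer layouts discussed above can be modeled in portable C
to compare how they increment. This is a simulation with invented names,
not Nova code; it assumes 16-bit words and 15-bit word addresses, and it
says "high bit" rather than a bit number to avoid numbering conventions.

```c
#include <assert.h>
#include <stdint.h>

#define HIGH_BIT  0x8000u
#define WORD_MASK 0x7FFFu

/* Encoding A: byte index from location zero.
   The low bit selects the byte, the rest is the word address. */
uint16_t a_make(uint16_t word, unsigned byte)
{
    assert(word <= WORD_MASK);
    return (uint16_t)((word << 1) | (byte & 1u));
}
uint16_t a_word(uint16_t bp) { return bp >> 1; }
unsigned a_byte(uint16_t bp) { return bp & 1u; }
uint16_t a_next(uint16_t bp) { return (uint16_t)(bp + 1u); }

/* Encoding B: word address in the low 15 bits, byte select in the high bit. */
uint16_t b_make(uint16_t word, unsigned byte)
{
    assert(word <= WORD_MASK);
    return (uint16_t)(word | (byte ? HIGH_BIT : 0u));
}
uint16_t b_word(uint16_t bp) { return bp & WORD_MASK; }
unsigned b_byte(uint16_t bp) { return (bp & HIGH_BIT) != 0; }
uint16_t b_next(uint16_t bp)
{
    if (bp & HIGH_BIT)                       /* second byte: next word */
        return (uint16_t)((bp & WORD_MASK) + 1u);
    return (uint16_t)(bp | HIGH_BIT);        /* first byte: set the flag */
}
```

In this model a word pointer converts to encoding B unchanged but needs a
shift to reach encoding A, while incrementing is a plain add only in
encoding A.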

'Nuff said?
-- 

	Geoff Kuenning
	First Systems Corporation
	...!ihnp4!trwrb!desint!geoff

kpmartin@watmath.UUCP (Kevin Martin) (11/24/84)

>
>Before you go around slapping people and throwing cold water in their face,
>Kevin, you should be sure what you are doing.  Otherwise you are going to
>offend people.
>	Geoff Kuenning

I'm sorry if I seemed insulting. However, I get very frustrated at this
misconception that a pointer *has* to contain a byte index from location
zero.  The slapping etc. was not aimed at you in particular.

Having not used a NOVA for quite a while, my memory has become
somewhat rusty. Had I tried the code bursts on a piece of paper
first, I would have realized that they don't work. Please don't
mistake a short lapse of thought for a general ignorance of
the situation.

               (apologetically) Kevin Martin, UofW Software Development Group

jim@ISM780B.UUCP (11/28/84)

>/* Written  5:39 am  Nov  7, 1984 by rcd@opus in ISM780B:net.lang.c */
>/* ---------- "casts and puns (and digressions...)" ---------- */
>> ...  Do not be misled by the fact that some casts on some machines
>> generate no code.  A cast is a *conversion*, not a type pun...
>
>Unfortunately, not entirely true.  What C calls "casts" covers both
>true casts (in, say, the ALGOL 68 sense) and puns, at least in practice.
>The operation is a cast when conversion "makes sense" (in semantic terms);
>otherwise it's a pun.  For example, if i is an integer, (float)i is clearly
>a cast; it generates instructions on most machines and produces a different
>internal datum which is the float corresponding most closely to the
>integral value of i.  However, (long*)i is a pun; you just have to stretch
>too far to make any sensible semantics which makes it a true cast.

I quite disagree; even a bit-for-bit identity can be considered a conversion.
And there is no cast that might not generate code on some machine,
even pointer to integer or integer to pointer, since it may require
widening or truncation.  And, you are not even guaranteed a bit-for-bit
equivalence by the language spec (K&R, 14.4), only an "unsurprising" and
reversible mapping if the integer is large enough to hold the pointer.
Any assumption based on the notion of puns is likely to lead you
down the road to non-portability.
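
A minimal sketch of the portable part of that guarantee, in present-day C.
It assumes the implementation provides uintptr_t (an optional type from
<stdint.h> that postdates this thread); round_trip is an invented name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round-trip a pointer through an integer wide enough to hold it.
   The integer's value is implementation-defined; only the result of
   converting it back is something portable code may rely on. */
void *round_trip(void *p)
{
    uintptr_t n = (uintptr_t)p;
    return (void *)n;
}
```

Code that inspects or does arithmetic on the intermediate integer is back
in machine-dependent territory.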

>> >example 'struct foo {struct foo *ref;};'...
>> struct foo is not a recursive type.  At best you might call it an
>> iterative type.  What would you think of struct foo {struct foo bar;}; ?
>
>Please be careful when attacking other folks' terminology.  Structured
>types which contain references to themselves are quite commonly called
>"recursive".  (See, for example, Hoare's seminal paper "Notes on Data
>Structuring".)

Chastisement accepted.

-- Jim Balter, INTERACTIVE Systems (ima!jim)