[comp.std.c++] Conversions to/from void*, redux

chip@tct.uucp (Chip Salzenberg) (03/04/91)

[ Since we're talking about What Should Be, followups to comp.std.c++ ]

According to jimad@microsoft.UUCP (Jim ADCOCK):
>In article <27C95508.15D0@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>|According to jimad@microsoft.UUCP (Jim ADCOCK):
>|>If C++ is implemented on top of "object-oriented" architectures,
>|>it may well be that pointers to primitive types have a totally
>|>different representation than pointers to "Object" types.
>|
>|This kind of distinction, while conceptually possible, would seem to
>|rule out a user's writing "operator new" and "operator delete" ...
>
>No.  A C++ compiler in such a situation can accept definitions of
>operator new and operator delete with the void*, but in generating
>actual code return the true pointer type -- in actual application,
>new returns a pointer to an actual object type, not a void*.

Any compiler that is capable of correct behavior in the presence of a
user-defined "operator new" member function is in fact capable of
converting a |void*| to an |Object*|.  (Remember that the actual
function "Object::operator new()" is user-written code that does
arbitrary pointer things and actually _does_ generate a |void*|.)

If that same compiler is unable to support such a cast in normal code,
it is Brain Dead and not worth the floppy it's distributed on.

(Before disagreeing, remember that I am not asking for any
transformation except for |T*|->|void*|->|T*|.  In other words, you
have to convert back to the exact same type, or all bets are off.)
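
A minimal sketch of the round trip being asked for, written in present-day
C++ (static_cast postdates this thread).  The type Record and the helper
names are hypothetical, chosen only for illustration:

```cpp
#include <cassert>

// Hypothetical class type standing in for T.
struct Record { int value; };

// T* -> void*: an implicit conversion that drops the static type.
inline void* erase(Record* p) { return p; }

// void* -> T*: needs an explicit cast, and is only meaningful when the
// pointer really started life as a Record*.
inline Record* restore(void* vp) { return static_cast<Record*>(vp); }

// True when converting to void* and back yields the original pointer.
inline bool round_trip_preserves(Record* p) {
    return restore(erase(p)) == p;
}
```

Converting the void* back to any type other than the one it came from is
outside what this sketch covers, which matches the restriction stated above.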

>|>A void* might then be considered roughly equivalent to a char* ...
>|
>|Exactly equivalent, if compatibility with C libraries matters.
>
>Since when did C libraries use operator new and delete ???

This point is unrelated to "operator new".

As has been mentioned before, ANSI C requires that |void*| and |char*|
have the same representation.  If a C++ implementation does not
conform to this requirement, then compatibility with C libraries is
compromised.  That may or may not matter to you, of course, but it
matters to me.

>.. an "object oriented" CPU might go with a object#:member-inside-object#
>pointer representation ...  A |void*| may represent a conversion of such
>a object:member pair into an underlying "hardwired" address of such a
>machine's random access memory.  Inverting such an address back into
>object:member pair might be most prohibitive.

Prohibitive, but not impossible.  If it were impossible, then
user-written "operator new" and "operator delete" wouldn't work.
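
To make the argument concrete, here is a sketch of a class with user-written
allocation functions.  The member last_raw is a hypothetical bookkeeping
variable added only so the example can compare the void* produced by
"operator new" with the Object* the new-expression hands back; malloc/free
are just one possible backing store:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

struct Object {
    int value;
    static void* last_raw;   // last void* handed out (illustration only)

    // User code that does arbitrary pointer things and returns a void*.
    static void* operator new(std::size_t n) {
        void* p = std::malloc(n);
        if (!p) throw std::bad_alloc();
        last_raw = p;
        return p;
    }
    static void operator delete(void* p) { std::free(p); }
};

void* Object::last_raw = nullptr;

// The new-expression must turn that void* into a usable Object*.
inline bool new_expression_converts() {
    Object* obj = new Object{7};
    bool same = (static_cast<void*>(obj) == Object::last_raw)
                && obj->value == 7;
    delete obj;
    return same;
}
```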

>So, I believe people who support free-wheeling pointer-type hacking have
>a Un*x-like processor model in mind.  They can't understand why some pointer
>-type hacks *ought* to be prohibited -- since such pointer-type hacking is
>*obviously* merely conceptual.

That's an unnecessarily pejorative statement.  I know word-addressed
machines, for example, in which "sizeof(int*)" and "sizeof(void*)"
differ.  I know the Cray (from a distance :-)) which has no integer
arithmetic.  I know about Burroughs/Unisys (?) machines on which
function pointers are several words long.  I know a little about
capability machines.  Etc.

My point is that the C++ language itself is tied to fairly traditional
architectures unless and until user-written "operator new" and
"operator delete" member functions are removed or made optional.

>It's not *obvious* to me at all that we aren't signing C++'s death warrant some
>years down the line if "object-oriented" CPUs start catching on [like they
>*ought* to :-]

I suspect that Smalltalk is a better match for OOCPUs than C++ or
Objective C.  And no language can live forever, anyway.

>Thus, I'd like to see C++ be pretty conservative about what kinds of 
>type-hacking conforming compilers are *required* to support.

Outlaw "operator new" and "operator delete" and you'll get your wish.

jimad@microsoft.UUCP (Jim ADCOCK) (03/05/91)

In article <27D18D22.1608@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
|[ Since we're talking about What Should Be, followups to comp.std.c++ ]
|
|According to jimad@microsoft.UUCP (Jim ADCOCK):
|>In article <27C95508.15D0@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
|>|According to jimad@microsoft.UUCP (Jim ADCOCK):
|>|>If C++ is implemented on top of "object-oriented" architectures,
|>|>it may well be that pointers to primitive types have a totally
|>|>different representation than pointers to "Object" types.
|>|
|>|This kind of distinction, while conceptually possible, would seem to
|>|rule out a user's writing "operator new" and "operator delete" ...

I disagree -- and for the same reasons.  Operator new and delete are
*special* functions.  Even if specified by the user, there is no 
reason why a compiler couldn't generate two versions of operator new
and delete from the user specification.  For example, take the Cray.
A Cray C++ compiler might want to automatically generate two versions
of new and delete from the user specifications.  One version would assume
"byte" alignment, and the other word alignment.  The compiler would be
smart enough to intuit the right new/delete based on the type of the
object involved.

|If that same compiler is unable to support such a cast in normal code,
|it is Brain Dead and not worth the floppy it's distributed on.

I disagree and counter-claim that it is any programmer who insists on 
doing such pointer hacks who is brain-dead.

|(Before disagreeing, remember that I am not asking for any
|transformation except for |T*|->|void*|->|T*|.  In other words, you
|have to convert back to the exact same type, or all bets are off.)

I remember, and I still disagree.

|As has been mentioned before, ANSI C requires that |void*| and |char*|
|have the same representation.  If a C++ implementation does not
|conform to this requirement, then compatibility with C libraries is
|compromised.  That may or may not matter to you, of course, but it
|matters to me.

It is acceptable to me that void* and char* have "the same representation"
[whatever that means -- hopefully the ANSI-C committee were not trying
to specify *implementation* choices] -- as long as C++ then does not
*require* that it be possible to convert from char* to class X*.  In 
which case, both char* and void* might refer to a physical location in
RAM, and both char* and void* might be difficult or impossible to convert
back to a class X*.  

chip@tct.uucp (Chip Salzenberg) (03/07/91)

According to jimad@microsoft.UUCP (Jim ADCOCK):
>Operator new and delete are *special* functions.  Even if specified by the
>user, there is no reason why a compiler couldn't generate two versions of
>operator new and delete from the user specification.

I see no reason to set "operator new" and "operator delete" aside into
a set of "special" functions.  They are "special" only in that the
compiler calls them in contexts where their names are not explicitly
mentioned, and in C++ that's not anything special.  They're just plain
functions: you can take their address and call them indirectly, etc.
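
A sketch of what "take their address and call them indirectly" looks like
for a class-specific allocator.  Widget and the alias names are hypothetical;
the example relies only on the fact that member allocation functions are
static members, so ordinary function pointers can hold their addresses:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

struct Widget {
    int id;
    static void* operator new(std::size_t n) {
        void* p = std::malloc(n);
        if (!p) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) { std::free(p); }
};

using AllocFn = void* (*)(std::size_t);
using FreeFn  = void (*)(void*);

// Calls the allocation functions through pointers, with no new-expression.
inline bool call_indirectly() {
    AllocFn alloc   = &Widget::operator new;
    FreeFn  release = &Widget::operator delete;
    void* raw = alloc(sizeof(Widget));
    bool ok = (raw != nullptr);
    release(raw);
    return ok;
}
```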

>A Cray C++ compiler might want to automatically generate two versions
>of new and delete from the user specifications.  One version would assume
>"byte" alignment, and the other word alignment.  The compiler would be
>smart enough to intuit the right new/delete based on the type of the
>object involved.

First, I'd like to know what, _exactly_, the compiler would do
differently in the two compilations of "operator new".  To me,
such a double compilation seems wasteful and useless.

But let's assume that such an implementation existed, and let's assume
that there's a good reason for double compilation.  Now, put yourself
in the shoes of the implementor, and answer me this question: "If you
can compile my one function two ways, thus adapting for alignment
restrictions, why can't you make other casts of |void*| to |T*| work
by using the same trick?"

>In article <27D18D22.1608@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>|If that same compiler is unable to support such a cast in normal code,
>|it is Brain Dead and not worth the floppy it's distributed on.
>
>I disagree and counter-claim that it is any programmer who insists on 
>doing such pointer hacks who is brain-dead.

It's not so much that I insist on _doing_ such pointer conversions; if
they were non-portable, I would avoid them.  What I insist is that
there be a recognition that the ARM already implicitly constrains
implementations to be capable of them, unless the implementor is just
being contrary.

>|As has been mentioned before, ANSI C requires that |void*| and |char*|
>|have the same representation.  If ANSI C++ does not conform to this
>|requirement, then compatibility with C libraries is compromised.
>|That may or may not matter to you, of course, but it matters to me.
>
>It is acceptable to me that void* and char* have "the same representation"
>[whatever that means -- hopefully the ANSI-C committee were not trying
>to specify *implementation* choices]

In fact, they _do_ specify this particular implementation choice.  In
particular, the following code must output "yes" in an ANSI C
environment:

    #include <stdio.h>
    #include <stdlib.h>   /* exit() */
    #include <assert.h>
    #include <string.h>
    int main(void)
    {
        char c;
        char *cp = &c;
        void *vp = &c;
        /* same size and same bytes: same representation */
        assert(sizeof(cp) == sizeof(vp));
        puts(memcmp(&cp, &vp, sizeof(cp)) == 0 ? "yes" : "no");
        exit(0);
    }

Since ANSI C++ will be based on ANSI C, I find myself hard-pressed to
imagine a rationale for breaking it under ANSI C++.

> -- as long as C++ then does not *require* that it be possible to convert
>from char* to class X*.

But it already does require that conversion to be possible!  Remember
that the definition of "object" is "region of storage" (section 3),
and read from ARM section 5.4, with my comments in [brackets]:

    It is guaranteed that a pointer to an object of a given size
    [such as class T] may be converted to a pointer to an object
    of the same or smaller size [such as char] and back again
    without change.

QED.
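
A sketch of the conversion the quoted passage covers, in present-day syntax
(reinterpret_cast postdates the ARM).  Sample is a hypothetical class type
larger than char:

```cpp
#include <cassert>

struct Sample { double d; int i; };

// Object pointer -> char* -> object pointer.
inline Sample* via_char(Sample* p) {
    char* cp = reinterpret_cast<char*>(p);   // pointer to a smaller object
    return reinterpret_cast<Sample*>(cp);    // and back to the original type
}
```

The sketch only converts back to the original type; it says nothing about
converting an arbitrary char* to a Sample*.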
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
   "All this is conjecture of course, since I *only* post in the nude.
    Nothing comes between me and my t.b.  Nothing."   -- Bill Coderre

dhoyt@vx.acs.umn.edu (DAVID HOYT) (03/08/91)

In article <27D5708A.29CF@tct.uucp>, chip@tct.uucp (Chip Salzenberg) writes...
> ... about void* equivalent to char*
>Since ANSI C++ will be based on ANSI C, I find myself hard-pressed to
>imagine a rationale for breaking it under ANSI C++.

  The reason that ANSI made this true (mostly, anyway) is because there
existed a huge body of code that depended on this equivalence.  Remember
K&R didn't have a void type in the first place.  This forces word machines,
such as the Crays, either to 1) always create slow code, or 2) supply a
compiler switch to produce slow (but compatible) code or create code
that is fast, but will break some code that makes stupid assumptions
about void* == char*.

  But with C++ this is not the case.  Anyone who has used a char* when
a void* was needed was, in no uncertain terms, stupid.  I would like to think
that the grand majority of code out there in C++ doesn't make this mistake.
So there is no good reason to support this equivalence in C++.  The standard
should declare such code to be buggy and give the language implementors the
freedom to produce the best possible code for their target machines.

  But what do I know?  -- david | dhoyt@vx.acs.umn.edu

chip@tct.uucp (Chip Salzenberg) (03/09/91)

According to dhoyt@vx.acs.umn.edu:
>In article <27D5708A.29CF@tct.uucp>, chip@tct.uucp (Chip Salzenberg) writes...
>>[ANSI C requires |void*| and |char*| to be identically represented]
>
>The reason that ANSI made this true (mostly, anyway) is because there
>existed a huge body of code that depended on this equivalence.

This statement is patently false, as proven by the next statement,
which is true:

>K&R didn't have a void type in the first place

It is impossible for K&R programs to depend on |void*|, since they
don't mention |void*|.

Point 1: ANSI created |void*| explicitly for the purpose of holding
any data address whatsoever.  That's why, for example, the first two
arguments to memcpy() are both of type |void*|.  ANSI C++ cannot
change this most basic characteristic of |void*| without breaking
compatibility with many ANSI C programs and large portions of the ANSI
C library.

Point 2: The ARM promises that a |char*| may hold any data address
whatever.  This is guaranteed in the section I quoted in the
referenced article, which asserts that a pointer to any object may be
cast to a pointer to any object of equal or smaller type, and then
back to its original type, without problem.

Conclusion: Since |void*| and |char*| must both be able to hold any
address, the C requirement that they have identical representations is
entirely reasonable for ANSI C++.

>This forces word machines, such as the Cray's, either 1) always create
>slow code ...

Please describe -- exactly -- how |void*|-|char*| equivalence forces
machines with word addressing into generating "slow code".

(Of course, on word-addressed machines, a |T*| is typically only one
word, while a |void*| is larger, to specify the exact byte within the
addressed word.  But that would always be true, regardless of the
implementation of |char*|.)

>But what do I know?

An excellent question.
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
 "Most of my code is written by myself.  That is why so little gets done."
                 -- Herman "HLLs will never fly" Rubin

dhoyt@vx.acs.umn.edu (DAVID HOYT) (03/09/91)

In article <27D8437B.F13@tct.uucp>, chip@tct.uucp (Chip Salzenberg) writes...
>According to dhoyt@vx.acs.umn.edu:
>>The reason that ANSI made this true (mostly, anyway) is because there
>>existed a huge body of code that depended on this equivalence.
> 
>This statement is patently false, as proven by the next statement,
>which is true:
  You missed my point.  The only reason that you can point to any object
with a char*, and sizeof( char ) == 1, is because K&R didn't have void.

>Point 1: ANSI created |void*| explicitly for the purpose of holding
>any data address whatsoever.  That's why, for example, the first two
>arguments to memcpy() are both of type |void*|.  ANSI C++ cannot
>change this most basic characteristic of |void*| without breaking
>compatibility with many ANSI C programs and large portions of the ANSI
>C library.
  C++ is not C.  C++ breaks many compatibility rules for Fortran.  So what?
They are different languages, even if they both have a common root in
Von Neumann (sp?).  C++ programs that are written as if the language were
C are pretty much the same as C programs written as if they are Fortran
programs: silly.  Are you using char*'s where void*'s are needed, either
in ANSI C or in C++?  I doubt it.  Then why should anyone care if char*
can not point to any object?  Also it is not quite true that char* can
point to anything; they can not point to functions in ANSI C.  The
char* == void* equivalence in ANSI C is only there because there is a large
body of code that uses char* to point to anything.  Void*'s can also point
to anything.  char* == a, void* == a, void* == char*.  Blech.

>Please describe -- exactly -- how |void*|-|char*| equivalence forces
>machines with word addressing into generating "slow code".
  I suppose you are right here.  I was thinking that void*'s would be
optimised by placing them in 24 bit address registers, if they were known
to point to aligned structures.  There is of course no reason that
couldn't be done for char*'s too.  As you might have guessed, I think
that char*'s should point to chars and void*'s should point to a generic
object.  If that distinction is allowed in the language the cray compiler
writers can say "No char*'s?  Great, everything is aligned.  Don't even
bother to check bits 64:62 for an offset in a word."

>(Of course, on word-addressed machines, a |T*| is typically only one
>word, while a |void*| is larger, to specify the exact byte within the
>addressed word.  But that would always be true, regardless of the
>implementation of |char*|.)
  On Cray's word addressed machines, address registers are 24 or 32 bits.
There is no such thing as a nonaligned address.  By convention the top three
bits are used as the offset of a character within a word.  If something
can be guaranteed to be aligned a pointer can be stored in an address
register, otherwise it must be stored in memory, which is slow, or a data
register, of which there are few.
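
A rough model of the convention described above, simulated with plain
64-bit integers.  The exact layout is an assumption made for illustration
(byte offset in the top three bits, word address in the rest); it is not
taken from Cray documentation, and all names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: bits 63..61 = byte offset within the word,
// remaining bits = word address.
constexpr int kOffsetShift = 61;
constexpr std::uint64_t kWordMask =
    (std::uint64_t{1} << kOffsetShift) - 1;

constexpr std::uint64_t make_char_ptr(std::uint64_t word, unsigned byte) {
    return (std::uint64_t{byte & 7u} << kOffsetShift) | (word & kWordMask);
}

constexpr std::uint64_t word_of(std::uint64_t p) { return p & kWordMask; }

constexpr unsigned byte_of(std::uint64_t p) {
    return static_cast<unsigned>(p >> kOffsetShift);
}

// Only a pointer with a zero offset could live in a pure word-address
// register under this model.
constexpr bool is_word_aligned(std::uint64_t p) { return byte_of(p) == 0; }
```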

   I'm just bitter because a bad design feature in the original C language
causes me headaches when I port to Crays.  Neither I nor those whom I
respect write code that uses char*'s when void*'s are needed.  Nor
do they assume sizeof( char[ 1 ] ) == 1 != sizeof( char[ 2 ] ), etc.
This makes it relatively easy to port code to Crays.  But the Cray
compilers still have extra baggage in terms of complexity that is not needed.


>>But what do I know?
> 
>An excellent question.
 Thanks.

  david | dhoyt@vx.acs.umn.edu

chip@tct.uucp (Chip Salzenberg) (03/12/91)

According to dhoyt@vx.acs.umn.edu:
>The only reason that you can point to any object with a char*, and
>sizeof( char ) == 1, is because K&R didn't have void.

You mean, it didn't have |void*|.  True enough.  But as this point
addresses the reasons for the language rules, instead of the rules
themselves and their implications, it is irrelevant.

>In article <27D8437B.F13@tct.uucp>, chip@tct.uucp (Chip Salzenberg) writes:
>>Point 1: ANSI created |void*| explicitly for the purpose of holding
>>any data address whatsoever.  ...  ANSI C++ cannot change this most
>>basic characteristic of |void*| without breaking compatibility with
>>many ANSI C programs and large portions of the ANSI C library.
>
>C++ is not C.  C++ breaks many compatibility rules for Fortran.  So what?

Please re-read the key phrase of the above-quoted paragraph:

   "... without breaking compatibility with many ANSI C programs
    and large portions of the ANSI C library."

You may not care about compatibility with ANSI C.  Bjarne obviously
cared, or he wouldn't have made C++ an almost-exact superset of C.  I
care too.  I am therefore glad that the ANSI C++ committee has already
decided that where the ARM does not specify a language feature, the
ANSI C standard shall hold sway.  (That's an inexact description, but
the gist is true.)

>Are you using char*'s where void*'s are needed, either in ANSI C or in
>C++?  I doubt it.

Implicitly, sure I am.  Calling "char *p = new char[n]" is the only
way in C++ to get a block of uninitialized bytes, such as for use in
the smart array class that is currently at work in real code.

>Also it is not quite true that char* can point to anything; they can
>not point to functions in ANSI C.

My article, which you quoted above (apparently without reading it),
said in part:

    "ANSI created |void*| explicitly for the purpose of holding
     any data address whatsoever."
         ^^^^

>char* == void* equivalence in ANSI C is only there because there is a large
>body of code that uses char* to point to anything.  Void*'s can also point
>to anything.  char* == a, void* == a, void* == char*.  Blech.

Maybe you don't like it, but that's the way it is according to ANSI.
And it's not going to change.  Use it or leave it.

>I think that char*'s should point to chars and void*'s should point to a
>generic object.

ANSI guarantees that "sizeof(char) == 1".  Therefore, the sizes of all
objects must of necessity be multiples of the size of a char.  This
fact leads to the unavoidable conclusion that a |char*| must be able
to hold the address of any object.  In other words, |char*| is a
generic pointer, just like |void*|.
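
A small sketch of that reasoning: because every object is a whole number of
chars, a char* can step through the storage of any object.  Pair and
copy_bytes are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cstddef>

static_assert(sizeof(char) == 1, "guaranteed by the language");

struct Pair { int a; int b; };

// Copy any object one char at a time through a char* view of its storage.
inline void copy_bytes(void* dst, const void* src, std::size_t n) {
    char* d = static_cast<char*>(dst);
    const char* s = static_cast<const char*>(src);
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = s[i];
    }
}
```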

>If that distinction is allowed in the language the cray compiler
>writers can say "No char*'s?  Great, everything is aligned.  Don't even
>bother to check bits 64:62 for an offset in a word."

<chuckle>  What if the |void*| points at the second char in an array?

>Neither I nor those whom I respect write code that uses char*'s
>when void*'s are needed.  Nor do they assume sizeof( char[ 1 ] ) == 1
> != sizeof( char[ 2 ] ), etc.

Then you're not programming in ANSI C, which guarantees all the things
you write above.  No wonder you have so many misconceptions about the
language.
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
 "Most of my code is written by myself.  That is why so little gets done."
                 -- Herman "HLLs will never fly" Rubin

jimad@microsoft.UUCP (Jim ADCOCK) (03/12/91)

In article <27D5708A.29CF@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
|According to jimad@microsoft.UUCP (Jim ADCOCK):
|>Operator new and delete are *special* functions.  Even if specified by the
|>user, there is no reason why a compiler couldn't generate two versions of
|>operator new and delete from the user specification.
|
|I see no reason to set "operator new" and "operator delete" aside into
|a set of "special" functions.  They are "special" only in that the
|compiler calls them in contexts where their names are not explicitly
|mentioned, and in C++ that's not anything special.  They're just plain
|functions: you can take their address and call them indirectly, etc.

The reason to make them "really special" [they are already special -- 
see ARM page 261] is for the same reason that constructors and destructors
are special -- that implementations can reasonably be expected to 
exist that don't use normal function calls for implementing such things.

I'm proposing as a general rule that programmers not be able to take the
address of any special function directly.  If they need to take the address
of special functions, then they can explicitly lay down a normal function
that invokes the special function, and take the address of the normal
function.
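
A sketch of the workaround proposed here: an ordinary function is laid down
that invokes the special function, and the address is taken of the ordinary
function.  The wrapper names op_new and op_delete are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Ordinary functions that forward to the allocation functions.
inline void* op_new(std::size_t n) { return ::operator new(n); }
inline void  op_delete(void* p)    { ::operator delete(p); }

// The wrappers are plain functions, so their addresses can be taken
// and called through like any other.
inline bool wrappers_are_addressable() {
    void* (*alloc)(std::size_t) = &op_new;
    void  (*release)(void*)     = &op_delete;
    void* raw = alloc(16);
    bool ok = (raw != nullptr);
    release(raw);
    return ok;
}
```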

Quote ARM Page 265:

"The reason one cannot take the address of a constructor is that constructors
have semantics that are closely tied to the semantics of memory allocation in
all its varieties ...."

....Sounds to me like an equally valid reason to not allow taking the address
of new/delete.

|But let's assume that such an implementation existed, and let's assume
|that there's a good reason for double compilation.  Now, put yourself
|in the shoes of the implementor, and answer me this question: "If you
|can compile my one function two ways, thus adapting for alignment
|restrictions, why can't you make other casts of |void*| to |T*| work
|by using the same trick?"

Because to do so assumes that the method required for going from a T* to a 
void* is easily invertible.  If it is not easily invertible, then one cannot 
expect compilers/implementations to do it.

|It's not so much that I insist on _doing_ such pointer conversions; if
|they were non-portable, I would avoid them.  What I insist is that
|there be a recognition that the ARM already implicitly constrains
|implementations to be capable of them, unless the implementor is just
|being contrary.

If you do not _insist_ on _doing_ such pointer conversions, then there is
no reason to _require_ all C++ compilers to support them.  Rather, on 
un*x-like systems where such conversions are easy to do, compilers can
continue to support them, and on non-un*x-like systems where such conversions
are difficult, compilers could choose not to support them.  Thus, conversion
from primitive type to non-primitive type would become implementation-
dependent.  This wouldn't break any existing code [which runs on un*x-like 
machines].

|>|As has been mentioned before, ANSI C requires that |void*| and |char*|
|>|have the same representation.  If ANSI C++ does not conform to this
|>|requirement, then compatibility with C libraries is compromised.
|>|That may or may not matter to you, of course, but it matters to me.

This is rightly a quality-of-implementation issue.  Vendors who have an
existing "Language-X" implementation that they wish to be compatible with,
can choose to implement C++ to be compatible with that implementation.
Vendors who do not have an existing "Language-X"  implementation that 
they want to be compatible with should not have to be so restricted.  
Why should the ANSI-C++ committee "mandate" conformance with any
particular "Language-X"?  What would it *mean* to be "Language-X" conformant
on a system that doesn't even *support* "Language-X" ?

|>It is acceptable to me that void* and char* have "the same representation"
|>[whatever that means -- hopefully the ANSI-C committee were not trying
|>to specify *implementation* choices]
|
|In fact, they _do_ specify this particular implementation choice.  In
|particular, the following code must output "yes" in an ANSI C
|environment:
|
|    #include <stdio.h>
|    #include <stdlib.h>   /* exit() */
|    #include <assert.h>
|    #include <string.h>
|    int main(void)
|    {
|        char c;
|        char *cp = &c;
|        void *vp = &c;
|        /* same size and same bytes: same representation */
|        assert(sizeof(cp) == sizeof(vp));
|        puts(memcmp(&cp, &vp, sizeof(cp)) == 0 ? "yes" : "no");
|        exit(0);
|    }
|
|Since ANSI C++ will be based on ANSI C, I find myself hard-pressed to
|imagine a rationale for breaking it under ANSI C++.

Imagine this rationale:  The ANSI-C++ committee sees the light and chooses
not to follow the ANSI-C mistake of trying to specify *implementation,*
but rather restricts itself to specifying *language*.

What does it mean to the ANSI-C *Language* to say that void*'s and char*'s
have the "same representation" ???  I claim: nothing.  Likewise, for
example, I claim it means nothing to the C++ *Language* to say that
member addresses within a labeled section must have ordered addresses --
there's no way to write a strictly conforming program that makes use of
this "feature."

|> -- as long as C++ then does not *require* that it be possible to convert
|>from char* to class X*.
|
|But it already does require that conversion to be possible!  Remember
|that the definition of "object" is "region of storage" (section 3),
|and read from ARM section 5.4, with my comments in [brackets]:
|
|    It is guaranteed that a pointer to an object of a given size
|    [such as class T] may be converted to a pointer to an object
|    of the same or smaller size [such as char] and back again
|    without change.

I know what it says.  What *I* said is that I would support the idea
of requiring void* and char* to have the same representation iff it is 
not *required* to be able to convert from char* to class X*.  In general,
I do not think C++ should *require* that pointers to primitive objects
be convertible to pointers to objects of class type.  This in no way
prevents un*x-like C++ compilers from supporting such casting.  It
just leaves the door open for non-un*x-like implementations to exist.

jimad@microsoft.UUCP (Jim ADCOCK) (03/12/91)

In article <3568@ux.acs.umn.edu> dhoyt@vx.acs.umn.edu writes:
|In article <27D5708A.29CF@tct.uucp>, chip@tct.uucp (Chip Salzenberg) writes...
|> ... about void* equivalent to char*

Actually, about conversions from pointers to primitives to pointers
to non-primitives.  Also, relating to present discussions about
run-time type testing:

Consider a C++ implementation / CPU pair that uses machine "RAM" addresses as
the representation of "primitive" pointers such as void*, char*, int*,
"C" struct*, etc, but which uses tagged pointers for the representation
of [at least] pointers/references to objects of classes with virtual functions.

I claim C++ *ought to* allow such implementations.  Such implementations 
could be interesting *at least* on lisp-machines, Rekursiv-like "OO"
architectures, and on risc machines with 48-bit pointers.  On at least
some such machines, the tagged pointers might have 48-bit representations,
the primitive pointers have 32-bit representations.  The 48-bit pointers
would be type-tag + ram-address, whereas the 32-bit pointers would be
ram-address only.  Thus conversion from class X* to char* [for example]
really would represent loss of information, that would be difficult or
impossible to recover.  -- Note: due to polymorphism, the tag part of
a class X* does not necessarily correspond to "type X" -- it might
correspond to type class XD, where XD is a class derived from class X.
Note that in pointer-tagged implementations, there would not typically
be a redundant type-tag within the object structure itself -- the whole
point of moving the tag to the pointer is to make dispatch faster, by avoiding
an indirection.
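
A toy model of the scheme just described, to show where the information
goes missing.  The field widths follow the 48-bit / 32-bit split above;
the struct and function names are hypothetical, and the pointers are
simulated with integers rather than real machine addresses:

```cpp
#include <cassert>
#include <cstdint>

// Simulated 48-bit tagged pointer: 16-bit type tag + 32-bit RAM address.
struct TaggedPtr {
    std::uint16_t tag;
    std::uint32_t address;
};

// Conversion to a "primitive" pointer keeps only the RAM address.
constexpr std::uint32_t to_primitive(TaggedPtr p) { return p.address; }

// Converting back needs a tag from somewhere; the address alone cannot
// say whether the object was an X or something derived from X.
constexpr TaggedPtr from_primitive(std::uint32_t address,
                                   std::uint16_t tag) {
    return TaggedPtr{tag, address};
}
```

Two tagged pointers with different tags but the same address collapse to
the same primitive pointer, which is the loss of information at issue.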

Thus I claim C++ implementations should not be *required* to support 
back-casting from pointer-to-primitive to pointer-to-non-primitive.
Such back-casting should be considered strictly implementation dependent.

Comments?  

Should the C++ language be specified in such a way so as to prohibit, 
for all time, any such implementations/CPUs ???

chip@tct.uucp (Chip Salzenberg) (03/15/91)

According to jimad@microsoft.UUCP (Jim ADCOCK):
>I claim C++ *ought to* allow such implementations [that distinguish
>between primitive types and user-defined types].  Such implementations 
>could be interesting *at least* on lisp-machines, Rekursiv-like "OO"
>architectures, and on risc machines with 48-bit pointers.

Agreed on the "interesting" count, except for the last: RISC machines
are quite well cared for by the current language definition.

I must admit that it is not until now that I realize the purpose of
Jim's statements.  He does not speak to what C++ *is* (the subject on
which I have concentrated), but of what it *should be* (in his
opinion, of course).  I apologize to all readers, especially Jim, for
previous articles which may have been unnecessarily critical due to my
not having grasped this point.

>On at least some such machines, the tagged pointers might have 48-bit
>representations, the primitive pointers have 32-bit representations.
>The 48-bit pointers would be type-tag + ram-address, whereas the 32-bit
>pointers would be ram-address only.  Thus conversion from class X* to
>char* [for example] really would represent loss of information ...

Ah, a concrete example!  Thank you; it is quite helpful.  It sounds
very interesting, too, with a lot of potential for error checking,
among other things.  I'll keep it in mind.

>Should the C++ language be specified in such a way so as to prohibit, 
>for all time, any such implementations/CPUs ???

It is a truism that no language can be all things to all people.
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
 "Most of my code is written by myself.  That is why so little gets done."
                 -- Herman "HLLs will never fly" Rubin

chip@tct.uucp (Chip Salzenberg) (03/15/91)

According to jimad@microsoft.UUCP (Jim ADCOCK):
>The reason to make ["operator new" and "operator delete"] "really special"
>[is] that implementations can reasonably be expected to exist that don't
>use normal function calls for implementing such things.

Any such implementation is free to create a wrapper when the
programmer requests such a function's address.  This solution would
simplify the programming model by making all functions addressable.
But, in truth, the issue of function addressing is relatively minor; I
don't think any programmer would resent "void *op_new(size_t n)".

>If you do not _insist_ on _doing_ such pointer conversions, then there is
>no reason to _require_ all C++ compilers to support them.

My only point all this time has been that the ARM, *right now*,
basically requires that |T*| to |void*| conversion be reversible.
Note that I take no sides on whether that requirement is a Good Thing
or not.  Nevertheless, what the ARM guarantees, I will use.

>According to chip@tct.uucp (Chip Salzenberg):
>|As has been mentioned before, ANSI C requires that |void*| and |char*|
>|have the same representation.  If ANSI C++ does not conform to this
>|requirement, then compatibility with C libraries is compromised.
>
>This is rightly a quality-of-implementation issue.

It is only a QOI issue if the ANSI committee punts.  They may yet take
a firm stand.  Reports of the death of C compatibility as an ANSI C++
issue are greatly exaggerated.

>What does it mean to the ANSI-C *Language* to say that void*'s and char*'s
>have the "same representation" ???  I claim: nothing.

Actually, it has a very important meaning: In the absence of function
prototypes, a formal parameter and an actual parameter are considered
to be correctly typed even if one is |void*| and the other is |char*|.
For example, this program is guaranteed to work under ANSI C:

   f1.c:

      main()
      {
          blah("howdy");
      }

   f2.c:

      #include <stdio.h>
      blah(void *p)
      {
          puts(p);
      }

See?  It *is* a language issue, after all.
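
For readers who want to poke at "same representation" on their own
system, here is a rough check written in C++ (C++11 or later for
static_assert and alignof).  The function names are hypothetical.  The
byte comparison is an assumption about typical implementations: it
treats "same representation" as "same bytes for the same address",
which ignores exotic cases such as padding bits inside pointers.

```cpp
#include <cstring>

// Size and alignment must agree if the two types share a representation.
static_assert(sizeof(void *) == sizeof(char *),
              "void* and char* differ in size");
static_assert(alignof(void *) == alignof(char *),
              "void* and char* differ in alignment");

// Stores the same address as a char* and as a void*, then compares
// the stored bytes of the two pointer objects.
bool same_bytes(char *s) {
    char *as_char = s;
    void *as_void = s;
    return std::memcmp(&as_char, &as_void, sizeof as_char) == 0;
}

// Runs the comparison on a local string.
bool demo_same_representation() {
    char text[] = "howdy";
    return same_bytes(text);
}
```

A check like this only reports what one implementation does; the
guarantee itself comes from the standard, not from the experiment.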

>I claim it means nothing to the C++ *Language* to say that member
>addresses within a labeled section must have ordered addresses ...

Well, that guarantee can also be used.  An example using offsetof()
comes to mind; however, it is too contrived for me to post without
embarrassment.  [:-}]  But, that is a fight for another day.
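
Purely as an illustration of what the ordering guarantee buys (this is
not the example alluded to above), a sketch with offsetof().  Record
and members_are_ordered are hypothetical names.  All three members sit
in one section with no intervening access specifier, so later members
are expected at higher addresses.

```cpp
#include <cstddef>

// Three members in a single section, declared in this order.
struct Record {
    char   tag;
    int    count;
    double value;
};

// True when each member's offset is greater than its predecessor's.
bool members_are_ordered() {
    return offsetof(Record, tag) < offsetof(Record, count)
        && offsetof(Record, count) < offsetof(Record, value);
}
```

Code that walks a struct by offset, or that relies on the first member
sitting at offset zero, depends on this kind of guarantee.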

jimad@microsoft.UUCP (Jim ADCOCK) (03/19/91)

In article <27E01E50.64DF@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
|   f1.c:
|
|      main()
|      {
|          blah("howdy");
|      }
|
|   f2.c:
|
|      #include <stdio.h>
|      blah(void *p)
|      {
|          puts(p);
|      }
|
|See?  It *is* a language issue, after all.

If this *is* a language issue -- then it points out that the issue
of whether or not char* and void* have the "same representation"
is moot -- because in C++ this example would fail to link -- even on
implementations where char* and void* have the same representation.
Thus I reiterate my statements that the C++ committee ought to stay
focused on specifying *language* not *implementation.*

chip@tct.uucp (Chip Salzenberg) (03/20/91)

According to jimad@microsoft.UUCP (Jim ADCOCK):
>In article <27E01E50.64DF@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>|
>|   f1.c:
>|      main() { blah("howdy"); }
>|
>|   f2.c:
>|      #include <stdio.h>
>|      blah(void *p) { puts(p); }
>
>If this *is* a language issue -- then it points out that the issue
>of whether or not char* and void* have the "same representation"
>is moot -- because in C++ this example would fail to link ...

Wrong on both points.  Declare blah() as 'extern "C"' in both modules,
and this example will link (and run) just fine.  Or do you wish to
wave a magic wand and make 'extern "C"' go away, too?
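
A sketch of the 'extern "C"' declaration, assuming both modules are
collapsed into one file so that it compiles on its own.  The return
type and the const qualifier are additions needed for C++, and
call_blah is a hypothetical helper.  Note what the single-file form
cannot show: in the two-module original, f1 passes a |char*| where f2
expects a |void*|, and it is that mismatch across modules that leans on
the two types having the same representation.

```cpp
#include <cstdio>

// C linkage: the name is not mangled, so a C module (or another C++
// module with the same declaration) can link against it.
extern "C" int blah(const void *p);

// Definition with matching linkage; prints the string it is handed.
extern "C" int blah(const void *p) {
    return std::puts(static_cast<const char *>(p));
}

// Calls blah() the way main() does in f1.c and reports whether
// puts() succeeded (it returns a non-negative value on success).
bool call_blah() {
    return blah("howdy") >= 0;
}
```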

>Thus I reiterate my statements that the C++ committee ought to stay
>focused on specifying *language* not *implementation.*

Fine, as long as you always emphasize that word "ought," and don't
criticize people for using the language as it is specified.
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
   "All this is conjecture of course, since I *only* post in the nude.
    Nothing comes between me and my t.b.  Nothing."   -- Bill Coderre

jimad@microsoft.UUCP (Jim ADCOCK) (03/28/91)

In article <27E765DF.12BD@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
|According to jimad@microsoft.UUCP (Jim ADCOCK):
|>In article <27E01E50.64DF@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
|>|
|>|   f1.c:
|>|      main() { blah("howdy"); }
|>|
|>|   f2.c:
|>|      #include <stdio.h>
|>|      blah(void *p) { puts(p); }
|>
|>If this *is* a language issue -- then it points out that the issue
|>of whether or not char* and void* have the "same representation"
|>is moot -- because in C++ this example would fail to link ...
|
|Wrong on both points.  Declare blah() as 'extern "C"' in both modules,
|and this example will link (and run) just fine.  Or do you wish to
|wave a magic wand and make 'extern "C"' go away, too?

No I don't want to make 'extern "XYZ"' go away -- I just want people
to recognize such for what it is -- a totally implementation dependent
section of code that cannot be properly considered "C++" code.

Especially note that on vendor systems that don't have a matching C compiler,
an 'extern "C"' section would be meaningless -- except they might
choose to make such a declaration a NO-OP, on the off chance that
they might be able to accept such code "successfully."

Again, I point out that 'extern "XYZ"' sections must be considered
strictly implementation dependent, and thus cannot be rightly considered
"C++" code.  If FooBar Corp creates a C++ compiler with some implementation
dependent features, and someone writes code making use of those 
implementation features, is that "C++" code?  I claim that only
the portable parts of the code can be considered "C++" -- the rest
of the code is more properly thought of as "FooBar" code.  "C++" is
that relatively small part of a total compiler implementation that
the vast majority of independent compiler vendors and C++ programmers
can agree on.

On some systems 'extern "XYZ"' will need to do more than change the 
mechanisms of name mangling -- it may also change link mechanisms,
calling conventions, word sizes, keyword extensions recognized, etc.
Thus, as far as I'm concerned, a void* in an 'extern "XYZ"' section
may be a totally different beast from a void* in a C++ section.  So I
consider 'extern "XYZ"' meaningless for this discussion.

|>Thus I reiterate my statements that the C++ committee ought to stay
|>focused on specifying *language* not *implementation.*
|
|Fine, as long as you always emphasize that word "ought," and don't
|criticize people for using the language as it is specified.

I criticize people who think the language *is* specified.  It is not.
Some people seem to be saying C++ == ARM + Cfront + ANSI-C

I disagree.  I claim C++ == ARM - annotations

As of today, anything outside of (ARM - annotations) is not C++,
but rather implementation dependencies.

Perhaps someday the ANSI-C++ committee will issue documentation that
includes constraints -- like the ANSI-C specs have -- constraints specifying
"errors" that a conforming C++ compiler has to correctly reject in order to be 
conforming.  Until such a time, pretty much "anything goes."