[comp.lang.c++] type/member tags

craig@gpu.utcs.utoronto.ca (Craig Hubley) (02/20/91)

In article <1991Feb19.191731.4137@pa.dec.com> lattanzi@decwrl.dec.com (Len Lattanzi) writes:
>In article <65451@brunix.UUCP> sdm@cs.brown.edu (Scott Meyers) writes:
>:In article <607@taumet.com> steve@taumet.com (Stephen Clamage) writes:
>:| >imagine a database query:
>:| 
>:| >	"from the list of shapes, show me all the circles of diameter > 30"
>:| 
>:| All you have to do is make a virtual function diameter() for all Shapes.

Yes, the existing C++ mechanisms provide a way to do this.  But there was
also a way to do templates with macros, which was prone to errors and abuse.
So templates were blessed.  I would suggest the same thing is true with this.
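
To make the suggestion concrete, here is a minimal sketch of the
virtual-function approach.  The class layout, the default return value
of 0 for shapes without a diameter, and the count_large_circles() helper
are all assumptions made for illustration; none of them come from the
thread.

```cpp
#include <cstddef>
#include <vector>

class Shape {
public:
    virtual ~Shape() {}
    // Default for shapes that have no meaningful diameter (an assumption
    // of this sketch; the thread does not say what non-circles return).
    virtual double diameter() const { return 0.0; }
};

class Circle : public Shape {
public:
    explicit Circle(double d) : d_(d) {}
    virtual double diameter() const { return d_; }
private:
    double d_;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double side() const { return side_; }
private:
    double side_;
};

// "From the list of shapes, count all the circles of diameter > 30."
std::size_t count_large_circles(const std::vector<const Shape*>& shapes) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < shapes.size(); ++i)
        if (shapes[i]->diameter() > 30.0) ++n;
    return n;
}
```

The query works without any type test, but only because every Shape now
carries a member that is meaningful for circles alone.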

>:| A function which expects to operate on a class of type Shape may
>:| reasonably expect all the functionality of Shape to be present.  It may
>:| not reasonably expect specific behaviors of classes derived from Shape
>:| to be present.
>:
>:This is of course the right way to go, and most of the time it is a
>:workable solution, but there are times when it simply won't do.

Unfortunately many of us, including Bjarne, seem to be stuck in 
a mold of believing a general theory, like strong typing or object-
oriented hierarchies, and acknowledging exceptions to it instead of
reconsidering the theory.  So far as I am concerned, strong typing
means little in a language with conversion, and hierarchy is not the
only blessed way to share specifications or code, as templates prove.

>:I personally believe that we need language support for this kind of thing,
>:but it would be so prone to abuse and Bjarne is so dead-set against it that
>:I don't hold out much hope that it will be seriously considered by the ANSI

There are ways to cut the abuse.  Consider these possible solutions:

PROBLEM	accessing features that do not exist on all of the possible types of
	a passed object, e.g. "is this a circle of diameter > 30 ?"

TODAY	explicit type tag
	- inconsistent across user programs
	- cannot be added to imported classes

SOL.1	type switch, similar to value switch
	- functions specific to more specialized class available in case-arm
	- class name used to distinguish type of object

	class_switch x {
		case circle:	
			if (x.diameter > 30) then {
				...
			}
	}

SOL.2	.class member, similar to .this member
	- public data member class, read-only
	- class name usable only in equivalence expression
	- equivalence rules for derived types
	
	if ((x.class == circle) && (((circle)x).diameter > 30)) then {
		...
	}

SOL.3	conditional cast, similar to catch
	- if cast is illegal, no action, entire expression returns false
	- if cast is legal, it occurs and rest of block is executed
	- requires new syntax (here represented as double parentheses)
	
	if ( ((circle)) x.diameter > 30) then {
		...
	}

SOL.4	as above, only testing for presence of members, not name of classes
	- avoids dependencies on current class/member structure
	- works for inherited members
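
For comparison, the conditional cast of SOL.3 is close to what standard
C++ later provided as dynamic_cast: a failed pointer cast yields a null
pointer, so the guarded block is skipped.  The sketch below uses that
later facility, not the double-parenthesis syntax proposed here, and the
class names are again hypothetical.

```cpp
#include <cstddef>
#include <vector>

class Shape {
public:
    virtual ~Shape() {}   // a virtual member makes the class polymorphic
};

class Circle : public Shape {
public:
    explicit Circle(double d) : diameter(d) {}
    double diameter;
};

class Square : public Shape {};

std::size_t count_large_circles(const std::vector<Shape*>& shapes) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < shapes.size(); ++i) {
        // Conditional cast: c is null unless shapes[i] really is a Circle
        // (or something derived from Circle).
        if (const Circle* c = dynamic_cast<const Circle*>(shapes[i])) {
            if (c->diameter > 30.0) ++n;
        }
    }
    return n;
}
```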

The above would require anyone who really intended to abuse the information
to continue to keep his/her own type tags, so this abuse would not be
"blessed".  In fact, I think opportunities for abuse would markedly diminish
if one of the above mechanisms were added.  After all, 95% of your need for
a type tag would be covered, and why would you invent your own mechanism
for that abusive 5% ?  However, if you have to invent your own anyway, or
use a base class library that has its own, you are more likely to take
advantage of it and increase dependencies in your code.  At least IMHO.
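
A hand-rolled tag of the kind listed under TODAY might look like the
following sketch.  The Kind enum, the kind() member, and the class names
are assumptions made for illustration.

```cpp
class Shape {
public:
    enum Kind { CIRCLE, SQUARE };      // every new class must be added here
    virtual ~Shape() {}
    virtual Kind kind() const = 0;     // the explicit type tag
};

class Circle : public Shape {
public:
    explicit Circle(double d) : diameter(d) {}
    virtual Kind kind() const { return CIRCLE; }
    double diameter;
};

class Square : public Shape {
public:
    virtual Kind kind() const { return SQUARE; }
};

bool is_large_circle(const Shape& s) {
    if (s.kind() != Shape::CIRCLE) return false;
    // The downcast is only as safe as the tag is accurate.
    return static_cast<const Circle&>(s).diameter > 30.0;
}
```

Both weaknesses show up directly: the enum lives in one user's base
class, so two libraries will not agree on it, and a class imported from
elsewhere cannot be given a kind() after the fact.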

>:committee.  Instead, I think that compiler vendors will add proprietary
>:support for it, and there will be a market free-for-all.  I know that at

This is to be avoided at all costs.  Arg, a grass-roots rebellion against
the language definition... ye gods.  Look what this did to COBOL and Pascal.

>:least one compiler vendor already has implemented such a facility for
>:getting type information at runtime.

I hope whatever they do is supportable as a standard, because you can bet
it'll end up as one.

>:The bottom line is that virtual functions are wonderful things and can
>:almost always be used to achieve what you want, but there are times when
>:you truly *do* need to get type information as you run.

I agree, and so do designers of all other o-o languages.  It just isn't
a controversy, except for Bjarne it seems.  At least I haven't seen any
evidence of someone else saying you should never see the type of an object.
I hope ANSI C++ doesn't end up as a half-baked collision of object-oriented
dogma and C dogma.  The mere fact that compiler developers are including it
is a strong sign of user demand, and it is obviously not ignorant demand.

>:Scott
>
>This is especially true for expanding library interfaces. I wished for

Frankly I think Bjarne's concern is entirely misplaced.  I would worry far
more about the dozens of weird mechanisms for getting at type information than
about non-robust uses of it, or about the possibility that someone will
invent their own style of virtual functions given that information.
Although I recently proposed a different kind of virtual function with
different overriding rules, I would rather work within, or change, the language
than invent such a scheme for use only in my own programs.  However, I can't
be stopped from doing it, even by hiding the type tag.  So what is gained by
hiding it, I don't know.

>an "interactive(istream&)" predicate but had my hands tied. I could
>either compare pointer-to-members to guess at class type (boo! hiss!)
>or use iostream state variables to record this information. Something
>like a property-list per object is probably the desire of most lisp hackers.

Speaking as an ex-Lisp hacker, right on.  But this needn't wait until runtime
to be resolved, and compromise C++'s precious efficiency.  There are such
things as optimizing compilers, which C-heads usually have never heard of
because the language's low-level constructs are so pervasive and hard to 
optimize.  I've heard C described as "welfare for mediocre compiler writers",
and after using C for 7 years I tend to agree.

> Len Lattanzi (Migration Software Systems Ltd 408 452 0527) <len@migration.com>

The next thing I expect compiler writers to do is hack the linker so that
new versions of compiled-in libraries can be patched in at runtime.  They
gotta go into the object code, hack out calls to the subroutines that changed
and replace 'em with a call to a jump table into the new library.  Compatible
with C ?  Hah ha hahahah... but there are definitely a lot of people, even on
the ANSI committee, that couldn't care less about C.  And I agree with 'em.
Who actually compiles C sources under C++ anyway ?
-- 
  Craig Hubley   "...get rid of a man as soon as he thinks himself an expert."
  Craig Hubley & Associates------------------------------------Henry Ford Sr.
  craig@gpu.utcs.Utoronto.CA   UUNET!utai!utgpu!craig   craig@utorgpu.BITNET
  craig@gpu.utcs.toronto.EDU   {allegra,bnr-vpa,decvax}!utcsri!utgpu!craig

rfg@NCD.COM (Ron Guilmette) (03/03/91)

I am cross-posting this response to comp.lang.c++ since it contains
information about a general problem that many C++ programmers (and
not just ones who are worried about the evolving standard) are concerned
about.

In article <27C95D3A.1715@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>According to dsouza@optima.cad.mcc.com (Desmond Dsouza):
>>Here are a few examples where you need to know the type of an object:
>
>"Need?"  That's a red flag for me... :-)
>
>>1. Persistent objects: When reading in one of these from disk, you
>>   need to know what constructor to call. Hence you need to encode in
>>   the persistent image the ClassId of the object.
>
>Presumably, a well-designed class hierarchy will use a virtual
>function to store objects; and a virtual function by definition will
>already know the exact type of the object it is storing.

As Joe Buck pointed out, storing (or transmitting) an object is *not*
a problem which requires any sort of special identification of the type
of an object.  However, retrieving (or receiving) a hunk of data which
represents the previous contents of some unknown type of thing requires
us to use some sort of agreed upon scheme whereby the transmitter sends
some unique code with the data block to indicate to the receiver what
type of C++ object the transmitted data came from.  Assuming that both
the transmitting program and the receiving program are written in C++,
and assuming that we want the receiving program to perform (or to fake)
the re-construction of the originally transmitted object, we need to
have some set of globally unique "type codes" (which the transmitter and
the receiver must agree upon).
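
As a rough illustration of the receiving side, the sketch below picks a
constructor from the type code that arrives with the data.  Everything
in it (the Persistent base class, the numeric codes, the string payload,
the reconstruct() function) is a hypothetical stand-in, not a description
of any real system.

```cpp
#include <map>
#include <string>

// Hypothetical base class for anything that can be sent or stored.
class Persistent {
public:
    virtual ~Persistent() {}
    virtual std::string describe() const = 0;
};

class Circle : public Persistent {
public:
    explicit Circle(const std::string& data) : data_(data) {}
    virtual std::string describe() const { return "Circle(" + data_ + ")"; }
private:
    std::string data_;
};

class Square : public Persistent {
public:
    explicit Square(const std::string& data) : data_(data) {}
    virtual std::string describe() const { return "Square(" + data_ + ")"; }
private:
    std::string data_;
};

typedef unsigned long TypeCode;
typedef Persistent* (*Factory)(const std::string& data);

Persistent* make_circle(const std::string& d) { return new Circle(d); }
Persistent* make_square(const std::string& d) { return new Square(d); }

// Codes both ends have agreed on (arbitrary values for this sketch).
const TypeCode CIRCLE_CODE = 1;
const TypeCode SQUARE_CODE = 2;

// Receiver side: choose the constructor from the code sent with the data.
// Returns a null pointer for an unknown code; the caller owns the result.
Persistent* reconstruct(TypeCode code, const std::string& data) {
    static std::map<TypeCode, Factory> table;
    if (table.empty()) {
        table[CIRCLE_CODE] = &make_circle;
        table[SQUARE_CODE] = &make_square;
    }
    std::map<TypeCode, Factory>::const_iterator it = table.find(code);
    return it == table.end() ? 0 : it->second(data);
}
```

The storing side needs none of this, since a virtual function already
knows which code to write.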

We faced exactly this problem when I was working at MCC on the ES-kit
project.  We had a distributed multiprocessor system within which we
wanted to be able to migrate objects and to send "messages" (i.e.
member function calls) from one processing node to an object residing
on a different processing node.

The solution we came up with was not terribly elegant.  We ended up
hacking the compiler (g++) to get it to provide the actual string
representation of the class name to the OS kernel routine which was
responsible for forwarding messages between nodes.  The kernel then
had to do a lookup of the class name string within a table of all
class-name strings known to the system in order to get a globally
unique 32-bit integer valued "code" for that particular class type.
This code was then shipped across the mesh as a part of the inter-node
"packet" representing the "message" to the remote object.

A much cleaner solution would have been to enlist the compiler & linker
to assign the globally unique "type codes" up front, prior to run-time.
This would have allowed us to avoid the (very expensive) table lookups
which we did within the kernel for each transmitted message.
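
The run-time lookup described above might be sketched roughly as
follows.  This is not the ES-kit code; ClassTable, code_for(), and the
policy of handing out a fresh code the first time a name is seen are
assumptions made to keep the example small.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Maps class-name strings to 32-bit codes (hypothetical sketch).
class ClassTable {
public:
    ClassTable() : next_(1) {}

    // Return the code for `name`, assigning a fresh one on first sight.
    std::uint32_t code_for(const std::string& name) {
        std::map<std::string, std::uint32_t>::iterator it = codes_.find(name);
        if (it != codes_.end()) return it->second;
        std::uint32_t c = next_++;
        codes_[name] = c;
        return c;
    }

private:
    std::map<std::string, std::uint32_t> codes_;
    std::uint32_t next_;
};
```

Each message pays for a string comparison chain inside code_for(), which
is exactly the cost a link-time assignment would remove.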

Ada implementors are familiar with the problem of providing "globally
unique" identifying codes for things.  They face the same problem
when they go to implement Ada exceptions.  Various aspects of Ada
effectively create a requirement for an internal set of globally
unique identification "codes" for all of the Ada exceptions declared
throughout an entire Ada program.

In all Ada implementations I know of, these "globally unique" codes for
declared Ada exceptions are, in effect, generated at link-time by the
linker.  For each Ada exception declared within an entire Ada program
the Ada compiler generates one small (word sized) artificial variable.
When it subsequently needs a unique code for the given exception, it
simply uses the address of the associated "dummy" variable as the
globally unique "code" for the given exception.  Fortunately, the linker
sees to it that all of these "dummy" variables get allocated to different
memory locations (during linking) so that the addresses of these
exceptions (i.e. their "codes") are indeed globally unique.

An identical scheme could be used to assign globally unique integer
codes to each class type within an entire (linked) C++ program.  For
each class type declaration compiled, a C++ compiler (or translator)
could generate a "dummy" variable with a particular (specially mangled)
name.  The address of that dummy variable could then be used as a
globally unique identifier for the class type itself.  For example,
given:

	class C {
		//...
	};

	void *vp;

	void example ()
	{
		vp = typeof (class C);
	}

A C++ translator could easily generate:

	struct C {
		//...
	};

	int __C__typeof_dummy;

	void *vp;

	void __exampleV ()
	{
		vp = (void *) &__C__typeof_dummy;
	}

This approach would work for C++ *translators* only so long as they are
connected to underlying C compilers which allocate uninitialized variables
(such as "__C__typeof_dummy") to "common".  ANSI C does not allow this
practice, but virtually all K&R C compilers and many "ANSI" C compilers
still do it anyway.
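
In compilers that support templates, the same address trick can be
sketched in C++ itself, with a class template supplying one dummy
object per type.  TypeDummy and type_code() are names invented for this
example and stand in for the typeof notation used above.

```cpp
// One dummy object per class type; its address serves as the type's code.
template <class T>
struct TypeDummy {
    static char dummy;
};

template <class T>
char TypeDummy<T>::dummy = 0;

// Hypothetical stand-in for typeof (class C).
template <class T>
const void* type_code() {
    return &TypeDummy<T>::dummy;
}

class C {};
class D {};
```

Here type_code<C>() and type_code<D>() return different addresses, and
repeated calls for the same type return the same one.  As with the dummy
variables above, the addresses are unique only within one linked program,
so two separately linked programs would still need to agree on a mapping.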

-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// New motto:  If it ain't broke, try using a bigger hammer.