[net.lang.c] Type checking for typedef's

garry@batcomputer.TN.CORNELL.EDU (Garry Wiegand) (05/28/86)

In a recent article sher@rochester.UUCP (David Sher) wrote:
>I just tripped over another "feature" of c++.
>This feature is that enums are not types.  
>They look like types but really are synonymous with sets of 
>constant definitions and typedefs of ints.  This strikes me as wrong...

I am reminded of a long-standing pet peeve of mine: I can get
type-checking on structures and unions. I can get type-checking
on primitive types (int, float, ...). I CANNOT get type-checking
on datatypes declared via 'typedef' EXCEPT in terms of their underlying
types. (I assume this all is still true in the new language standard.)
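
A minimal sketch of the complaint, assuming an ANSI-style compiler. The
names apples, oranges, s_apples, s_oranges and the helper functions are
made up for illustration and are not from any real program:

```c
#include <assert.h>

typedef int apples;     /* hypothetical typedef names */
typedef int oranges;

struct s_apples  { int n; };
struct s_oranges { int n; };

/* Declared to take 'oranges', but any int-compatible value gets through. */
static int count_oranges(oranges o) { return o; }

/* Structures ARE distinct types: a struct s_apples is rejected here. */
static int count_s_oranges(struct s_oranges o) { return o.n; }

static int typedef_demo(void)
{
    apples a = 3;
    oranges o = a;                  /* no diagnostic: both are just int */
    struct s_oranges so = { 4 };

    /* struct s_apples sa = { 3 };
       count_s_oranges(sa);            <- compile error if uncommented */

    return count_oranges(a) + o + count_s_oranges(so);
}
```

Passing an 'apples' where an 'oranges' is expected goes through silently,
while the equivalent mistake with the two structures is caught.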

What I would like is for typedef names to be considered by the compiler as 
DIFFERENT from the underlying types. The compiler should then allow an implicit
(or explicit) cast back and forth between the derived and underlying types - 
this will avoid breaking existing code. The improvement over the current state 
of things will happen when I ask the compiler "please tell me about possibly 
nasty implicit casts!"
 

Note 1: people who are in the habit of doing things like:

        typedef unsigned char    ubyte;

   and who want to use this new feature will have to switch to:

        #define ubyte  unsigned char

   unless they really want "ubytes" to be non-computable.

Note 2: At some point (of course) you have to give a value to a derived-
   type variable. If you compile habitually with the new message turned
   on, then in such a situation you'll have to use an explicit cast.

Note 3: The same principle could be extended to structures that are equivalent 
   though not identical and to enum's.

Is it reasonable? Is it hard to implement? Comments? (Followups to
net.lang.c, pls).
-- 
garry wiegand   (garry%cadif-oak@cu-arpa.cs.cornell.edu)

desj@brahms.BERKELEY.EDU (David desJardins) (05/30/86)

In article <361@batcomputer.TN.CORNELL.EDU> garry%cadif-oak@cu-arpa.cs.cornell.edu.arpa writes:
>What I would like is for typedef names to be considered by the compiler as 
>DIFFERENT from the underlying types. The compiler should then allow an
>implicit (or explicit) cast back and forth between the derived and
>underlying types -- this will avoid breaking existing code. The improvement over
>the current state of things will happen when I ask the compiler "please tell
>me about possibly nasty implicit casts!"
> 
>Is it reasonable? Is it hard to implement? Comments?

   Yes.  No.  This is how almost all typed languages (except C) handle
their types, and I agree that it is vastly preferable.  But don't hold
your breath waiting for C programmers to give up their super-weak typing.

   -- David desJardins

P.S. If you want to flame me do so via mail; I don't read this group...

henk@ace.UUCP (Henk Hesselink) (06/01/86)

In article <361@batcomputer.TN.CORNELL.EDU> garry%cadif-oak@cu-arpa.cs.cornell.edu.arpa writes:

> I am reminded of a long-standing pet peeve of mine: I can get
> type-checking on structures and unions. I can get type-checking
> on primitive types (int, float, ...). I CANNOT get type-checking
> on datatypes declared via 'typedef' EXCEPT in terms of their underlying
> types. (I assume this all is still true in the new language standard.)
> 
> 
> Is it reasonable? Is it hard to implement? Comments?.

It is certainly reasonable, a little more type-checking in C would not
be at all amiss, but:

"It must be emphasised that a typedef declaration does not create a
new type in any sense" (K&R, page 141).

The problem is that typedefs get converted back to basic types fairly
early on in pass 1 of pcc (which, plus derivatives thereof, constitutes
a sizeable chunk of all C compilers), while serious type-checking is
mostly done in pass 2.  This means 1) letting pcc1 keep typedefs intact,
2) augmenting the intermediate code to allow this kind of information
to be specified and 3) getting pcc2 to do something useful with it.
Hard to implement?  Depends on where you stand, but certainly not
something you'd hack up in an afternoon.

--
Henk Hesselink
ACE Associated Computer Experts bv.
Amsterdam, The Netherlands

henry@utzoo.UUCP (Henry Spencer) (06/04/86)

> What I would like is for typedef names to be considered by the compiler as 
> DIFFERENT from the underlying types....
> Is it reasonable? Is it hard to implement? Comments?...

Not unreasonable.  Not too hard to implement.  Not C, either.  Too many
existing programs would break.  Typedef is often used to parameterize
machine-dependencies like integer sizes, where the "macro" interpretation
of typedef is necessary.  It's much too late to change this now.
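
A sketch of the kind of existing code this refers to, assuming a
hypothetical portability header that defines a name word32 (the name and
the function scale are invented here for illustration):

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical portability typedef: an integer of at least 32 bits. */
#if INT_MAX >= 2147483647
typedef int word32;
#else
typedef long word32;
#endif

/* Typical existing code mixes the typedef freely with plain int and long. */
static long scale(word32 x, int factor)
{
    long total = x;          /* word32 -> long: silent today              */
    return total * factor;   /* a "distinct typedef" rule would flag both */
}
```

Code like this relies on word32 being nothing more than another spelling
of int or long, which is the "macro" reading of typedef.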
-- 
Usenet(n): AT&T scheme to earn
revenue from otherwise-unused	Henry Spencer @ U of Toronto Zoology
late-night phone capacity.	{allegra,ihnp4,decvax,pyramid}!utzoo!henry

greg@utcsri.UUCP (Gregory Smith) (06/04/86)

In article <138@ace.UUCP> henk@ace.UUCP (Henk Hesselink) writes:
>In article <361@batcomputer.TN.CORNELL.EDU> garry%cadif-oak@cu-arpa.cs.cornell.edu.arpa writes:
>
>> I am reminded of a long-standing pet peeve of mine: I can get
>> type-checking on structures and unions. I can get type-checking
>> on primitive types (int, float, ...). I CANNOT get type-checking
>> on datatypes declared via 'typedef' EXCEPT in terms of their underlying
>> types. (I assume this all is still true in the new language standard.)
>> 
>> 
>> Is it reasonable? Is it hard to implement? Comments?.
>
>It is certainly reasonable, a little more type-checking in C would not
>be at all amiss, but:
>
>"It must be emphasised that a typedef declaration does not create a
>new type in any sense" (K&R, page 141).
>
It *could* be optional.

>The problem is that typedefs get converted back to basic types fairly
>early on in pass 1 of pcc (which, plus derivatives thereof, constitutes
>a sizeable chunk of all C compilers), while serious type-checking is
>mostly done in pass 2.  This means 1) letting pcc1 keep typedefs intact,
>2) augmenting the intermediate code to allow this kind of information
>to be specified and 3) getting pcc2 to do something useful with it.

I don't think the second pass would need to know about it.
Types are stored in PCC as a string of 2-bit qualifiers preceding
a 'basic type' specifier:

	..!QUAL!QUAL!QUAL!QUAL!BASIC ( # of qualifiers depends on word size )

The qualifiers are FTN,PTR,ARY ( function returning, pointer to, array of )
and the fourth state (00) is used to 0-pad on the left. The basic types
are things like INT, UNSIGNED, STRTYPE ( i.e. struct ) and ENUMTYPE ( enums ).
By adding a TYPDEF basic type, the typedef information could be retained and
checked in the first pass. The type words with TYPDEF could then be easily
converted to their actual types before writing the intermediate file. In fact
this is exactly what is done with enums, which are not seen by the second
pass and appear in the intermediate file as int constants.
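
A rough sketch of such a type word, to make the layout concrete. The
numeric values, the 4-bit width of the basic field, and all of the names
below are assumptions chosen for illustration; they are not copied from the
PCC sources. It also assumes the typedef stands for a plain basic type (a
typedef of a pointer or array would need its qualifiers spliced in too):

```c
#include <assert.h>

/* Illustrative encodings only. */
enum basic { B_INT = 1, B_UNSIGNED, B_STRTYPE, B_ENUMTYPE, B_TYPDEF };
enum qual  { Q_NONE = 0, Q_PTR = 1, Q_FTN = 2, Q_ARY = 3 };

#define BASIC_BITS 4u
#define BASIC_MASK ((1u << BASIC_BITS) - 1u)
#define QUAL_BITS  2u
#define QUAL_MASK  3u

typedef unsigned int tword;   /* ..!QUAL!QUAL!QUAL!BASIC */

static enum basic basic_of(tword t) { return (enum basic)(t & BASIC_MASK); }

/* Wrap t in one more qualifier; it lands in the slot next to BASIC. */
static tword add_qual(tword t, enum qual q)
{
    tword quals = t >> BASIC_BITS;
    return (((quals << QUAL_BITS) | (tword)q) << BASIC_BITS) | (t & BASIC_MASK);
}

/* The most recently applied qualifier, or Q_NONE for a bare basic type. */
static enum qual outer_qual(tword t)
{
    return (enum qual)((t >> BASIC_BITS) & QUAL_MASK);
}

/* Remove the most recently applied qualifier. */
static tword strip_qual(tword t)
{
    return ((t >> BASIC_BITS >> QUAL_BITS) << BASIC_BITS) | (t & BASIC_MASK);
}

/* Before writing the intermediate file: swap TYPDEF for the real type. */
static tword lower_typedef(tword t, enum basic underlying)
{
    if (basic_of(t) == B_TYPDEF)
        return (t & ~BASIC_MASK) | (tword)underlying;
    return t;
}
```

With this layout, "pointer to array of <typedef>" is built as
add_qual(add_qual(B_TYPDEF, Q_ARY), Q_PTR), and lower_typedef() leaves the
qualifiers alone while replacing only the basic field.
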

The big problem is deciding *when* to issue a warning and when to
overlook mismatches.  Suppose you have 'typedef int foo;' and 'foo
i,j,k;'. Then obviously i=j is ok, and so is i=j+k. But the '+' sees
two typedefs in the latter, and has to be smart enough to check them,
find out what they really are, and bump the result back to 'foo'. And
if you have 'int q', is j+q of type int or foo? Do 'q += i' and/or
'i=j+q' give warnings? If good semantics for this could be worked out,
it could be a real aid to writing portable programs.

>Hard to implement?  Depends on where you stand, but certainly not
>something you'd hack up in an afternoon.
>
True.

-- 
"We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg

jpn@teddy.UUCP (John P. Nelson) (06/06/86)

>> What I would like is for typedef names to be considered by the compiler as 
>> DIFFERENT from the underlying types....
>
>Not unreasonable.  Not too hard to implement.  Not C, either.  Too many
>existing programs would break.

I think what is needed is for LINT to treat such typedef names as true
distinct types.  Having the COMPILER treat them as different is probably
not very useful.  Perhaps this should be an OPTION to lint, so that old
programs don't lose completely. This would make lint a much more valuable
tool, and would make information hiding easier in C programs.

kdmoen@watcgl.UUCP (Doug Moen) (06/06/86)

garry wiegand:
>What I would like is for typedef names to be considered by the compiler as 
>DIFFERENT from the underlying types. The compiler should then allow an implicit
>(or explicit) cast back and forth between the derived and underlying types - 

I disagree.
There are a number of situations when I want to give a name to a type
expression *without* creating a new type as a side effect.

Certainly there ought to be a way to create new types, but this mechanism
should be orthogonal to the mechanism for giving names to type expressions.
Notice that making type naming and type creation into two separate mechanisms
is more flexible than forcing you to always get one with the other.

C++ already has a way to create new types: the class mechanism.

typedef int foo;	/* foo is a synonym for 'int' */

class bar {		/* bar is a new type with representation 'int' */
	int i;
};

To obtain what Garry wants (derived types), we need a way for 'bar'
to inherit all of the operations on 'int'.  There is currently no way
to do this.
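
The same point can be sketched in plain C, where a one-member struct plays
the role of the class. The helper names bar_make, bar_add and bar_value
are invented for this illustration:

```c
#include <assert.h>

typedef int foo;            /* foo is a synonym for 'int' */

struct bar {                /* a new type with representation 'int' */
    int i;
};

/* Every operation on the new type has to be written out by hand. */
static struct bar bar_make(int v) { struct bar b; b.i = v; return b; }
static int bar_value(struct bar b) { return b.i; }
static struct bar bar_add(struct bar a, struct bar b)
{
    return bar_make(a.i + b.i);
}

static int wrapper_demo(void)
{
    foo f = 2;
    int n = f + 1;          /* foo gets every int operation for free */

    struct bar x = bar_make(2), y = bar_make(3);
    /* struct bar z = x + y;    <- rejected: no '+' for struct bar */
    struct bar z = bar_add(x, y);

    return n + bar_value(z);
}
```

The struct gives real type checking, but at the cost of one hand-written
function per operation, which is the missing "inherit from int" facility.
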
-- 
Doug Moen (watmath!watcgl!kdmoen)
University of Waterloo Computer Graphics Lab

garry@batcomputer.TN.CORNELL.EDU (Garry Wiegand) (06/10/86)

In response to my suggestion of stronger type-checking on typedefs:

In a recent article greg@utcsri.UUCP (Gregory Smith) wrote:
>It *could* be optional.

Yes yes, I thought I said that! (Wouldn't want to make existing code
complain, you know :-).  In time, the warning might become the default.

>The big problem is deciding *when* to issue a warning and when to
>overlook mismatches.  Suppose you have 'typedef int foo;' and 'foo
>i,j,k;'. Then obviously i=j is ok, and so is i=j+k. But the '+' sees
>two typedefs in the latter, and has to be smart enough to check them,
>find out what they really are, and bump the result back to 'foo'. And
>if you have 'int q', is j+q of type int or foo? Do 'q += i' and/or
>'i=j+q' give warnings? If good semantics for this could be worked out,
>it could be a real aid to writing portable programs.

Perhaps it would work out if typedefs were thought of as making a *copy*
of the underlying datatypes. Rules would be:

    1) operations that were available on the original type (such as '+' or
       '=') would still be perfectly legit if ALL the operands were of the
       new type.

    2) Expressions that mixed things up would receive the new warning. Then
       the compiler would convert/cast things to a common type and proceed.
       The most logical choice for "common type" would be the underlying 
       primitive type.
    
    3) The knowledgeable programmer could avoid the warning by explicitly
       casting things so the expression ended up of uniform type. The compiler
       should be smart enough to know when an explicit cast is legitimate
       with respect to its operand, natch.
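
The three rules can be sketched as a tiny checker. This is only one
reading of them, under assumptions: the type tags T_INT, T_FOO, T_BAR stand
for 'int', 'typedef int foo;' and 'typedef int bar;', and all names here
are hypothetical.

```c
#include <assert.h>

enum type_id { T_INT, T_FOO, T_BAR };   /* hypothetical type tags */

struct result {
    enum type_id type;   /* type of the whole expression    */
    int warn;            /* nonzero: issue the new warning  */
};

/* Map a typedef'd type back to the primitive type it copies. */
static enum type_id underlying(enum type_id t)
{
    switch (t) {
    case T_FOO:
    case T_BAR:
        return T_INT;
    default:
        return t;
    }
}

/* Rules 1 and 2, applied to a binary operator such as '+' or '='. */
static struct result binary_op(enum type_id left, enum type_id right)
{
    struct result r;
    if (left == right) {             /* rule 1: uniform operands, no fuss  */
        r.type = left;
        r.warn = 0;
    } else {                         /* rule 2: warn, fall back to the     */
        r.type = underlying(left);   /* common underlying primitive type   */
        r.warn = 1;
    }
    return r;
}

/* Rule 3: an explicit cast is silent when the representations agree. */
static int cast_is_legit(enum type_id target, enum type_id operand)
{
    return underlying(target) == underlying(operand);
}
```

Applied to the earlier questions: with 'foo i,j,k; int q;', j+k stays foo
with no warning, j+q becomes int with a warning, and j+(foo)q is foo again
with no warning because the cast from int to foo is legitimate.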

I have a full-knowledge-of-the-language C preprocessor somewhere in the 
works, and it sounds like this idea has a little value, so I'll see if I
can tickle it in. But I'm still a bit fuzzy on the semantics myself :-)

-- 
garry wiegand   (garry%cadif-oak@cu-arpa.cs.cornell.edu)