[comp.lang.c++] #defines with parameters

pcg@aber-cs.UUCP (Piercarlo Grandi) (12/06/88)

First of all I wish to admit I have been inaccurate; as somebody remarked
"noalias" was only briefly in the dpANS C, and "far" and "near" never
actually made it in the official documents sent out for review. But all
were for a long time in various drafts, and as usually happens people
actually advertised "far" and "near" as examples of draft, unofficial
ANSI C conformance. Thank goodness X3J11 frustrated their efforts in the end.

In article <225@twwells.uucp> bill@twwells.UUCP (T. William Wells) writes:

    In article <277@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
    : As I understand it, this means that char,short,int,long are distinct
    : types, whereas unsigned and signed are type modifiers. In a sense then,
    : the view of dpANS is that unsigned [int] and [signed] int are the same
    : type, only that one has a sign and the other hasn't.
    
    If you want to, you can think of this as types and modifiers, though
    the standard does not speak of them that way. If so, here is how you
    do it:
    
    The types:
    
    	char
    	int
    
    That's right. Only two.
    
    The modifiers:
    
    	unsigned        signed
    	short           long
    
    The first pair modifies char or int; the second int only. You can
    specify only one of each pair.

OK, the standard does not speak of modifiers and types, but then the
result is the long lists you give and lots of confusion. I think there
are two issues here, one the introduction of signed as a keyword, and
the other neat ways of defining semantics. As to the latter, the whole
type system would be greatly simplified if one were to say that

[1] there are two distinct types, int and unsigned; they are distinct
types because different rules of arithmetic apply to them. This would
make it clear, and I have found very few people that actually
understand that unsigned is not just "positive int", that the same
operators applied to int and unsigned have very different semantics.

[2] Each of the two distinct types may come in three different extra
lengths, char, short and long, that are exactly equivalent among them
for the same type except for the different size.

As to the last point, char has been so far just a short short; a char
value can be operated upon exactly as an integer. Historically char
constants have been really the size of integer constants...

I would have liked, instead of the unnecessary and confusing signed
modifier, a nice range(abs_max) modifier for types integer and
unsigned, and char/short/long defined in terms of it. This would have
added tremendously to the portability of programs. The lack of some
sort of ranges and the use of short and long are among the few things
I dislike in Algol68 and C.

I hope such an interesting extension makes it into C++, or at least into
the GNU C++ compilers (hint hint Michael Tiemann!).

    There is no relationship between char and signed or unsigned char,
    other than that they occupy the same amount of storage.

Now I reiterate the question: why was a new keyword introduced "signed"
when it just sufficed to sanction the existing practice of some
compilers (PCC had it, more recent BSD versions fixed this "bug") to
say "int char" or better "char int"?

Storage classes, modifiers, and type names could be given in any
order in a declaration; most compilers therefore simply collected
keywords, and set a flag in the symbol table entry for the variable for
each keyword they collected.

This meant that "int static char int static" was accepted as a legal
declaration by many compilers :-). This has since often been
"corrected", notably in 4BSD pcc. Amusingly it persists even today in
other compilers, among them g++ 1.27, where interestingly "sizeof (char
int)" is 4 and "sizeof (int char)" is 1 on a 68020... Of course in my
opinion both ought to be 1, or an order storageclass/modifier/type
might be enforced, making "int char" illegal and "sizeof (char int)"
equal to 1.
-- 
Piercarlo "Peter" Grandi			INET: pcg@cs.aber.ac.uk
Sw.Eng. Group, Dept. of Computer Science	UUCP: ...!mcvax!ukc!aber-cs!pcg
UCW, Penglais, Aberystwyth, WALES SY23 3BZ (UK)

bill@twwells.uucp (T. William Wells) (12/08/88)

In article <330@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
:                                                          I think there
: are two issues here, one the introduction of signed as a keyword, and
: the other neat ways of defining semantics. As to the latter, the whole
: type system would be greatly simplified if one were to say that

`Signed' is there so that we can have `signed char'. As to why one
would want that, the original C specified that, while a char was an
integer, whether it could contain negative values was up to the
implementer. `Signed char' has the same size as a char, but it is
always signed. (And yes, there are good reasons not to change the
current definition of `char').

`Signed' was then allowed as a modifier for the other types, for
reasons of symmetry.

(This is all in the Rationale, in section 3.1.2.5.)

As for simplifying the type system; the current one is as simple as
is possible for it to be, given that it *must* be compatible with the
old one, and given the addition of a `signed char'.

: [1] there are two distinct types, int and unsigned; they are distinct
: types because different rules of arithmetic apply to them. This would
: make it clear, and I have found very few people that actually
: understand that unsigned is not just "positive int", that the same
: operators applied to int and unsigned have very different semantics.

Then you have simply been hanging around inexperienced C programmers.

Not only that, but you are wrong about there being two distinct
types; from the view you are adopting, there are three.  See below.

: [2] Each of the two distinct types may come in three different extra
: lengths, char, short and long, that are exactly equivalent among them
: for the same type except for the different size.

This would be nice if it were compatible with history; however...

: As to the last point, char has been so far just a short short; a char
: value can be operated upon exactly as an integer. Historically char
: constants have been really the size of integer constants...

This is false. Char has not been and ought not be made just a short
short.  Char is a funny type. It is neither signed nor unsigned though
it is always implemented as one or the other.

Let me repeat this: there are three signednesses in C:

    1) integers - these have positive and negative values.
    2) unsigned - these have positive values only.
    3) char - these have positive values. Sometimes they have
       negative values as well but it depends on the implementation.

: I would have liked, instead of the unnecessary and confusing signed
: modifier, a nice range(abs_max) modifier for types integer and
: unsigned, and char/short/long defined in terms of it. This would have
: added tremendously to the portability of programs.

And here again you are wrong. While it is sometimes nice to be able
to specify numeric ranges, it is hardly necessary to do so for
portability. Understanding the C paradigm, one can use the existing
types to create portable code. I do it all the time; my code is
regularly ported to everything from micros to mainframes.

:                       to sanction the existing practice of some
: compilers (PCC had it, more recent BSD versions fixed this "bug") to
: say "int char" or better "char int"?

`Char int' is, and always has been, illegal. I don't know of a single
compiler that accepts it. I just checked: the one on my system
(Microport, a system V) and the one on my work machine (SunOS,
more-or-less BSD) reject it. I believe that both are based on the
PCC.  I know that the C compiler (from ISC) we used on our PDP-11 and
our VAX, between four and six years ago, both rejected `char int' and
similar constructs. I'm almost certain that both compilers were PCC
based.

---

Let me suggest that you should carefully read the various books on C,
starting with K&R (the original), the Harbison & Steele book, and
some other good C textbooks. *Then* read the latest draft of the C
standard.  Do all of this without reference to what you would wish
the language to be. Understand what it *is*; develop familiarity with
the paradigms and facts of the language. Then, and only then, will
you be equipped to complain about C's deficiencies.

In the mean time, I'm going to stop responding to your statements
about what C ought to be.  You have formed these opinions in the
absence of knowledge; I have come to believe that I am wasting my
time trying to correct them.

If you have questions about the language, go ahead and ask them, and
you'll even get a civil reply, but please stop telling us what the
language should be until you know what it *is*.

---
Bill
{uunet|novavax}!proxftl!twwells!bill

pcg@aber-cs.UUCP (Piercarlo Grandi) (12/12/88)

I realize that in my crudeness and brutality there is no hope for me to
achieve the extremely rarefied levels of wisdom and learning of certain
people endowed with a quick grasp of issues and gentlemanly manners of debate.

I therefore appeal (bowing my head, palms joined :->) to higher authority.

Let me quote and summarize from one such easily recognizable higher authority,
and repeat my own contentions (if it is boring for you, think how it is for me):

-----------------------------------------------------------------------------

#   4. What's in a name [ .... ]
#   Objects declared as characters ("char") are large enough to store any
#   member of the implementation's character set, and if a genuine character
#   from that character set is stored in a character variable, its value is
#   equivalent to the integer code of that character. Other quantities may be
#   stored in a character variable, but the implementation is machine
#   dependent.

character type == an integer type of sufficient length, whether "unsigned" or
"int" is up to the implementation.

#   Up to three sizes of integer, declared "short int" "int", and "long int"
#   are available.  [ .... ]

integer type == any one of the three lengths of "int", not just "int".

#   Unsigned integers, declared "unsigned", obey the laws of arithmetic
#   modulo "2^n", where "n" is the number of bits in the representation. (on the
#   PDP-11, unsigned long quantities are not supported).

unsigned integer type == "unsigned" integer of all lengths, except on the
PDP-11. Semantics are different from those of integer types, as they obey the
rules of modular, not algebraic, arithmetic.

#   [ .... ] Because objects of the foregoing types can be usefully interpreted 
#   as numbers, they will be referred to as "arithmetic" types. Types "char" and
#   "int" of all sizes will be collectively called "integral" types. [ .... ]

character type == "char", some large enough integer or unsigned integer type;
unsigned integer type == "unsigned" of all lengths;
integer type == "int" of all lengths (occasionally includes also "unsigned"s);
integral type == all three of them.
arithmetic type == integral types plus all lengths of "float".

#   6.1 Characters and integers
#   A character or a short integer may be used whenever an integer is used.
#   In all cases the value is converted to an integer.

There is no behavioural difference between char, short and other lengths of
"int", but for their range.

#   Conversion of a shorter integer to a longer always involves sign
#   extension; integers are signed quantities.

Integer types involve sign extension, by contrast with unsigned integer types.

#   Whether or not sign extension occurs for characters is machine dependent,
#   [ .... ].

Whether or not "char" is an integer or unsigned integer type is not prescribed.

#   [ .... ] When a longer integer is converted to a shorter or to a "char",
#   it is truncated on the left; excess bits are simply discarded.

There is no behavioural difference between "char" and "short", or other
lengths, except their size.

#   6.5 Unsigned
#   Whenever an unsigned integer and a plain integer are combined, the
#   plain integer is converted to unsigned and the result is unsigned.
#   The value is the least unsigned integer congruent to the signed
#   integer (modulo "2^wordsize"). [ .... ] When an unsigned integer is
#   converted to "long", the value of the result is the same numerically
#   as that of the unsigned integer. [ .... ]

The rules for conversions involving unsigned integers are different from
those for integers.

#   7. Expressions [ .... ]
#   The handling of overflow and divide check in expression evaluation is
#   machine dependent. [ .... ]

Note that, insofar as overflow is concerned, this applies only to integer
types, as unsigned integer types cannot overflow by definition. In other
words, exceeding the range of a length of "int" is not well defined, while
exceeding the range of a length of "unsigned" is.

Another case where there are behavioural differences between unsigned integer
and integer types.

#   7.2 Unary operators
#   [ .... ] The result of the unary "-" operator is the negative of its
#   operand. The usual arithmetic conversions are performed. The negative of
#   an "unsigned" quantity is computed by subtracting its value from "2^n",
#   where "n" is the number of bits in an "int". [ .... ]

Another case where there are behavioural differences between unsigned integer
and integer types.

#   7.5 Shift operators
#   [ .... ] The right shift is guaranteed to be logical (0 fill) if "E1"
#   is "unsigned"; otherwise it may be (and is, on the PDP-11), arithmetic
#   (fill by a copy of the sign bit).

Another case where there are behavioural differences between unsigned integer
and integer types.

#   8.2 Type specifiers
#   [ .... ] The words "long", "short" and "unsigned" may be thought of as
#   adjectives; the following combinations are acceptable: [ .... ]

Here lies the crux of the matter. Throughout it is repeatedly and explicitly
stated that unsigned integer types behave differently from integer types, and
that the character type does not behave differently from a sufficiently
long/short unsigned integer or integer type.

Given this and the quoted phrase, it is apparent in hindsight that syntax and
semantics are incomplete, as there is no way to ensure the signedness of a
"char" (a similar problem exists with bit fields), and that syntax does not
properly reflect semantics.

dpANS C addresses the first point only, adding the "signed" keyword that can
be thought of as another adjective and adding several cases to the table of
acceptable combinations.

My contentions (for the last time!) are that

    [1] this is not necessary, as it is more natural to drop the pretense
    that "char" is a type distinct from "int", and instead adopt the notion
    that "char" is like "short", an adjective that modifies the length of its
    base type;

    [2] it does not resolve the issue of making clear that "unsigned" is
    semantically different from "int", while the various lengths of either
    type are, but for the different ranges, semantically equivalent among
    themselves, and this distinction is important;

    [3] both points can be economically addressed by redefining as integral
    types the class of all integer and unsigned types, as integer types the
    various lengths of "int", as unsigned types the various lengths of
    "unsigned", and as length adjectives/modifiers the keywords "char",
    "short", "long"; when the adjective is omitted, the base type has the
    length of "short" or "long", depending on the implementation; when the
    base type is omitted, "int" is presumed, except for length "char",
    where the choice is implementation dependent.

    [4] the proposed rationalization, provided that "unsigned int" is made as
    a special case equivalent to "unsigned", is backward compatible;

    [5] because of an easily made "mistake", some compilers, in the past or
    now, did not/do complain when the rationalized syntax was/is used, and
    this could be easily blessed instead of eradicated;

    [6] if it is felt desirable to substantially modify the declaration of
    "int" or "unsigned" types, a new keyword could be introduced for range
    definition, or the syntax for bit fields could be allowed outside

-------------------------------------------------------------------------

Kind reader, having had the patience to reach this point, make a last effort,
and please circle what you believe to be the correct answers:

[1] The material quoted above:

    [A] Is excerpted in an accurate, substantial and non misleading way
	from "The C Programming Language - Reference Manual" (1978) the
	authoritative definition of Classic C.
    [B] I have made it up.

[2] The summaries I have made of the various passages quoted:

    [A] Accurately reflect the contents of said Reference Manual, or at
	least a consistent and historically defensible interpretation of
	those contents.
    [B] I have never read/understood the Reference Manual.

[3] The final contentions and suggestions are:

    [A] Supported by fair and reasonable technical arguments, based on the
	contents of said Reference Manual, as well as other more mundane points.
    [B] My advisor (if I had one) must be on drugs.
-- 
Piercarlo "Peter" Grandi			INET: pcg@cs.aber.ac.uk
Sw.Eng. Group, Dept. of Computer Science	UUCP: ...!mcvax!ukc!aber-cs!pcg
UCW, Penglais, Aberystwyth, WALES SY23 3BZ (UK)

ok@quintus.uucp (Richard A. O'Keefe) (12/14/88)

In article <375@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>My contentions (for the last time!) are that
>
>    [1] this is not necessary, as it is more natural to drop the pretense
>    that "char" is a type distinct from "int", and instead adopt the notion
>    that "char" is like "short", an adjective that modifies the length of its
>    base type;
>

That change would *FORCE* compiler writers to break working code.
There are programs which were written for machines with unsigned chars
(predating the introduction of 'unsigned char') where the programmers
relied on the char range being 0..255.  While this was not _portable_,
that implementation was _permitted_.  If you now rule that
	char = char int and is *signed*
then to conform to the standard, compiler writers for those machines
would _have_ to make plain chars signed.  The method X3J11 chose
_allows_ compiler writers for those machines to keep their old semantics
for plain 'char', and thus lets them offer backwards compatibility to their
customers.  It simply wasn't possible to make the new standard compatible
with all the old compilers, but X3J11 didn't introduce incompatibility
lightly.

Don't forget, the signed/unsigned ambiguity of 'char' was provided in the
first place for very practical reasons:  there are machines where signed
chars are more efficient, and there are machines where unsigned chars are
more efficient.  That is still true, so we _still_ want a way of saying
"whichever of signed byte/unsigned byte is cheaper, I promise not to care".

To be perfectly frank, I don't expect to have any use for 'signed char'.
I particularly don't want all my programs which use 'char' (because I
wanted the efficiency and didn't need a large range) to be forced to use
signed bytes, so the "char = char int = signed char" rule, while it
would not _break_ my programs, would definitely make them less efficient.
NO THANK YOU!

pcg@aber-cs.UUCP (Piercarlo Grandi) (12/18/88)

In article <860@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:

	[ on my suggestion to make "char int" denote guaranteed signed chars ]

    That change would *FORCE* compiler writers to break working code.

Why ever? I have never advocated removing the rule that "char" by itself is
either "int" or "unsigned", merely added a way to guarantee that it would be
signed.

    There are programs which were written for machines with unsigned chars
    (predating the introduction of 'unsigned char') where the programmers
    relied on the char range being 0..255.  While this was not _portable_,
    that implementation was _permitted_.  If you now rule that

	char = char int and is *signed*

Never implied this. (I care about backwards compatibility, even if I don't
like it where it rewards, like in your example, non portable practices).

    but X3J11 didn't introduce incompatibility lightly.

Yet it introduced an unnecessary keyword that perpetuates the unfortunate
syntax that favors the confusions that "char" is indeed a type and "unsigned"
is indeed just a variant of "int".

    [ .... ] we _still_ want a way of saying "whichever of signed
    byte/unsigned byte is cheaper, I promise not to care".

Indeed! Since 'char' is to be a length modifier, I assumed it was obvious
that in case the base type is missing, it would default to either 'int' or
'unsigned', while for all other length modifiers it would always default to
'int'.

In dpANS C the rule is conversely that the type modifier 'signed' is the
default for all the lengths of 'int', but not for type 'char' where the
default may be either 'signed' or 'unsigned'.

    To be perfectly frank, I don't expect to have any use for 'signed char'.

I have, I have! (I mean "char int" of course). In many many cases one wants
to really consider "char" a length specifier, and create arrays of byte sized
integers or unsigneds (or don't cares).
-- 
Piercarlo "Peter" Grandi			INET: pcg@cs.aber.ac.uk
Sw.Eng. Group, Dept. of Computer Science	UUCP: ...!mcvax!ukc!aber-cs!pcg
UCW, Penglais, Aberystwyth, WALES SY23 3BZ (UK)

ok@quintus.uucp (Richard A. O'Keefe) (12/19/88)

In article <420@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>In article <860@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>	[ on my suggestion to make "char int" denote guaranteed signed chars ]
>    That change would *FORCE* compiler writers to break working code.

>Why ever? I have never advocated removing the rule that "char" by itself is
>either "int" or "unsigned", merely added a way to guarantee that it would be
>signed.

So sorry!  I thought you wanted your proposal to be self-consistent.
Silly me.  The proposal, as I understood it, was that 'char' was to be a
size modifier _just_ _like_ 'short' and 'long'.  Now the rule is that
	short x;	==	short int x;
	long x;		==	long int x;
so naturally I assumed that you were proposing that
	char x;		==	char int x;
(Certainly "short short" would have worked like that.)

>    but X3J11 didn't introduce incompatibility lightly.
>
>Yet it introduced an unnecessary keyword.

It has been explained that X3J11 did not "introduce" the 'signed' keyword,
but accepted it as _existing_ _practice_.  If someone else has tried to
deal with a problem, you don't spit in their face even if you don't like
their solution much.  

>Indeed!. Since 'char' is  to be a length modifier, I assumed it was obvious
>that in case the base type is missing, it would default to either 'int' or
>'unsigned', while for all other length modifiers it would always default to
>'int'.

I'm sorry, but I don't see how such an exception is "obvious".

I personally regard the whole of C's integer type system as radically
misguided, and would argue with the utmost vehemence against its adoption
in any other language.  But C is the way it is, and there is no use
crying over spilt milk.  It's usable, it's understood.  Let's freeze it
and get on to better things.

chip@ateng.ateng.com (Chip Salzenberg) (12/22/88)

According to pcg@aber-cs.UUCP (Piercarlo Grandi), concerning the fact that
the signedness of char is implementation-defined:

>[...] it is apparent in hindsight that syntax and
>semantics are incomplete, as there is no way to ensure the signedness of a
>"char" (a similar problem exists with bit fields), and that syntax does not
>properly reflect semantics.

Sure.  But for hysterical -- oops, I meant historical -- reasons, X3J11
couldn't fix it.  They did provide a way to specify signed and unsigned
chars when we care to, which is all we really need anyway.

>My contentions (for the last time!) are that
>    [1] this is not necessary, as it is more natural to drop the pretense
>    that "char" is a type distinct from "int", and instead adopt the notion
>    that "char" is like "short", an adjective that modifies the length of its
>    base type;

Well, sure.  But you're too late.

X3J11 did a good job.  Let's leave well enough alone.
-- 
Chip Salzenberg             <chip@ateng.com> or <uunet!ateng!chip>
A T Engineering             Me?  Speak for my company?  Surely you jest!
	   Beware of programmers carrying screwdrivers.