[comp.lang.c] signed/unsigned char/short/int/long

gwyn@smoke.BRL.MIL (Doug Gwyn ) (11/29/88)

In article <277@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>As I understand it, this means that char,short,int,long are distinct
>types, whereas unsigned and signed are type modifiers. In a sense then,
>the view of dpANS is that unsigned [int] and [signed] int are the same
>type, only that one has a sign and the other hasn't.

But that is NOT what the dpANS says.  unsigned int and signed int are also
distinct types.  They are guaranteed to have the same representation and
alignment requirements (for the values that they share) when used as
function arguments or union members, but that does not reflect on their
types.

>I would not agree that char has never been a length specifier for int;

Certainly not in the sense that you mean for "short".  "short int" is
valid, but "char int" is not and never has been.  char is a basic integral
type (of uncertain signedness).

>I would also not agree that in this matter dpANS respected existing
>practice; the introduction of the signed modifier is certainly an
>innovation, and a confusing one as well (judging from all the fuss
>about it and char).

I don't know what "fuss" you refer to.  Your complaint about "signed"
is the first complaint I have heard.  "signed" is in fact existing
practice in several compilers, although not in K&R 1st Edition.  It
really isn't necessary for anything except "signed char".  The reason
it is necessary there is that "char" has explicitly been defined as
being possibly signed, possibly not, depending on whatever was most
efficient for the specific implementation.  "signed char" provides a
way to force the signed flavor in cases where it matters.
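
For instance, a minimal sketch (mine, assuming an ANSI-style compiler;
the identifiers are purely illustrative) of a case where it matters:

	/* On an implementation where plain char is unsigned, the second
	 * test fails; "signed char" forces the signed flavor portably.
	 */
	#include <stdio.h>

	int main(void)
	{
	    signed char surely = -1;    /* guaranteed to hold -1 */
	    char maybe = -1;            /* becomes 255 if char is unsigned */

	    if (surely < 0)
	        printf("signed char is negative, as required\n");
	    if (maybe < 0)
	        printf("plain char happens to be signed here\n");
	    else
	        printf("plain char is unsigned on this implementation\n");
	    return 0;
	}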

The rules for promotion of types in mixed expressions really have
nothing to do with "signed", since "signed int" means EXACTLY the same
as "int" (for example).  The complicated wording you gave for conversion
rules seems to amount to much the same as in the dpANS, only less precise
and less complete.  You should study the examples in the Rationale
document to see what subtleties are involved.
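
One such subtlety, sketched here with a made-up example rather than one
taken from the Rationale: mixing signed and unsigned operands converts
the signed operand to unsigned.

	#include <stdio.h>

	int main(void)
	{
	    unsigned int u = 1;
	    int i = -1;

	    if (i < u)                  /* i is converted to unsigned */
	        printf("-1 < 1u\n");
	    else
	        printf("-1 became a very large unsigned value\n");
	    return 0;
	}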

>[under Grandi's proposal]...
>char by itself may be either, char int and char unsigned are well defined.

But that really is contrary to existing practice, as you unjustly
accused "signed" of being.  Try "char int" on a typical PCC and you
will get an "illegal type combination" error message.

>As to the rest of dpANS, I will not rekindle spent discussions, except
>for a cheap leer at trigraphs :-) :-) (I understand that the committee
>were more or less forced to include them, so it is not wholly their
>fault...).

The party line on trigraphs is that they are primarily meant to aid
C source code interchange between systems with widely varying character
sets, not as something to be used in normal programming practice -- where
other approaches that are more suitable are envisioned (and allowed by
the proposed standard as the first part of translation phase 1).  I'm
frankly not at all certain that the latter point was really widely
appreciated by the committee until fairly recently, but it is now.
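
For the record, a small sketch (strictly illustrative, not recommended
practice) of the replacements made in translation phase 1:

	/* Each ??X sequence below is replaced before tokenization:
	 *   ??=  #     ??(  [     ??)  ]     ??<  {     ??>  }
	 *   and ??/ becomes a backslash, even inside string literals.
	 */
	??=include <stdio.h>

	int main(void)
	??<
	    int a??(1??);

	    a??(0??) = 42;
	    printf("a[0] = %d??/n", a??(0??));
	    return 0;
	??>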

>As Ritchie said (more or less) in an interview to Computer Language:
>"dpANS C is the best standard that could have come out of a committee".

I don't think that was intended as "damning with faint praise".  Dennis
has in the past expressed his appreciation of the committee's work.  The
one thing he was unhappy about was the way we had type qualifiers set up
in the second formal public review draft of the standard.  This has been
fixed, with help from Dennis and a few other public reviewers.

pcg@aber-cs.UUCP (Piercarlo Grandi) (12/08/88)

In article <9086@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
    
    You're still inaccurate.  "far" and "near" were never at any time in
    any draft of the proposed ANSI C standard.
    
    "noalias" was not in any draft other than the one sent out for the
    second public review.  How could it be, when it had just been invented
    at the previous meeting and was retracted at the next?

Just a moment! I have just apologized, saying that my inaccuracy was to imply
that near and far eventually made it into dpANS.  I acknowledge that they did
not make it there; still, at one time (fairly long ago) they were widely
expected to become part of ANSI C (yes, I know that ANSI C does not yet
exist), and the usual trade interests advertised (they shouldn't have done
it, I know) these as ANSI-conforming features of their compilers.  I am
pleased that, along with noalias (which did eventually make it into just one
official dpANS issue), X3J11 had enough sense to avoid them.

    Are you just making this stuff up, or do you have drug-using advisors, or
    what?

Maybe I have seen too many Laurel & Hardy films ["... Look at what you made
me do!"] and I cannot keep them too well distinct from X3J11 work :-) :-).

More seriously, I have been using C for 8-9 years now, and following X3J11
for many years as well. The attention I have devoted to following X3J11 in
later years has not been great, as disappointment set in (volatile?  signed?
reserved word functions? structural equivalence only if different compilation
units? etc...).

I would also like to add that you are right to ask that X3J11 be held to
account only for what has been perpetrated in the latest official, published
version of dpANS, but I am right too to raise again the specter of old issues
that have been discussed quite seriously, as they are part of the full
picture.  When you want to run for President, after all, you know that people
will look at whether you stole cookies when you were twelve :-)...

    >As to the last point, char has been so far just a short short; a char
    >value can be operated upon exactly as an integer.

    Except that whether it acts as signed or unsigned depends on the
    implementation.

Gee, I see you have indeed read the Classic edition of K&R.  Let me
nitpick: I said "integer", not "int", and for once I was accurate :-).
You know the meaning of "integer" and "integral".  What I was saying is that
I cannot really see a strong enough difference in the semantics of "char" and
"int/unsigned" to need an "integral" class distinct from "integer"; I think
that the Classic C book can be slightly reinterpreted or amended to make char
belong to "integer" (approximately, just as a modifier on int/unsigned).

The other problem with the Classic C book (that is, apart from distinguishing
between "integer" and "integral"), and you seem to have understood it
correctly :-), is that it only defined "char", whose signedness was
implementation dependent, and "unsigned char".

What I am asking is why X3J11 did not legalize the combination "int char",
hitherto not legal, but accepted by some popular compilers because of an
easily explained benign mistake, to mean "signed char", WITHOUT the
introduction of a new keyword with further complication of the rules for
declarations. I cannot believe they did not think of it...
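
For concreteness, here are the spellings involved under the dpANS, with
the disputed combination shown only in a comment since no conforming
compiler accepts it (the identifiers are illustrative):

	signed char    sc;   /* dpANS: definitely signed                 */
	unsigned char  uc;   /* Classic C and dpANS: definitely unsigned */
	char           pc;   /* signedness is implementation-defined     */
	/* int char   ic; */ /* the proposed spelling; not legal C       */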

A related but distinct issue, that fits in nicely with the first, is why it
has not been stipulated that there are two integral types, int/unsigned, with
different arithmetic properties, and three optional lengths for either of
them, char/short/long, instead of writing up tables of permitted
combinations, which are somewhat more complex, and less clear as to the
fundamental difference in semantics between unsigned and int.

    >Historically char constants have been really the size of integer
    >constants...

    You mean "character constants"; in C they ARE integer constants
    specified in a certain character-oriented way.

Exactly, thank you for the nitpicking. I used this point to show that
"philosophically" char in C is just a shorter type of integer. This is not
surprising, considering that C is a descendant of BCPL (whose single most
annoying feature is having to use putbyte() and getbyte() for string
manipulation, as it has just one length of integer).
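
A small check of this point, assuming an ANSI-style compiler: a
character constant has type int, not char.

	#include <stdio.h>

	int main(void)
	{
	    printf("sizeof 'a' = %lu, sizeof(char) = %lu\n",
	           (unsigned long) sizeof 'a', (unsigned long) sizeof(char));
	    return 0;
	}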

In a sense C is a wonderfully equilibrated mix, BCPL with quite a good lot of
Algol68 thrown in, and this shows thru in things like some semantics
(BCPL-ish) of integer types, and their syntax (Algol68-ish).  I can say this
having studied in depth (several years ago) both Algol68 and BCPL; it is a
pity that so many C programmers don't know either, and miss the pleasure of
contemplating some important threads of history (e.g.  BCPL and Algol68 are
themselves related by way of CPL).

    >Now I reiterate the question: why was a new keyword introduced "signed"
    >when it just sufficed to sanction the existing practice of some
    >compilers (PCC had it, more recent BSD versions fixed this "bug") to
    >say "int char" or better "char int"?

    I have never seen a C compiler that accepted "int char";

Well, you have seen few, I surmise, or you never tried (more likely, I
admit). As I explained, the fact that some (or even several) compilers
do accept "int char" is the result of an easily made mistake in a
particular, but popular, parsing strategy for C declarations.

    certainly Ritchie didn't intend for it to be valid.  Also, char has
    never been guaranteed to be signed; read K&R 1st Edition.

I am pleased that we do agree on something, indeed Ritchie never
intended it to be valid and he did carefully not specify the default
signedness of char; I am also pleased that you have actually read
Classic K&R, and not just the less delectable works from X3J11.

    It happened to be most efficient on the PDP-11 to make it signed, ...

	[ well known list of machines and defaults for char signedness omitted ]

    ... implementation dependence in his BSTJ article on C in 1978.

You even read the BSTJ! My, you must be quite a learned fellow. If so,
you will also know that char is by default unsigned also in some 68K
compilers, while most Intel compilers have it signed. Incidentally, I
have even seen two compilers for the same architecture (68k) implement
a different default! Unfortunately, your precious information (that can
be found, by the way, in a table in any Classic K&R book) is beside my
point. Also, I am not entirely surprised/amused at your repeated
assumptions that nobody has bothered to read the Classic C book.

[ By the way, for the benefit of our audience, I will add that many
Classic C and Unix articles from various BSTJs etc... have been
reprinted in a more easily obtained set of two volumes; if I remember
correctly, "Unix Papers" by Academic Press. ]

    >Amusingly it persists even today in other compilers, among them
    >g++ 1.27, where interestingly "sizeof (char int)" is 4 and "sizeof
    >(int char)" is 1 on a 68020...

    I don't know what C++ rules for basic types really are, but if as I
    suspect g++ is getting it wrong, you should report this bug to the
    GNU project.

Well, technically this IS a mistake. On the other hand I am not going
to complain, of course...  (except that I do not like the asymmetry
between "char int" and "int char").  If you had read the full
paragraph, you would have seen that I did say that it is an unintentional
"feature"; I even explained why and how this mistake is commonly made by C
compiler writers.


What I am still waiting for, instead of cheap innuendo and showing off
that one had read the Classic K&R (as though nobody else did), is for
somebody to make a good case for:

    introducing the signed keyword and related paraphernalia instead of
    allowing "int char" (an existing unintentional "feature" of some
    compilers, by the way) to do the trick,

    NOT stipulating that there are two fundamental types with very
    different semantics, that can come in four different lengths, and
    therefore having to deal with three-word-long type specifiers, and
    fairly tedious tables of what is permitted, and not emphasizing
    the distinction between int and unsigned.

Note that both things are essentially issues of elegance and easier
comprehensibility, which are damn important in a language like C, and
both can be introduced into the language with essentially a slight
reinterpretation and/or the removal of restrictions of existing rules.
-- 
Piercarlo "Peter" Grandi			INET: pcg@cs.aber.ac.uk
Sw.Eng. Group, Dept. of Computer Science	UUCP: ...!mcvax!ukc!aber-cs!pcg
UCW, Penglais, Aberystwyth, WALES SY23 3BZ (UK)

ok@quintus.uucp (Richard A. O'Keefe) (12/10/88)

In article <347@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>This is not
>surprising, considering that C is a descendant of BCPL (whose single most
>annoying feature is having to use putbyte() and getbyte() for string
>manipulation, as it has just one length of integer).

That hasn't been true of BCPL for a long time.  BCPL has two subscripting
operators:
	base!index		index is a word offset, form addresses a word
	base%index		index is a byte offset, form addresses a byte

>In a sense C is a wonderfully equilibrated mix, BCPL with quite a good lot of
>Algol68 thrown in, and this shows thru in things like some semantics
>(BCPL-ish) of integer types, and their syntax (Algol68-ish).

The semantics of C integral types resembles Algol 68 (which has e.g.
int, short int, short short int, long int, long long int) rather than
BCPL, which has only one type "word".  The syntax of C constants
resembles BCPL rather than Algol 68 (e.g. no general "radix" notation,
characters as integer constants rather than char constants, \escapes
-- BCPL uses *escapes, Algol 68 has no escapes at all).

>    introducing the signed keyword and related paraphernalia instead of
>    allowing "int char" (an existing unintentional "feature" of some
>    compilers, by the way) to do the trick,

I will make an argument against this.  "int" does *not* in general have
the meaning "make it signed".  For example, "int unsigned", if accepted,
is not signed!  I would definitely expect that if "int char" or "char int"
were accepted at all, they would be identical to "char" in every respect.
What *would* have been consistent with C's intellectual ancestry, and
*would* suggest signedness, would have been introducing
"short short int" = "signed char" and
"unsigned short short int" = "unsigned char".  But I'm quite sure that
X3J11 considered this and rejected it for good reasons.
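
A quick check of the first point, assuming a compiler that follows the
dpANS rule letting type specifiers appear in any order:

	#include <stdio.h>

	int main(void)
	{
	    int unsigned x = 0;               /* same type as "unsigned int" */

	    x = x - 1;                        /* wraps: x is unsigned */
	    printf("x > 0 is %d\n", x > 0);   /* prints 1 */
	    return 0;
	}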

gwyn@smoke.BRL.MIL (Doug Gwyn ) (12/10/88)

In article <347@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>In article <9086@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>    You're still inaccurate.  "far" and "near" were never at any time in
>    any draft of the proposed ANSI C standard.
>Just a moment! I have just apologized saying that my inaccuracy was to imply
>that near and far eventually made it into dpANS.  I acknowledge that they did
>not make it there, still at one time (fairly long ago) they were considered
>to be going to be part of ANSI C (yes, I know that ANSI C does not yet
>exist), and the usual trade interests advertised (they shouldn't have done
>it, I know) these as ANSI conforming features of their compiler.

I still want to know where you get that idea.  There has never been
any reason to think that "near" and "far" were in any sense related
to ANSI C.  I don't know how much plainer I could make it than in
my sentence cited above.  I would like to see some evidence that
vendors really did advertise their multiple 808x memory-model support
features of their C compilers as in any way related to "ANSI C".

As to your insistence that C integral types should be respecified to
accomplish no more than they currently do, other than to satisfy your
personal sense of aesthetics, all I can say is that X3J11 did not
make changes to the existing C definition and practice without good
reason, and such a rearrangement would violate both.

Nowhere did I imply that you hadn't read K&R 1st Ed., but it does
seem to me that you never accepted it.  As another poster has
commented, before you suggest what C "should" be, you are obliged
to understand what it is.  You are free to design and promote your
own programming language, but please don't confuse that with C.

pcg@aber-cs.UUCP (Piercarlo Grandi) (12/11/88)

In article <839@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:

    In article <347@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
    >This is not
    >surprising, considering that C is a descendant of BCPL (whose single most
    >annoying feature is having to use putbyte() and getbyte() for string
    >manipulation, as it has just one length of integer).
    
    That hasn't been true of BCPL for a long time.  BCPL has two subscripting
    operators:
    	base!index		index is a word offset, form addresses a word
    	base%index		index is a byte offset, form addresses a byte

Handy syntactic sugaring for putbyte() and getbyte(), that I appreciated for
a short while, as it was still fairly recent when I switched to C... (I did
not use BCPL a lot, after all).

But it was not there at the time C was derived from BCPL! I am persuaded that
Ritchie & Co.  evolved C from BCPL for essentially two reasons, to have
different lengths of word recognized by the compiler, and to have a neater
syntax. BCPL has evolved a little bit... Still, you cannot define a byte
array in BCPL; this would go against the very fundamentals of the language.
Richards wrote that the easy portability of BCPL was largely based on having
the code generator deal with a single type.

    >In a sense C is a wonderfully equilibrated mix, BCPL with quite a good lot of
    >Algol68 thrown in, and this shows thru in things like some semantics
    >(BCPL-ish) of integer types, and their syntax (Algol68-ish).
    
    The semantics of C integral types resembles Algol 68 (which has e.g.
    int, short int, short short int, long int, long long int) rather than
    BCPL, which has only one type "word".

To me that is syntax. The semantics is that functionally C types are still
essentially "words", albeit of different lengths (to me, lengths do not change
the semantics of operations, and thus do not really introduce new "types").
Unsigned was a significant departure, especially in that it was defined
to obey the rules of modular arithmetic.
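
A minimal sketch of that departure (assuming <limits.h> as in the
dpANS): unsigned arithmetic wraps modulo 2^N, where the same overflow
on a signed int would be undefined.

	#include <stdio.h>
	#include <limits.h>

	int main(void)
	{
	    unsigned int u = UINT_MAX;

	    u = u + 1;                  /* well defined: becomes 0 */
	    printf("UINT_MAX + 1 == %u\n", u);
	    return 0;
	}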

    The syntax of C constants resembles BCPL rather than Algol 68 (e.g. no
    general "radix" notation, characters as integer constants rather than
    char constants, \escapes -- BCPL uses *escapes, Algol 68 has no escapes
    at all).

Indeed, indeed; exactly what I meant. Apparently BCPL, going into B, and then
early C, remained quite BCPL-ish; on the one hand "struct" was clearly taken from
Algol68, but the fact that members of structs were essentially named offsets,
with no (in)visibility rules, was easily a way to transpose BCPL's
"pointer!named.int.constant" into "pointer->field_name", which is arguably
nicer than the literal translation "pointer[named_int_constant]". Of course
there are a lot of other details... Enough for now of these.

Eventually Bourne & Co. (I surmise) did bring a lot more Algol68 lore
(actually, amusingly, Algol68*C* -- for Cambridge) into Bell Labs., and C and
Unix.  As a sign of this, I was always greatly amused that in the released V7
adb $a was described as "Algol68 stack backtrace", even if the Algol68
compiler (probably a derivative of Algol68C, an excellent piece of
engineering) was never released to the general public... (at least not to me,
unfortunately!).

    >    introducing the signed keyword and related paraphernalia instead of
    >    allowing "int char" (an existing unintentional "feature" of some
    >    compilers, by the way) to do the trick,

    I will make an argument against this.  "int" does *not* in general have
    the meaning "make it signed".

Yes, Yes. But it could be construed to... Actually I would not like,
as you have understood, to have it have that meaning. I would rather
interpret "char int" to mean "short short int" than "int char" to mean
"signed char"...

    For example, "int unsigned", if accepted, is not signed!

Yes according to existing rules; but "unsigned int" (and "signed int"), are
exactly what I am trying to make obsolete!

In a sense you have spotted the weak point of my argument; if a declaration
were to be built of a length modifier and a base type (both optional), then
"unsigned int" would be illegal (two base types!), against existing common
practice. It could however be declared obsolescent and allowed as a special
case, which admittedly is ugly, but virtually painless.

    I would definitely expect that if "int char" or "char int" were accepted
    at all, they would be identical to "char" in every respect.

Yes, with some caveats, in the dpANS framework.  In my framework, char would
be a modifier, and unsigned/int base types. If the base type were omitted by
the programmer, any of the two base types could be defaulted by the
implementation, as currently is done. If not, "char int" would have to be
signed, and "char unsigned" not.

    What *would* have been consistent with C's intellectual ancestry, and
    *would* suggest signedness, would have been introducing "short short int"
    = "signed char" and "unsigned short short int" = "unsigned char".

Yes, again, except that I'd rather have "short short unsigned" mean "unsigned
char". I think that indeed one problem with Algol68 is that there is no
notion of unsigned. Since (in C, at least) unsigned behaves differently from
int, it ought to be regarded as a different base type, to which the same
length modifiers apply as to int.

    But I'm quite sure that X3J11 considered this and rejected it for good
    reasons.

Essentially that "short short" is superfluous, as "char" in practice is being
used for that. In that I agree, after all C is not Algol68.

As I have indicated, however, I'd rather dispose of Algol68-like length
indicators, except as an obsolescent feature; instead of wasting a keyword on
"signed", I'd rather waste it on "range" or whatever, and let the compiler
figure the appropriate number of bits.

As a more C-ish, and less radical alternative, I'd extend the bit field
notation to ordinary declarations. Let me quote from a reply I sent (no, I am
not yet like Prof. Dijkstra in quoting only my own works, diary and letters
:->) to somebody making points similar to yours:

    But with the current scheme I find myself doing things like
       
	#ifdef pdp11
	# define bits8 char
	# define bits16 int
	# define bits32 long
	#endif
	#ifdef vax
	# define bits8 char
	# define bits16 short
	# define bits32 int
	#endif

    (note use of #define and not typedef because I want to be able to say
    things like "bits8 unsigned") and then, as a consequence,

	typedef bits8 ascii;
	typedef bits16 procid;
	typedef bits32 dollars;

    The first step is useless and circuitous, and less portable, as you have to
    have explicitly as many cases as you have machine types and compilers; I'd
    rather say:

	typedef unsigned ascii : 7;
	typedef int procid : 16;
	typedef int dollars : 32;
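
For comparison, a sketch of what the dpANS already allows: width
specifiers of this kind exist only for bit-field members inside
structures, so the notation above would be extended rather than
invented from scratch (the widths here are only illustrative):

	struct packed {
	    unsigned   ascii  : 7;     /* unsigned member, 7 bits wide   */
	    signed int procid : 15;    /* definitely signed; plain "int"
	                                  bit-fields may be unsigned     */
	};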

THE END, FINALLY!

Now for some meta-discourse.

I thank you for your civil reply.

I also have another reason to thank you.

Evidently I have not been able to communicate to Mr. Wells and Mr. Gwyn that I
do know the existing language in the Classic C book by K&R, and the one in
dpANS C, even if I find it less brilliant :-), and even if it has been a (now
nearly fixed) moving target.

Evidently I have not been able to make them understand that I was trying to
show that with a little definitional legerdemain, for which there could be
some justification in existing or old compiler bugs, or in looking at Classic
C with a jaundiced, but historically justified, attitude, some potentially
confusing, and needless, X3J11 decisions could have been avoided, and the
Classic C syntax and pragmatics be made even simpler and more symmetric, at
virtually no cost in breaking existing programs.

A few people that have sent me msgs by email have penetrated my admittedly
somewhat heavvvvvy prose, and have understood as much, whether agreeing
(mostly) or disagreeing with me (like you).

I thank you for posting a reply that demonstrates to our audience, and not
to me alone, that somebody can understand the points I make, and address
them, instead of confusing my inability to express myself in a way
palatable to themselves with something else.
-- 
Piercarlo "Peter" Grandi			INET: pcg@cs.aber.ac.uk
Sw.Eng. Group, Dept. of Computer Science	UUCP: ...!mcvax!ukc!aber-cs!pcg
UCW, Penglais, Aberystwyth, WALES SY23 3BZ (UK)

henry@utzoo.uucp (Henry Spencer) (12/13/88)

In article <347@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>... I have just apologized saying that my inaccuracy was to imply
>that near and far eventually made it into dpANS.  I acknowledge that they did
>not make it there, still at one time (fairly long ago) they were considered
>to be going to be part of ANSI C...

I think you were misled by sleazy advertising; as far as I know, this was
never intended or even seriously considered.  Claims to the contrary should
be justified with references, please.
-- 
SunOSish, adj:  requiring      |     Henry Spencer at U of Toronto Zoology
32-bit bug numbers.            | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gwyn@smoke.BRL.MIL (Doug Gwyn ) (12/14/88)

In article <375@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>    [B] I have made it up.
>    [B] I have never read/understood the Reference Manual.
>    [B] My advisor (if I had one) must be on drugs.

The first and third of these refer to an inquiry I made about your
claims that "near" and "far" were in ANSI C and that "noalias" had
been in many drafts of the proposed ANSI C standard.  I don't know
why you're suggesting that they had been applied to your "char int"
notion.  I don't remember anyone other than you suggesting the
second.  At least we finally got you to admit that your notion
differs from the description in K&R 1st Ed.

"signed" was adopted by X3J11, from already-existing practice, in
order to remedy a deficiency in K&R 1st Ed. C regarding char types.
"char int" as you propose it is not common practice.  X3J11 is not
inventing a new language, but rather canonicalizing the one in use
(inventing only when necessary to repair perceived significant
problems).  You may not like C's lack of orthogonality in its type
scheme, but it happens to be the way the C language actually is.
Feel free to fix this when you invent a new programming language.

ray@micomvax.UUCP (Ray Dunn) (12/16/88)

In article <347@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>
>What I am asking is why X3J11 did not legalize the combination "int char",
>hitherto not legal, but accepted by some popular compilers because of an
>easily explained benign mistake, to mean "signed char", WITHOUT the
>introduction of a new keyword with further complication of the rules for
>declarations. I cannot believe they did not think of it...
>

Introducing "signed" certainly adds another keyword to the language, just as
it certainly does not introduce a new concept.

Overloading the keyword "int" with a new meaning is, to me, both ad-hoc and
unfortunate.  The logic appears to be: ints are signed, so int can be used
to *imply* signed.  Hmm.  Dogs breathe, so everything that breathes is a
dog?

How much more "further complication of the rules for declarations" is involved
in the introduction of "signed" than in allowing "int char"?

Why shouldn't "int char" also imply something about the *size* of the char?
Yes, I know, because you have made the ad-hoc definition that it shouldn't,
because to you (alone?) "int" implies something about signedness.

The fundamental hole in your approach is saying that the two types we are
talking about are "int" and "unsigned".  If you want to argue from that
angle, then you should be saying "int" and "unsigned int".  As a default
shorthand convenience (only) we are allowed to not specify the "int"
following "unsigned".  Using this same concept, we are allowed to omit the
"signed" when it precedes "int" etc.  The only confusion is an *historical*
confusion, i.e. what is the default signedness of "char"?
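
A short sketch of that shorthand under the dpANS (each pair of
declarations names the same type; the identifiers are arbitrary):

	unsigned int a;   unsigned b;   /* same type                       */
	signed int   c;   int      d;   /* same type                       */
	signed char  e;                 /* NOT interchangeable with plain  */
	char         f;                 /* char, whose signedness is the
	                                   historical question             */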

All of the above assumes of course that we are discussing 'C' with its large
body of existing code which must continue to be supported, and not some
as-yet-to-be-invented language that is still just a wish-list in your mind.
If it is the latter, then of course, everything you say could be possible!

-- 
Ray Dunn.                      |   UUCP: ..!philabs!micomvax!ray
Philips Electronics Ltd.       |   TEL : (514) 744-8200   Ext: 2347
600 Dr Frederik Philips Blvd   |   FAX : (514) 744-6455
St Laurent. Quebec.  H4M 2S9   |   TLX : 05-824090

pcg@aber-cs.UUCP (Piercarlo Grandi) (12/18/88)

In article <1988Dec12.172658.29898@utzoo.uucp> henry@utzoo.uucp (Henry
Spencer) writes:

	[ on far and near in ANSI C ]

    I think you were misled by sleazy advertising; as far as I know, this was
    never intended or even seriously considered.

I am beginning to think that I was indeed. Never too late... ;-}. Sorry for
all the disturbance caused on this particular issue.

    Claims to the contrary should be justified with references, please.

You must be kidding! Justifying claims with references? in this discussion? ;-}

Please, pleaaase, I have not made an issue of this near and far business,
I have been so busy just sticking to signed and volatile...

Consider it buried under all the egg I got on my face because of it? Let's
have a deal, everybody stop mentioning near and far and I won't bother
discussing trigraphs? :-} :->.
-- 
Piercarlo "Peter" Grandi			INET: pcg@cs.aber.ac.uk
Sw.Eng. Group, Dept. of Computer Science	UUCP: ...!mcvax!ukc!aber-cs!pcg
UCW, Penglais, Aberystwyth, WALES SY23 3BZ (UK)

bill@twwells.uucp (T. William Wells) (12/18/88)

In article <371@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
: To me that is syntax. The semantics is that functionally C types are still
: essentially "words", albeit of different lengths (to me, lengths do not change
: the semantics of operations, and thus do not really introduce new "types").
: Unsigned was a significant departure, especially in that it was defined
: to obey the rules of modular arithmetic.

This is a far better statement about what your intention and desire
is than any of your previous postings on the subject. Your previous
postings have adduced falsehoods to support what is, after all, just
an expression of your intuitions about the language. It is this
adducing, not your ideas themselves, which have caused several of us
to liberally flame you.

Now, I'll be the first to agree that C's types are kludgey.  The
notion of a type where the sign is unspecified is more than a bit
weird. However, there it is, in the language; we have to live with it.
X3J11, bowing to the desire to have a definitely signed character, did
something about it.

That something should have been done about it, I think we both agree
on. However, where you disagree with most everyone else is in how it
ought to be done.  The committee decided that the way to solve the
problem was to add a new keyword. Your suggestion is to use `char
int' instead.

Your method has the drawback of either requiring every C program in
existence to be rewritten or requiring that programmers memorize a
wholly nonintuitive type table.

Let's draw up a table:

char                 char               char
signed char          char int           char int
unsigned char        char unsigned      unsigned char
short (int)          short int          short (int)
unsigned short (int) short unsigned     unsigned short (int)
int                  int                int
unsigned (int)       unsigned           unsigned (int)
long (int)           long int           long (int)
unsigned long (int)  long unsigned      unsigned long (int)

These are the columns:

	1) Existing practice, either in common use, or the dpANS
	   `signed char', the wart you aspire to remove.

	2) A type scheme using int and unsigned as the base
	   types, char, short, and long as modifiers.  This looks
	   pretty, except for the unavoidable `char' wart.

	3) A fudged version of the existing method, substituting char
	   int for signed char. The wart in this is its incoherency.

Choice 1 is existing practice, to be preferred unless there is a good
reason to change it. Its primary drawback is that the use of signed
will break any program using `signed' as an identifier.  (Grep will
fix that, however.)

Choice 2 requires modification of almost every existing program.
These are the changes that one would have to make:

	short                   short int
	unsigned short int      short unsigned
	unsigned int            unsigned
	long                    long int
	unsigned long int       unsigned long

Now, everyone who routinely leaves out the unnecessary words in a type
is going to be stuck fixing `short' and `long'. Those who routinely
put them in will have to remove them for `unsigned short int',
`unsigned int', and `unsigned long int'. Those who don't have any
habits either way will be left totally at C.

In other words, this change would screw over every C programmer.

Choice 3 does not require the extra keyword as choice 1 does;
however, the type scheme is incoherent. There is no way, other than
memorizing the weird syntax for `char' types, that a programmer can
use this.

So, in summary, your idea is fundamentally flawed: if carried out
consistently it will break almost every program; if carried out
inconsistently it will add to the confusion about C types.

Let's kill this discussion. Your feelings on the subject are
irrelevant; the "facts" you have presented to back up your position
have been false. When you do this kind of thing, all you do is damage
your reputation and waste bandwidth.

---
Bill
{uunet|novavax}!proxftl!twwells!bill

pcg@aber-cs.UUCP (Piercarlo Grandi) (12/20/88)

In article <254@twwells.uucp> bill@twwells.UUCP (T. William Wells) writes:

#    This is a far better statement about what your intention and desire
#    is than any of your previous postings on the subject.

Bizarrely enough somebody else did understand them. Ah, by the way, hadn't
you promised to ignore my further postings? :->. I don't complain, though, you
are an almost ideal illustration of my point about the confusion the current
syntax/semantics gap generates :->.

#    Your previous postings have adduced falsehoods to support what is, after
#    all, just

Falsehoods: what a big word.  You cannot just disagree with my arguments, you
call them lies. I could be proud of this... ;-{

    I realize that at this point most readers will be already fairly
    disgusted, and I want to point out that I am sickened myself a bit by
    these sweeping, very kind allegations.

    Most readers will not want to deal with any more of this, and will not
    need to see the technical arguments I am going to use to try to make Mr.
    Wells understand my points (lies, if you wish :->).

    You have an opportunity to type n.

#    Your previous postings have adduced falsehoods to support what is, after
#    all, just an expression of your intuitions about the language.

Excellent: start by calling liars those that disagree with you, then you will
attack them for what they have never said. Have you ever thought of getting
into politics? :-) :-) :-)

Lies?? Maybe I am too thick as you say, but somebody else may start to think
that your allegations so far have been misrepresentations of my critique of
dpANS C or maybe are just the result of your lack of basic understanding of
such critique.

I have tried to persuade you that I have read Ritchie's work, that it is
sufficiently precise in some points and ambiguous in others (save for one
critical point) to support my interpretation. Isn't that enough?

I refuse to be faulted for your problems with your ingrained habits.
You may not agree with my arguments, but they happen to be well
founded, even if unconventional.

#   It is this adducing, not your ideas themselves, which have caused several
#   of us to liberally flame you.

No, it is your inability to think in terms other than those you are familiar
with.

#   Now, I'll be the first to agree that C's types are kludgey.  The
#   notion of a type where the sign is unspecified is more than a bit
#   weird.

Indeed, that is exactly WHY I say "char" should be described as a *length*
and not a type; its base type may be either "int" or "unsigned" if not
specified explicitly.  There is no weirdness in this, just a matter of
respecting existing conventions.

#    However, there it is, in the language; we have to live with it.
#    X3J11, bowing to the desire to have a definitely signed character, did
#    something about it.

Still, I would have liked X3J11 to have simplified matters as much as possible,
instead of adopting, as a rule, the easy but more complicated solution.

#    That something should have been done about it, I think we both agree
#    on. However, where you disagree with most everyone else is in how it
#    ought to be done.  The committee decided that the way to solve the
#    problem was to add a new keyword. Your suggestion is to use `char
#    int' instead.

Not really; you have missed that 'char int' was a mere illustration of
straightening out the syntax. By itself it *is* just a wart, and I have never
claimed otherwise; I have used it only to introduce the argument that it is
the description of the syntax of integral types that ought to be
restructured, and as a nice side effect this solves the signed char problem
without requiring keywords.

#    Your method has the drawback of either requiring every C program in
#    existence to be rewritten or requiring that programmers memorize a
#    wholly nonintuitive type table.

This is entirely bogus. Please try to understand what I write. My proposal
*is* backwards compatible and has a very simple type table.

#    Let's draw up a table:
#
#    char                 char               char
#    signed char          char int           char int
#    unsigned char        char unsigned      unsigned char
#    short (int)          short int          short (int)
#    unsigned short (int) short unsigned     unsigned short (int)
#    int                  int                int
#    unsigned (int)       unsigned           unsigned (int)
#    long (int)           long int           long (int)
#    unsigned long (int)  long unsigned      unsigned long (int)
#
#    These are the columns:
#
#	    1) Existing practice, either in common use, or the dpANS
#	       `signed char', the wart you aspire to remove.
#
#	    2) A type scheme using int and unsigned as the base
#	       types, char, short, and long as modifiers.  This looks
#	       pretty, except for the unavoidable `char' wart.
#
#	    3) A fudged version of the existing method, substituting char
#	       int for signed char. The wart in this is its incoherency.
#
#    Choice 1 is existing practice, to be preferred unless there is a good
#    reason to change it. Its primary drawback is that the use of signed
#    will break any program using `signed' as an identifier.  (Grep will
#    fix that, however.)
#
#    Choice 2 requires modification of almost every existing program.

Absolutely not. Did I really have to say, to intelligent readers, that as an
obsolescent feature, "unsigned int" ought to be allowed to stand  as a synonym
of "unsigned", because that is the only glaring incompatibility?

Also, I never advocated deleting the existing rule that if only the length
modifier is given in a declaration, the base type is int, except for
char where it may be either unsigned or not.

My, I had tried to sketch an alternative, in the interests of conciseness.  I
didn't want to have to write ten page articles to cover all points that you
may have difficulties with.

As to my concise type tables, here they are:

[Classic C]		(unsigned) char
			(unsigned) (short|long) int

[dpANS C]		(signed|unsigned) char
			(signed|unsigned) (short|long) int

[My proposal]		(char|short|long) unsigned
			(char|short|long) int

With the obvious defaults as applicable to each case.

#    These are the changes that one would have to make:
#
#	    short                   short int
#	    unsigned short int      short unsigned
#	    unsigned int            unsigned
#	    long                    long int
#	    unsigned long int       unsigned long
#
#    Now, everyone who routinely leaves out the unnecessary words in a type
#    is going to be stuck fixing `short' and `long'. Those who routinely
#    put them in will have to remove them for `unsigned short int',
#    `unsigned int', and `unsigned long int'. Those who don't have any
#    habits either way will be left totally at C.
#
#    In other words, this change would screw over every C programmer.

Absolutely not! Again. Did I have to spell out that the usual defaults would
still apply? Did I have to spell out that the (rightly) obsolescent rule that
allows type keywords to appear in "any" order would still have to exist? Did
I ever write anything to the contrary? Shouldn't I have expected you
to fill in the obviously missing details?

#    Choice 3 does not require the extra keyword as choice 1 does; however,
#    the type scheme is incoherent. There is no way, other than memorizing the
#    weird syntax for `char' types, that a programmer can use this.
#
#    So, in summary, your idea is fundamentally flawed: if carried out
#    consistently it will break almost every program; if carried out
#    inconsistently it will add to the confusion about C types.

In summary, your understanding has improved, but you are still fighting with
things I have never written, and you fault me in fairly abusive terms for
ideas that you have invented yourself.

#    Let's kill this discussion. Your feelings on the subject are irrelevant;
#    the "facts" you have presented to back up your position have been false.

I have not presented any fact, only reasonable arguments, including the
arguable point that you are so highly prejudiced that you haven't yet been
able to really understand what I have been talking about, a very soft,
clearer and simpler, reinterpretation of existing keywords and syntax.

#    When you do this kind of thing, all you do is damage your reputation and
#    waste bandwidth.

You may not realize that the only reputation you are really damaging is
*yours*; maybe this isn't even a waste of bandwidth.
-- 
Piercarlo "Peter" Grandi			INET: pcg@cs.aber.ac.uk
Sw.Eng. Group, Dept. of Computer Science	UUCP: ...!mcvax!ukc!aber-cs!pcg
UCW, Penglais, Aberystwyth, WALES SY23 3BZ (UK)

bill@twwells.uucp (T. William Wells) (12/23/88)

In article <440@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
: #    Your previous postings have adduced falsehoods to support what is, after
: #    all, just
:
: Falsehoods: what a big word.  You cannot just disagree with my arguments, you
: call them lies. I could be proud of this... ;-{

A deliberate word. If I wanted to call them lies, I would. A falsehood
is just that: something not true. I don't know why you said them. It
could be that you are repeating rumors. Or that you have formed your
conclusions without reasons. Or that you have a bad memory. Or that
you misunderstood what you have read. Or that you have been misled
by someone else. Or that you were on drugs. Or that you invent things
and convince yourself of the reality of your invention; I have a
brother with this almost-psychosis. Or something else. I don't know.
What I do know is that many "facts" that you have supplied just
aren't so.

: I have tried to persuade you that I have read Ritchie's work, that it is
: sufficiently precise in some points and ambiguous in others (save for one
: critical point) to support my interpretation. Isn't that enough?

Well, to repeat what damn near everyone else has said: your
interpretation of K&R is inconsistent with every informed person's
interpretation of K&R. Not only that, but it is inconsistent with C
practice. So, what we have seen is that your interpretation is
essentially word games and logic chopping; thus your contentions are
not supported, no matter what you may wish.

: [Classic C]           (unsigned) char
:                       (unsigned) (short|long) int
:
: [dpANS C]             (signed|unsigned) char
:                       (signed|unsigned) (short|long) int
:
: [My proposal]         (char|short|long) unsigned
:                       (char|short|long) int

Ok, I see that I missed something in your postings.  And, BTW, you
could have saved yourself lots of typing had you stopped at this
point. Having shown that my premise was incorrect, refuting the rest
of my posting was a waste of time.

Now, back to tables.

dpANS C                 Your proposal           comment

char                    char
signed char             char int                different
unsigned char           unsigned char
short (int)             short (int)
unsigned short (int)    unsigned short          unsigned short int is illegal
int                     int
unsigned (int)          unsigned                unsigned int is illegal
long (int)              long (int)
unsigned long (int)     unsigned long           unsigned long int is illegal

OK, this is not outrageously bad. But:

1) It requires a complete reorganization of the way that C programmers
   think. And no, you can't say that C programmers should just
   (re)learn thinking that way. Older compilers will exist for a long
   time; that requires that both be understood. If you think the
   current system is confusing, consider the situation with two
   different systems!

2) It obsoletes existing code. Admittedly, these are not quiet
   changes; `signed' also obsoletes code, and is not a quiet change either.
   Note, however, that there are a lot more programs that use
   unsigned short, unsigned int, and unsigned long than there are
   programs that use the word `signed'. Evidence? Scanning several
   megabytes of source code, finding not one instance of `signed' and
   many hundreds of `unsigned int'.

3) To avoid further obsoleting programs, you have to have elision
   rules that permit long and short to stand by themselves. Another
   wart.

In other words, not only would your change have the same drawback as
the signed keyword, it would have the additional drawback of creating
a long period of confusion.

And its advantages? Diddly. There is no positive change to be had.
Oh, you might say that your type scheme is simpler, but that is a
personal judgement. Me, all I see is a different set of warts.

---
Bill
{uunet|novavax}!proxftl!twwells!bill