[mod.std.c] mod.std.c Digest V3 #7

osd7@homxa.UUCP (Orlando Sotomayor-Diaz) (02/22/85)
mod.std.c Digest            Fri, 22 Feb 85       Volume 3 : Issue   7 

Today's Topics:
                               Case >20
                  Comments on Standard, section C.1
----------------------------------------------------------------------

Date: 21 Feb 85 11:54:50 EST (Thu)
From: Craig Partridge <cbosgd!ucbvax!craig@loki.ARPA>
Subject: Case >20
To: cbosgd!std-c@BERKELEY

Kevin Martin writes to suggest adding comparative operators into
case statements.  I'd like to point out that at least

	case >20: -----

is already handled by the language -- the appropriate keyword
is "default".

I also think that with ranges in the case plus "default" you
can build a case statement that does just about any comparison
you want without much difficulty.  I view case <=20:, case >=20:
etc. as unnecessary syntactic sugar.
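
As a rough illustration of the point above, here is a minimal sketch
that uses only plain case labels and "default".  The function name
exceeds_twenty is hypothetical, and the argument is assumed to be
unsigned so that "default" really does mean "greater than 20".

```c
/* Returns 1 when n > 20, using only plain case labels and "default".
 * No range syntax and no comparison operators in case labels are
 * assumed here. */
int exceeds_twenty(unsigned n)
{
    switch (n) {
    case 0:  case 1:  case 2:  case 3:  case 4:  case 5:  case 6:
    case 7:  case 8:  case 9:  case 10: case 11: case 12: case 13:
    case 14: case 15: case 16: case 17: case 18: case 19: case 20:
        return 0;       /* the explicitly listed values, 0..20 */
    default:
        return 1;       /* everything else, i.e. n > 20 */
    }
}
```

With a signed argument, "default" would also catch negative values, so
the equivalence holds only for this restricted domain.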

Craig Partridge
craig@bbn-loki (ARPA)
craig%loki@csnet-relay (CSNET)
ihnp4!bbncca!craig (USENET)

------------------------------

Date: Thu, 21 Feb 85 15:12:26 pst
From: cbosgd!ucbvax!ucsfcgl!arnold (Ken Arnold)
Subject: Comments on Standard, section C.1
To: cbosgd!std-c@BERKELEY

*Problem:	Encourages potential identifier conflict
*Reference:	C.1.1. Keywords (p. 14)
*Description:

The keyword list does not include "asm" or "fortran", although these
are listed in E.5.4 (Common extensions) as keywords.  I am not arguing
that these features should be part of the language, but I think it
would avoid some coding problems in the future if these keywords were
listed as reserved, even though the features are not a required part of
the standard.  I can just see some novice user using one of the two,
most likely "asm", in code which would have otherwise been fully
portable.

At the very least, C.1.1. should have

	Forward references: the "fortran" keyword (Sec. E.5.4.4), the
	"asm" keyword (Sec. E.5.4.5).
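
To make the concern concrete, here is a small sketch (the names are
hypothetical).  Going by the C.1.1 keyword list alone, "fortran" and
"asm" are ordinary identifiers, so code like this is accepted by a
compiler without those extensions and rejected by one that reserves
the words.

```c
/* "fortran" is not in the C.1.1 keyword list, so this declaration is
 * legal under that list.  A compiler that implements the E.5.4
 * extension and reserves the word would reject it. */
static int fortran = 77;

int fortran_level(void)
{
    return fortran;
}

/* The likelier accident, left commented out because compilers that
 * treat "asm" as a keyword will not accept it:
 *
 *     int asm = 0;
 */
```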

*Problem:	Breaks Code, Confusing Syntax, Unclear
*Reference:	C.1.2 Identifiers; Implementation Limits (p. 14)
*Description:

	The implementation must treat at least the first 31 characters
	of an *internal name* (an identifier that does not have
	external linkage -- described below) as significant.
	Corresponding lower-case and upper-case letters are different.
	The implementation may further restrict the significance of an
	*external name* (an identifier that has external linkage) to
	six characters and may ignore distinctions of alphabetical case
	for such names.  These characteristics are all implementation
	defined.

*Unclear:

Which characteristics are specifically referred to by the last
sentence?  All the ones described in the paragraph?  All the ones in
the preceding sentence?  I presume the latter, but it should be clear.

*Breaks Code:

The PDP-11/70 (Ritchie) compiler allowed seven significant characters.
I realize that this is not a standard, but much code has been written
with this as the minimal assumption.  (I'll be kind and not even
mention the code written under the many compilers which set no limit,
or a large (usually 31 or 32 character) limit.)  If one is asking
someone to rewrite a compiler (and many of the extensions would require
some extensive modifications to existing compilers), asking them to
modify a loader is not too much to add.  Having written a compiler,
assembler, and linker/loader, I think I am not speaking in the dark
here.  This is primarily an argument of relative effort.  To properly
implement "const" and "volatile", for example, including keeping the
peephole optimizers away from stuff they shouldn't touch, is probably
at a similar level of effort.

In any case, this will break existing programs written to run under
PDP-11's or above.  I suspect that case insignificance will do so too,
but solving this problem on non-ASCII machines is decidedly
nontrivial.  In a future letter I will make the case for the antithesis of
"common extensions", which is "allowable deletions", and I would argue
the lack of case distinction should be under the latter category.
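
The breakage described above can be sketched as a small check for
whether two external names become indistinguishable under a given
limit.  This is only an illustration: the function names_collide and
its parameters are hypothetical, not anything from the draft.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if a and b cannot be told apart when only the first
 * `sig` characters are significant and, when fold_case is nonzero,
 * alphabetical case is ignored.  Returns 0 otherwise. */
int names_collide(const char *a, const char *b, size_t sig, int fold_case)
{
    size_t i;

    for (i = 0; i < sig; i++) {
        unsigned char ca = (unsigned char)a[i];
        unsigned char cb = (unsigned char)b[i];

        if (fold_case) {
            ca = (unsigned char)toupper(ca);
            cb = (unsigned char)toupper(cb);
        }
        if (ca != cb)
            return 0;   /* differ within the significant prefix */
        if (ca == '\0')
            return 1;   /* both names ended: identical */
    }
    return 1;           /* identical over the first sig characters */
}
```

For example, "buffer1" and "buffer2" are distinct with seven
significant characters but collide with six, and "Total" and "total"
collide once case is ignored.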

*Confusing Syntax:

Generally, giving externally and internally linkaged (ick, what a
word!) identifiers different syntax is confusing and ugly, and appears
arbitrary (even if it isn't), thus making the language more difficult
to learn and debug.

A suggested wording:

	The implementation must treat at least the first 31 characters
	of identifiers as significant.  Corresponding lower-case and
	upper-case letters are different.  For an *internal name* (an
	identifier that does not have external linkage -- described
	below), these characteristics must always hold.  For an
	*external name* (an identifier that has external linkage), it
	is an allowable deletion [see below -- Ken] not to have case
	distinctions on machines where this is sufficiently difficult.

Just to give you a taste (and to clear this up and thus avoid some
discussion), my concept of an allowable deletion is something which is
recognized to be very difficult or impossible on certain systems, and
which the implementor is therefore allowed to punt, as long as (a) this
is documented up front (i.e., not hidden in some implementation
document that no one considering buying the compiler would read); and
(b) a reasonable warning is generated by the compiler.  In this case,
for example, the compiler could say something like

	"warning: case not significant in external variables"

the first time it encountered a global variable with upper case letters
(or mixed case variables, or something) and then shut up for the rest
of the compilation.
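
The warn-once behavior might look roughly like the sketch below.  The
flag and the function note_external_name are hypothetical, and the
rule used here (warn on the first external name containing an upper
case letter) is just one of the possibilities mentioned above.

```c
#include <ctype.h>
#include <stdio.h>

/* Set after the first warning; assumed to start at 0 for each
 * compilation. */
static int case_warning_given = 0;

/* Called for each external name the compiler sees.  Returns 1 if the
 * warning was printed on this call, 0 otherwise. */
int note_external_name(const char *name)
{
    const char *p;

    if (case_warning_given)
        return 0;       /* already warned: stay quiet */

    for (p = name; *p != '\0'; p++) {
        if (isupper((unsigned char)*p)) {
            fprintf(stderr,
                "warning: case not significant in external variables\n");
            case_warning_given = 1;
            return 1;
        }
    }
    return 0;           /* all lower case: nothing to warn about */
}
```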

*Problem:	Suggested extension
*Reference:	C.1.3.4  Character Constants (p. 19)
*Description:

Add '\e' for '\033'; see comment on section B.3.2 Character Display
Semantics.
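
For comparison, this is roughly what one writes without '\e'.  It is
only a sketch: the macro ESC and the function make_clear_screen are
hypothetical names, and the "[2J" sequence is just a typical terminal
control string used for illustration.

```c
#include <stdio.h>

#define ESC '\033'      /* the character the proposed '\e' would denote */

/* Writes ESC followed by "[2J" into buf.  Returns what snprintf
 * returns: the number of characters the full string needs, not
 * counting the terminating NUL. */
int make_clear_screen(char *buf, size_t size)
{
    return snprintf(buf, size, "%c[2J", ESC);
}
```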

*Problem:	Unjustified new feature
*Reference:	C.1.4 String literals; Syntax & Semantics (p. 20)
*Description:

	Syntax:
		string-literal:
			unit-string-literal
			string-literal  unit-string-literal

		...

	Semantics:

	A string literal has static storage class and type 'array of
	char,' and is initialized with the given characters.  The
	implementation concatenates unit string literals that are
	adjacent tokens into a single string literal.  It then appends
	a NUL character '\0' at the end.

This is a new feature.  It basically says that

	"hi" "there"

is equivalent to

	"hithere"

Now, the real question: Why?  What does adding this feature to the
language accomplish?  It just seems pointless to me, and adding
pointless features is not a good thing to do, for countless reasons I
will not enumerate unless called upon to do so.
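
For reference, the behavior in the quoted Semantics can be checked
with a short sketch like this one, assuming a compiler that implements
the draft's concatenation rule.  The names joined, single and
literals_match are hypothetical.

```c
#include <string.h>

/* Two adjacent unit string literals, concatenated by the compiler. */
static const char joined[] = "hi" "there";

/* The same text written as one literal. */
static const char single[] = "hithere";

/* Returns 1 if both arrays have the same size and contents,
 * including the single NUL appended after concatenation. */
int literals_match(void)
{
    return sizeof joined == sizeof single
        && memcmp(joined, single, sizeof single) == 0;
}
```

Both arrays occupy eight bytes: seven characters plus one NUL.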

------------------------------

End of mod.std.c Digest - Fri, 22 Feb 85 07:31:24 EST
******************************

USENET -> posting only through cbosgd!std-c.
ARPA -> replies to cbosgd!std-c@BERKELEY.ARPA (NOT to INFO-C)
In all cases, you may also reply to the author(s) above.