[comp.lang.c] Reserved words in C

dant@tekla.tek.com (Dan Tilque;1893;92-789;LP=A;60jB) (12/20/86)

In article <1524@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>
>In a slightly related area, I second Martin Minow's request for a list
>of all the predefined words ("keywords" and library routines and
>#define's) in ANSI C.  I suspect that if people saw all the new words
>in one place, they would chop back the list, or move those hundreds of
>words into the "prefix _" category.  Since the standards folks seem
>disinclined, anybody feel like reading the whole text and compiling the
>list?  (gee, sounds like another job for machine readable text...)
>

I'm fairly new to using C but one of the things I liked about it
was the small number of reserved words in the language.  I used to program
in COBOL (#include standard_retch.h), a language noted for having many
reserved words.  (Between the ANSI, CODASYL and IBM versions of the language
there were about 500 reserved words, with more possibly being added
in the COBOL 8X version.  This did not include library routines; these
were all keywords.)

When I first learned C, there were about 80 or so words I needed to
learn (reserved words, preprocessor commands, frequently used routine
names and some miscellaneous).  I didn't have to learn the library
routine names or #defines in headers I never used. 

Is this going to change?  Is C going to become like COBOL?  I certainly
hope not.


 Dan Tilque				dant@tekla.tek.com

karl@haddock.UUCP (Karl Heuer) (12/25/86)

In article <1016@zeus.UUCP> dant@tekla.tek.com (Dan Tilque) writes:
>I'm fairly new to using C but one of the things I liked about it was the
>small number of reserved words in the language. ... Is this going to change?
>Is C going to become like COBOL?  I certainly hope not.

Even in pre-ANSI C, the problem exists:

[a]  A naive user includes <stdio.h> to declare a FILE.  He doesn't know
     anything about "NULL", and tries to use that name as a local variable:
	register int NULL;	/* NUmber of Long Lines */
     This fails because NULL is a macro defined in that header file.  Hence
     its name must be considered "reserved", even though it's not known to the
     compiler proper.

[b]  A user who has read all the ANSI documents tries to write the following
     (legal) program:
	void write(s) char *s; { printf("%s\n", s); }
	main() { write("Hello, world\n"); return 0; }
     This will fail under most current implementations, because there is a
     library routine called write() which, although not mentioned by the user,
     is nevertheless invoked indirectly through printf().  Thus, every name in
     the standard library must be considered reserved, whether used or not.
     (Note that for this particular example, "write" is *not* part of the ANSI
     standard, so a strictly conforming implementation must not have printf()
     call write().  It may, however, have printf() call _write(), and provide
     a function named write() which also calls _write().)

This is a tough problem in general, and I don't know what the answer is.  (One
approach is the VMS-like notion of having a prefix on everything reserved:
"sys$write()", "mth$cos()", etc.  Ugly, but perhaps necessary.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/05/87)

In article <1016@zeus.UUCP> dant@tekla.tek.com (Dan Tilque) writes:
>When I first learned C, there were about 80 or so words I needed to
>learn (reserved words, preprocessor commands, frequently used routine
>names and some miscellaneous).  I didn't have to learn the library
>routine names or #defines in headers I never used. 

In a hosted environment, one's code is normally linked against
a "standard C library" that contains entry points for routines
such as fflush() etc.  Because some library routines might
invoke others (unbeknownst to the user), it has always been
unwise for the user to provide his own externally-visible
functions with the same names as any library extern, whether
or not he thought it was being used.  X3J11 hasn't done
anything new in this regard except to standardize a subset of
the C library and to introduce a handful of new externs/macros.

Please note that a conforming implementation is not permitted
to have any extensions that could alter the behavior of a
strictly conforming program; one implication of this is that
non-X3J11 entry points starting with anything other than _ must
NOT be present in the implementation's standard C library.
This isn't as bad as it sounds, since for example the UNIX "cc"
could require use of a flag to obtain a conforming environment
(the best way to implement this might be to provide the pure
X3J11 environment as a separate-but-equal library that invokes
system calls using names such as _read() instead of read()).
I'm not sure that's what X3J11 intended, but it's what I deduce
from section 1.6 of the Draft Proposed Standard; however, one
wonders why "additional library functions" are mentioned if
they would be so hard to provide (basically, there would have
to be a header file that redefined the externs as _-names).

faustus@ucbcad.berkeley.edu (Wayne A. Christopher) (01/06/87)

In article <5476@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
> Please note that a conforming implementation is not permitted
> to have any extensions that could alter the behavior of a
> strictly conforming program; one implication of this is that
> non-X3J11 entry points starting with anything other than _ must
> NOT be present in the implementation's standard C library.

If a standard library has a function called 'read' in it, I don't see how
this could alter the *behavior of the program*, although it may cause a 
program which would not otherwise compile to compile (i.e., the program
uses a function called 'read' that it doesn't define).  This is assuming
that the library routines that want to use read call the function _read
instead.  I guess a "strictly conforming" flag would be useful in a
UNIX compiler though...

	Wayne

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/06/87)

In article <1199@ucbcad.berkeley.edu> faustus@ucbcad.berkeley.edu (Wayne A. Christopher) writes:
>If a standard library has a function called 'read' in it, I don't see how
>this could alter the *behavior of the program* ...

/*	Demonstrate problem with extensions in the C library	*/

#include <stdio.h>

extern int	read();

int
main()	{
	int	c = read();

	(void) printf( "Got: 0%o\n", c );
	return 0;
	}

/*	The following might be in a separate file, so that
	external linkage will be forced to occur as expected	*/
int
read()	{
	return getchar();
	}

On a system (such as UNIX) where getchar() is implemented via an
eventual call to a function named read() in the C library, this
program will go bonkers, even though it is strictly conforming.

rbutterworth@watmath.UUCP (Ray Butterworth) (01/06/87)

In article <5476@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
> I'm not sure that's what X3J11 intended, but it's what I deduce
> from section 1.6 of the Draft Proposed Standard; however, one
> wonders why "additional library functions" are mentioned if
> they would be so hard to provide (basically, there would have
> to be a header file that redefined the externs as _-names).

The solution used by the C (and B and Pascal) library maintained
here is to have two names for every external (trivial when the
source is in assembler, possibly more difficult if it's in C
(maybe that would be a good use for the "entry" keyword?)).
All library functions that call other library functions do so
by their "other" name.  Thus a user program could define functions
called printf() or write(), while any library functions that need
the standard printf or write for error messages will still get the
standard version, not the user's.  printf is rather obvious and not
likely to be accidentally redefined by a user; there are other
names that aren't so obvious (e.g. BSD's index), and such a scheme
prevents accidents.  If a user really wants to redefine a library
function, he can use its alternate internal name.

It would have been nice if ANSI had required some such scheme.

It would also have been nice if ANSI had defined its standard library
following its own reserved name rule (i.e. a leading underscore).
For instance the stdio functions would have been _printf(),
_putchar(), etc.  For compatibility with existing implementations
a <compat_bsd.h> or <compat_whitesmiths.h> or <compat_KandR.h> or
whatever could have been provided with appropriate defines to map
the names or even the calling sequences.

This would also solve the problem of names such as isascii()
being reserved but undefined.  Any library that now uses or defines
isascii() is in violation of the standard.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/08/87)

In article <4207@watmath.UUCP> rbutterworth@watmath.UUCP (Ray Butterworth) writes:
>...  Any library that now uses or defines
>isascii() is in violation of the standard.

I'm not sure that's true.  The only reference to this I found
was in 4.13.1, which indicates a possible future direction for
evolution of the standard, as a warning to users and implementors.
That doesn't keep them from using is[a-z]* functions now, although
if one does so he may have to change his code later to conform to
a future revision of the standard.