[net.lang.c] Reserving identifiers for future use.

kpmartin@watmath.UUCP (Kevin Martin) (07/19/84)

>From phipps@fortune.UUCP (Clay Phipps):
>Why do I continually run across the suggestion that prefixing names
>with underscores makes them unique, as if no one has used underscore
>characters for some special purpose before ?  
It doesn't make them unique. However, it does allow the documentation
to state that if the user supplies his own identifiers which begin
with underscore, he will (eventually) get burned. This effectively
reserves 1/27th of the possible identifiers, counting by first character
(26 letters plus the underscore), or 1/53rd if case is distinguished.
No matter what naming convention is used for "reserved names", there will
exist some programs which already use such names.

>Sorry, the underscore has already been spoken for.
>The VAX C compiler uses that convention for external names, for example.
>Conventional VAX UN*X subroutine names, therefore, all begin with "_".
Only if you look at the 'as' input or beyond. Any C identifiers which
already start with an underscore end up starting with two.

>Making names all upper case isn't adequate, either.
I agree. Many people make all of their #define'd symbols uppercase.

>What is really needed is a name qualification or prefixing convention
>that can be applied across all of UN*X, for example,
>
>    <prefix> "_" <mnemonic name>
>
>The prefix would be the name of the program or routine package;
>for example, "cpp" for the C Preprocessor, "lp" for the Pascal Library, &c.
>Thus, "unix" would become, for example, "cpp_unix", 
>"waterloo" would be "cpp_waterloo", and Pascal Library "IN" could be "lp_in".
The problem with this is that it effectively reserves *every* name containing
an underscore (since the user has no clue as to what might become a
'prefix' in the future). Besides, Joe User shouldn't have to know that the
pass that happens to process an identifier was called 'cpp' in the twilight
ages of computing. With all this talk of #if sizeof(...), CPP might well have
to disappear as a separate entity.
(Also, one would hope that the naming convention would be useful and used
on non-unix systems too.)

There actually seem to be two problems here. Both involve conflicts between
users' identifiers and internally-generated identifiers. I suspect that the
distinction between the problems stems from whether the identifier occurs
in the original source code or not.

For example, the problem with C or Pascal operators implemented as function
calls is easily solved: the user's external symbols all have an underscore
prepended, and the built-in operators don't. Thus a user function can never
conflict with a built-in function (like 'in', which becomes 'lp_in', as
suggested above).
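
A minimal sketch of that mapping, written in present-day C. The helper
names user_symbol and builtin_symbol are hypothetical and stand in for
whatever the compiler does internally; the sketch assumes built-in names
never begin with an underscore themselves.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the compiler's external-name mapping: every
   user-written external gets exactly one underscore prepended. */
static void user_symbol(const char *c_name, char *out, size_t out_size)
{
    snprintf(out, out_size, "_%s", c_name);
}

/* Hypothetical: runtime-support routines (such as a Pascal 'in'
   operator) are emitted under their bare name, with no prefix. */
static void builtin_symbol(const char *name, char *out, size_t out_size)
{
    snprintf(out, out_size, "%s", name);
}
```

Under this model a user function 'in' is emitted as '_in' while the
built-in stays 'in', so the two cannot meet at link time. A user
identifier that already starts with an underscore, such as '_flsbuf',
comes out with two.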

On the other hand, symbols like 'unix' or '_flsbuf', which (eventually) appear
in the source code, cannot benefit from this solution. They require a naming
convention. The simplest convention is "Any identifier containing an
underscore may become reserved in the future". This is unacceptable, since
it leaves no 'break character' for users' identifiers. Another simple rule
is "Any identifier ending with an underscore ...", but this gets clipped by
the loss of trailing characters in various compilers and linkers.
"Any identifiers beginning with underscore ..." only uses a small fraction
of the available identifiers and is easy to describe.
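
To make the comparison concrete, here is a rough sketch in present-day C.
The function names are hypothetical, and the six-character limit used in
the note below is only an illustrative assumption, not a claim about any
particular compiler or linker.

```c
#include <stddef.h>
#include <string.h>

/* The leading-underscore rule: such names may become reserved. */
static int may_become_reserved(const char *name)
{
    return name[0] == '_';
}

/* Hypothetical model of a tool that keeps only the first `limit`
   characters of a name. The result is copied into `out`, which must
   have room for at least one byte (out_size > 0). */
static void truncate_name(const char *name, size_t limit,
                          char *out, size_t out_size)
{
    size_t n = strlen(name);
    if (n > limit)
        n = limit;
    if (n >= out_size)
        n = out_size - 1;
    memcpy(out, name, n);
    out[n] = '\0';
}
```

With an assumed limit of 6, 'buffer_' is cut to 'buffer' and the trailing
marker is lost, whereas '_buffer' is cut to '_buffe' and still tests as
reserved.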

                           Kevin Martin, U. of Waterloo

rcd@opus.UUCP (07/27/84)

From earlier postings - granted, underscores don't solve the problem.  They
allow only one distinction, which is to say they divide the universe of
identifiers into two classes, and that's not enough.

I think a better solution to the problem lies along the lines of grouping
identifiers into (named) scopes.  Simula has this in a sense in the rules
for classes.  More recently, Modula provides mechanisms which allow/require
explicit declarations to get identifiers into a particular piece of code,
either without qualification or only when a specific qualifier is given.
(I'm intentionally being fuzzy; I don't want to discuss the whole business
here.  The general idea is sufficient.)  To show how this might work:  If
you know that the identifiers which are supplied by the compiler are part
of a particular set of names, you explicitly state that those names are
to be made accessible when you need them in a module.  This doesn't
completely solve the problem (since there are still possibilities for
overlap in the name space of the containers of identifiers), but it goes a
long way.
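
C has no such mechanism, but the flavor can be roughly approximated with
a struct acting as the named container. This is only a sketch under that
assumption: the names lp_scope, lp, and lp_in_impl are hypothetical, and
the set-membership routine is a stand-in for a Pascal-style 'in'.

```c
/* Hypothetical container of library names. The only identifier the
   user has to avoid is the container's own name, `lp`. */
struct lp_scope {
    int (*in)(int element, unsigned long set);  /* set-membership test */
};

/* Implementation kept file-local (static), so it is not visible to
   other translation units. Sets are modelled as 32-bit masks. */
static int lp_in_impl(int element, unsigned long set)
{
    return element >= 0 && element < 32 && ((set >> element) & 1UL);
}

static const struct lp_scope lp = { lp_in_impl };

/* A user function that happens to be called `in` does not collide
   with the library routine, which is reached only as lp.in(...). */
static int in(int x)
{
    return x + 1;
}
```

The library routine is called as lp.in(3, 0x8UL) and the user's own as
in(1). The remaining overlap is exactly the one noted above: some other
package might also want to call its container 'lp'.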
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
	...A friend of the devil is a friend of mine.