[comp.lang.c] ANSI C -- site identification

minow@decvax.UUCP (Martin Minow) (12/14/86)

This is one of a collection of comments on the Draft Standard, posted to
comp.lang.c for discussion before I mail a final draft to the ANSI C
committee.  Each message discusses one problem I have found with the Draft
Standard that I feel warrants a "no" vote.  Note that this message is my
personal opinion, and does not reflect on the opinions of my employer.

---- Problem:

Page 82, line 34. Many, if not all, existing implementations pre-define
implementation-specific preprocessing variables that specify the processor,
operating system, and, in some cases, the compiler name.  For example, Decus
C predefines ``pdp11'', ``decusc'', and either ``rt11'' or ``rsx.''  For
better or for worse, this tradition is almost ten years old.  Some provision
should be made for this in Standard C.

---- Motivation:

Page 82, line 34.  A provision is needed to permit implementation-specific
preprocessor definitions.  The existing practice of predefining a
more-or-less random collection of variables does not work well, but the
capability is essential to anyone writing portable programs.

The following is presented as a possible solution, but I would note
that it has never been implemented:

  -- Extend the syntax of the #if statement expression to permit comparing
     two strings.  The relational and equality operators would use the
     strcmp() function when both operands are quoted strings.

  -- Add (at least) the following predefined variables to the preprocessor
     (the values are implementation dependent character strings, and are
     defined as "" if unspecified).  These variables may be
     un- and re-defined by the program.

     __PROCESSOR__	defines the processor that the compiler is
			targeted for.

     __SYSTEM__		defines the operating system that the compiler is
			targeted for.

     __COMPILER__	defines the compiler family name (this could be a
			manufacturer's name).

     __VERSION__	defines the compiler release or patch level.

     __HOST_SYSTEM__	defines the operating system on which the compiler
			is running (this is needed to specify #include
			file names).

When this is done, the programmer could write conditional expressions
such as

    #if __PROCESSOR__ == "pdp11" && __VERSION__ > "V4.01"
    ...
    #endif

I have done essentially the above in a number of programs, such as Decus cpp,
that operate on a variety of processor and operating system configurations.

The changes that would need to be made to the #if processor
are roughly as follows:

  -- The lexical analyser must accept string constants.

  -- The expression evaluator must test for the case where both
     operands are strings and a relational or equality operator
     is being evaluated.  If this is the case, strcmp() is called
     and the evaluation stack changed to appropriate numeric values
     (<, >, <=, >=, and == are straightforward; != requires some fudging).
     The evaluation then proceeds normally.

  -- The # and ## operators must be added, with appropriate precedence.
     Then, one could write

	#define RELEASE	1
	#define PATCH	3
	#if __VERSION__ >= #RELEASE ## "." ## #PATCH


----

Martin Minow
decvax!minow

gwyn@brl-smoke.ARPA (Doug Gwyn) (12/15/86)

In article <110@decvax.UUCP> minow@decvax.UUCP (Martin Minow) writes:
>Page 82, line 34. Many, if not all, existing implementations pre-define
>implementation-specific preprocessing variables that specify the processor,
>operating system, and, in some cases, the compiler name.  For example, Decus
>C predefines ``pdp11'', ``decusc'', and either ``rt11'' or ``rsx.''  For
>better or for worse, this tradition is almost ten years old.  Some provision
>should be made for this in Standard C.

I missed the discussion on this, so I can't tell you why it's
specified the way it is, other than perhaps to avoid unpleasant
surprises such as:
	int	sun;	/* struct or union flag */
	int	sel;	/* selection value */
both of which get mangled by pre-#defines that I'm aware of.
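
A small sketch of the mangling, for readers who have not been bitten by it.  The value 1 for sun is an assumption (a stand-in for whatever a vendor's compiler predefines), and the STR/XSTR helper macros and the name mangled_decl are made up for the illustration.  Stringifying the declaration after macro expansion shows what the compiler proper would be handed:

```c
#define sun 1            /* stand-in for a vendor predefine; the value is assumed */

#define STR(x)  #x
#define XSTR(x) STR(x)   /* expand the argument first, then stringify it */

/* The text that reaches the compiler in place of "int sun;" */
static const char *const mangled_decl = XSTR(int sun;);
```

Here mangled_decl holds "int 1;", which is why the innocent-looking declaration draws a syntax error on such a system.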

There has been a suggestion made that such pre-defined names
could be handled in a special way, to avoid this kind of problem,
but so far I haven't seen a really appealing proposal for this.

>     __PROCESSOR__	defines the processor that the compiler is
>			targeted for.

The problem with such things is that they are only useful if
there is standardization of the "official" formats and lists
of possible values.  It is definitely outside the scope of
X3J11 to set up such schemes.

minow@decvax.UUCP (Martin Minow) (12/17/86)

Commenting on my suggestion for some mechanism for site identification,
Doug Gwyn (@ brl-smoke.arpa) notes problems with
	int	sun;
(when sun was #defined).  One hackish solution would be to define the site as
	#define	sun	sun
Infinite expansion is prevented by the restriction on page 79, line 33ff.
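
A minimal sketch of that trick, assuming the implementation supplies the self-referential definition (here it is written out by hand, and the function name sun_is_predefined is made up for the example):

```c
#define sun sun   /* written by hand here; the idea is that the compiler predefines it */

/*
 * An ordinary object named sun still compiles: the macro is replaced
 * by its own name once and is not expanded again.
 */
int sun = 3;

/* #ifdef still sees the site identification. */
int sun_is_predefined(void)
{
#ifdef sun
    return 1;
#else
    return 0;
#endif
}
```

Existing code that tests the name with #ifdef keeps working, and code that uses the name as an identifier is no longer mangled.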

My off-the-wall suggestion to extend preprocessing to allow strings
(assuming it's workable) solves the problem by defining __PROCESSOR__
(etc.) symbols that have implementation-defined *string* content.
Thus, there is no registry that assigns numbers to implementors.
Since implementors generally trademark their names, there's no
real risk of spoofing.

Again, sorry about the length.

Martin Minow
decvax!minow

rpw3@amdcad.UUCP (Rob Warnock) (12/18/86)

+---------------
| My off the wall suggestion to extend preprocessing to allow strings
| (assuming it's workable), solves the problem by defining __PROCESSOR__
| (etc.) symbols that have implementation-defined *string* content.
| Thus, there is no registry that assigns numbers to implementors.
| Since implementors generally trademark their names, there's no
| real risk of spoofing.  | Martin Minow | decvax!minow
+---------------

Well, you could always standardize on the machine & system names
listed in RFC960 (which are all-caps strings). Though to make it
useful, you would have to add a string-equality operator to the
pre-processor... ;-}

#if streql(__PROCESSOR__,"VAX-11/780")
...
#endif


Rob Warnock
Systems Architecture Consultant

UUCP:	{amdcad,fortune,sun}!redwood!rpw3
DDD:	(415)572-2607
USPS:	627 26th Ave, San Mateo, CA  94403

gnu@hoptoad.uucp (John Gilmore) (12/19/86)

Martin Minow proposed a way for a program to figure out what machine it
is being compiled for by using __XXX__ names.  Rob Warnock proposed
that these names take values assigned by the Internet folks.  Both parties
missed the boat.

The problem is not that we have no way to tell what machine is being
compiled for.

The problem is that we have a few million tons of code that does so using
#ifdef (and, occasionally, #if), and ANSI C disallows this.

People have suggested that predefining e.g. sun as sun would fix this.
This fixes it for #ifdef and for use in code, but does not fix it for
#if.  If I say

#define sun sun
#if sun

what is the result?  Maybe I am slow tonight, but the result is not
obvious to me...
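
For what it is worth, under the rule as it was eventually published in the C standard (an assumption about the final text, not something settled in this discussion), any identifier still left in an #if expression after macro expansion is replaced by 0.  With the self-referential definition, sun expands to sun, is not expanded again, and is then treated as 0, so the test is false.  A small sketch, with a made-up function name:

```c
#define sun sun

/* Returns 1 if "#if sun" is taken, 0 otherwise. */
int sun_if_result(void)
{
#if sun
    return 1;
#else
    return 0;
#endif
}
```

So the self-definition keeps #ifdef sun working but silently turns #if sun into a false test, which is the gap being pointed out here.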

I think that requiring such configuration #define's to define an identifier
to itself is better than nothing.  Does anyone know of a system that does
that now?  If not, I can't say that we understand the consequences of such
a change.

John Rogers (fortune!foros1!jr) compiled a list of predefined CPP
symbols in 1984 and claimed to be maintaining the list.  I won't tell
you what they all mean, but here is the 1984 list:

AOSVS, DATAGENERAL, DGUX, I8086, ON_SEL, PDP11, PWB, RES, RT, TM_DPS6,
TM_L66, TS, TS_GCOS, TS_MOD400, V7, VIII, VV, __DATE__, __FILE__,
__LINE__, __PAGE__, aegis, aosvs, apollo, cpm, datageneral, decus,
dgux, ebcdic, gcos, hp9000s200, hp9000s500, ibm, ibm370, interdata,
kl10, lint, m68000, m68k, mbb, mc68000, mert, mts, nomacarg, ns32000,
orion, os, pdp11, pe3200, pyr, rsx, sel, selport, sun, tahoe, tops20,
tss, u370, u3b, u3b2, u3b5, univac, unix, vax, vax11c, vms, z8000.

He also proposed adding:

bds, ccpm86, cpm68k, cpm80, cpm86, gnu, i80186, i80286, i8080,
i8086, mc68008, mc68010, mc68020, mpm, msdos, pcdos, power5,
xinu, z80, z800.

In a slightly related area, I second Martin Minow's request for a list
of all the predefined words ("keywords" and library routines and
#define's) in ANSI C.  I suspect that if people saw all the new words
in one place, they would chop back the list, or move those hundreds of
words into the "prefix _" category.  Since the standards folks seem
disinclined, anybody feel like reading the whole text and compiling the
list?  (gee, sounds like another job for machine readable text...)

-- 
John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu   jgilmore@lll-crg.arpa
Call +1 800 854 7179 or +1 714 540 9870 and order X3.159-198x (ANSI C) for $65.
Then spend two weeks reading it and weeping.  THEN send in formal comments!