[net.lang.c] Draft standard questions

chris@umcp-cs.UUCP (07/14/86)

>In article <2364@umcp-cs.UUCP> I wrote:
>>...  The draft standard says that because `sin' appears here
>>without a following `(', it is not macro-expanded; thus this in
>>fact passes the address of the library function.

In article <1054@ttrdc.UUCP> levy@ttrdc.UUCP (Daniel R. Levy) replies:
>If so, this is a "new" feature at least compared with AT&T SysV (could
>someone check out BSD and others?) C compilers, which give an error message
>about "argument mismatch" if something like this is tried.

As far as I know, the current preprocessor description in the
standard is unlike any and all available preprocessors.  According
to Jim Williams there is still some argument over just what is to
be `correct' behaviour.  (It is handy, this knowing someone who
actually goes to these meetings.  I hope I have not too badly
mis-paraphrased Jim.)  This is substantiated by the minutes sent
out to `alternate members'; there were, it seems, some four different
formal descriptions proposed.  If there is enough interest I will
borrow Fred's minutes again and post an extract.  (I am not even
an alternate member: `interested bystander' is rather more accurate.)

>I thought standards were supposed to, er, be codifications of a
>combination of the best features in existing compilers, not invent
>things out of the blue sky...

There are some rather major variances between existing C compilers,
and in each such case the committee must either pick one, or invent
something new.  Most of the `inventive stuff' in the part traditionally
done by /lib/ccom comes from C++, with which, at least, people have
had experience.  I have reservations about how well some of the
other `inventions' will work.

By the bye, that reminds me: Jim was going to mention this, but I
have not seen it, so I will do it.  The current draft standard
contains two new preprocessor operators, the `stringiser' `#' and
the token-concatenator `##'.  For example,

	#define glorp(a, b) glump("a " #a " string", _##b)
	glorp(strange, thing)

turns into

	glump("a strange string", _thing)

In one of the meetings someone proposed using `#"' and `#&', as
they are at least slightly more suggestive as to what is occurring:

	#define	glorp(a, b) glump("a " #"a " string", _#&b)

This was quickly followed by another proposal:

	#define	glorp(a, b) glump("a " __STR(a) " string", __CAT(_, b))

Naturally, this latter proposal caused much stir, and most likely
due to lack of time for consideration, both were rejected.

I think the current `#' and `##' are rather unfortunate forms, not
suggestive of anything, and leaving little room for future expansion.
`#"' and `#&' seem better---although I am uncertain of the first,
for it makes discovering what is quoted in unpreprocessed text more
difficult: a little more for programs, and, I think, much more for
programmers.  The final form is better still, but seems to have
rather more impact on the rest of the language (by which I mean I
am not ready to decide about it without first trying it).

Now then, if there is great outcry *and* agreement on the part of
the `C community' (as represented by net.lang.c), we may be able
to change this.  I would like to propose a modification of the
third form above:

	#define	glorp(a, b) glump("a " #str(a) " string", #cat(_, b))

This is somewhat readable (unlike `#' and `##', and `#"'), and
more importantly, expandable: if `#str' and `#cat' turn out to
be bad or insufficient, there are alternatives.  `#' and `##'
leave `###', `####', `#####' ... or perhaps `###func(args)':
hardly aesthetic.

On the other hand, we could try to come up with a formal definition
of Reiser-cpp semantics. . . .
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

gwyn@BRL.ARPA (VLD/VMB) (07/15/86)

__STR__ and __CAT__ were suggested by Tom Plum in a late-night
working session to resolve the remaining preprocessor issues,
after general committee rejection of Dave Prosser's innovative
proposals for prefix concatenation, stringize, and charize
operators.  These facilities have to be considered in the light
of a total specification of the phases of translation, which
raises such questions as whether macro substitution may paste
text onto the left of what was the current working set of
tokens, and whether the pasted result is then rescanned.
Much of the difficulty comes
from an attempt to accommodate macro substitution on #include
lines, which some vendors support and claim their customers
really need for portability reasons (remember that different
OSes have different ideas of file names).

The preprocessor experts (I'm not one of them) seemed to think
that __STR__ and __CAT__ built-in macros were the best solution
under the constraints.  However, the committee members had not
had sufficient time to consider this proposal and didn't want
to risk adopting it without sufficient study.

The Reiser CPP is definitely out.  It can be modified with some
effort to conform with the current X3J11 proposal, but it has
too many technical problems to insist on everyone duplicating
its behavior.  In particular, many excellent C compilers handle
preprocessing as part of the general lexical analyzer rather
than as a totally separate pass; such tokenizing approaches
have to be accommodated by the standard.

I think Donn Seeley intends to upgrade the 4.nBSD CPP to X3J11.