[comp.std.c] Token pasting in #include directive

gwyn@smoke.BRL.MIL (Doug Gwyn) (11/26/89)

In article <11188@riks.csl.sony.co.jp> diamond@ws.sony.junet (Norman Diamond) writes:
>In article <11160@riks.csl.sony.co.jp> I asked about token pasting in
>the #include directive, and then snidely remarked,
>>>I must ask again which carries more weight, the stated rules or the
>>>examples.
>My real question does not appear to have been answered.  Only the
>snide subsidiary question seems to matter.  That's what I get from
>posting to usenet, eh?

I don't know WHY you post the questions you do.  They're often worded
like you're trying to show your intellectual superiority.  That may be
one of the reasons you elicit hostile responses.

>In article <1989Nov22.222413.3874@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>>I must reply again :-), if you read section 1.4 very carefully, you will
>>discover that the examples are technically not part of the standard.
>Yes, but in other cases it has been determined that the examples
>properly reflect the committee's intent, while the words do not
>reflect the committee's intent.  And we are supposed to obey their
>intent instead of their words.

You are misrepresenting previous discussions.  Both the "words" (formal
specification) and the examples reflect the committee's intent.  In some
cases referring to the examples can help one understand the words, and
in practically all cases you have to refer to the words to understand
the examples.  The examples are not provided as a tutorial, but rather
to illustrate the application of the rules.

Henry's comment was beside the point.  We believe all the examples to be
correct.

No matter WHAT interpretation you personally place on the words, it is
inescapable that you're applying an interpretation of some sort,
involving your particular background, knowledge of English usage,
analogies with other technical writing, expectations about programming
languages, and so on.  X3J11 tried their best to anticipate possible
ambiguity and misunderstanding, and in fact provided examples and a
Rationale document to help guide readers of the Standard; however,
obviously they could not possibly guarantee that NObody would read the
specification as saying something unintended or draw wrong conclusions
from the specification.  This is simply an inevitable property of
language as such, and does not necessarily indicate a deficiency in
the specification.

I've seen a lot of programming specifications, and in my opinion X3.159
is among the very best.  That doesn't mean that it is easy to fully
comprehend nor that it is impossible to misunderstand it.  It may even
have a few genuine technical errors that went undetected through three
public reviews, but you're not pointing out cases of those, merely
(in my opinion, fairly perverse) ways of misinterpreting correct specs.

>The conclusion to my real question seems obvious now.  The pasting of
>tokens in the #include directive is implementation-defined, but all
>implementations must define it in the same manner as the example
>(which in fact requires a form of pasting which conflicts with the
>rules for preprocessing everything else in a source program).

No, both examples of macro replacement within an #include are correct
(I manually checked them), and all conforming implementations must
reproduce that behavior exactly.  Section 3.8.2 is quite explicit
about the processing of all three forms of #include.  There is nothing
implementation-defined about the examples, and the normal pasting rules
do apply.

jfh@rpp386.cactus.org (John F. Haugh II) (11/26/89)

In article <11685@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>I don't know WHY you post the questions you do.  They're often worded
>like you're trying to show your intellectual superiority.  That may be
>one of the reasons you elicit hostile responses.

just put him in your KILL file and be done with him.  the last ten
or so totally stupid questions have come from his account.

>You are misrepresenting previous discussions.  Both the "words" (formal
>specification) and the examples reflect the committee's intent.  In some
>cases referring to the examples can help one understand the words, and
>in practically all cases you have to refer to the words to understand
>the examples.  The examples are not provided as a tutorial, but rather
>to illustrate the application of the rules.

put it another way, an example is merely ONE example.  there may
exist many other examples and a complete enumeration of all possibly
legal examples would deforest north america faster than the current
explosion of worthless UNIX manuals.

when an item is defined to be `implementation defined' the example
should be considered to be an example of how ONE implementation
may choose to define the behavior.
-- 
John F. Haugh II                        +-Things you didn't want to know:------
VoiceNet: (512) 832-8832   Data: -8835  | The real meaning of IBM is ...
InterNet: jfh@rpp386.cactus.org         |   ... I've Been to a Meeting.
UUCPNet:  {texbell|bigtex}!rpp386!jfh   +--<><--<><--<><--<><--<><--<><--Yea!--

afscian@violet.waterloo.edu (Anthony Scian) (11/27/89)

In article <11685@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
[pANSI examples discussion]
>Henry's comment was beside the point.  We believe all the examples to be
>correct.
What about the library prototypes that are coded "int foo( int x, int y )"
when they should be "int foo( int __x, int __y )"?

In these cases the example is not correct but more readable.
(disclaimer: this may have changed since I last saw the standard :^)

Anthony
//// Anthony Scian afscian@violet.uwaterloo.ca afscian@violet.waterloo.edu ////
"I can't believe the news today, I can't close my eyes and make it go away" -U2

diamond@csl.sony.co.jp (Norman Diamond) (11/27/89)

In article <11685@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:

>I don't know WHY you post the questions you do.

Because the standard says certain things that do not make much sense,
and I try to find out what it should have said, along with the
implication that it should be made to say what it should have said.

>They're often worded
>like you're trying to show your intellectual superiority.

Gee, I'm sorry if technical analysis and efforts to obey the rules are
intellectually superior to the development and coding of the rules.
I was not trying to show such superiority.

>That may be
>one of the reasons you elicit hostile responses.

Actually the responses did not seem particularly hostile, but I have a
tendency to become hostile to dishonest replies.  Such as suggestions
that the standard says things that it does not say.  Such as blaming
English for ambiguity in statements where the authors made absolutely
no attempt to begin writing those statements, i.e. where the standard
says nothing.

From one of my postings:

>>Yes, but in other cases it has been determined that the examples
>>properly reflect the committee's intent, while the words do not
>>reflect the committee's intent.  And we are supposed to obey their
>>intent instead of their words.

>You are misrepresenting previous discussions.

I am not.

>Both the "words" (formal
>specification) and the examples reflect the committee's intent.

In some cases it was determined that the "words" do not reflect the
committee's intent.

>It may even
>have a few genuine technical errors that went undetected through three
>public reviews,

It certainly does.

>but you're not pointing out cases of those

I certainly did.  So did several others, and they received the same
sort of dishonest answer from you.  For example, I have been watching
for you to apologize and admit that the standard does not limit an
identifier with external linkage to having only one initialization
(when the identifier is never used in an expression).  The poser of
that error seems less determined than I am to follow up on these errors.

>>The conclusion to my real question seems obvious now.  The pasting of
>>tokens in the #include directive is implementation-defined, but all
>>implementations must define it in the same manner as the example
>>(which in fact requires a form of pasting which conflicts with the
>>rules for preprocessing everything else in a source program).
>
>No, both examples of macro replacement within an #include are correct
>(I manually checked them), and all conforming implementations must
>reproduce that behavior exactly.  Section 3.8.2 is quite explicit
>about the processing of all three forms of #include.  There is nothing
>implementation-defined about the examples, and the normal pasting rules
>do apply.

Page 89, lines 14 to 17:  The method by which a sequence of
preprocessing tokens between a < and a > preprocessing token pair or a
pair of " characters is combined into a single header name preprocessing
token is implementation-defined.

Page 93, line 17:  #include xstr(INCFILE(2).h)

Page 93, line 23:  #include "vers2.h"

The method by which the '2' gets pasted to the '.' is implementation
defined, according to the "words" of the standard.  According to the
example, all implementations must define this pasting in the same
manner.  According to Doug Gwyn, "implementation-defined" does not
mean implementation-defined.

And you wonder why I'm hostile?

-- 
Norman Diamond, Sony Corp. (diamond%ws.sony.junet@uunet.uu.net seems to work)
  Should the preceding opinions be caught or     |  James Bond asked his
  killed, the sender will disavow all knowledge  |  ATT rep for a source
  of their activities or whereabouts.            |  licence to "kill".

diamond@csl.sony.co.jp (Norman Diamond) (11/27/89)

In article <17358@rpp386.cactus.org> jfh@rpp386.cactus.org (John F. Haugh II) writes:

>the last ten or so totally stupid questions have come from his account.

Try reading the sections of the standard that are under dispute, and
try noticing what it actually says.  The standard is just as stupid as
my questions, and that is the reason for my questions.

-- 
Norman Diamond, Sony Corp. (diamond%ws.sony.junet@uunet.uu.net seems to work)
  Should the preceding opinions be caught or     |  James Bond asked his
  killed, the sender will disavow all knowledge  |  ATT rep for a source
  of their activities or whereabouts.            |  licence to "kill".

tneff@bfmny0.UU.NET (Tom Neff) (11/28/89)

Note: since many others read these proceedings, including people Doug
and Henry aren't annoyed at, it would be better to confine the personal
venom to email and the straight technical answers (none of which have
appeared yet on this question) to the newsgroup.  If the Standard
requires no explication on specific points, let's close the newsgroup.

-- 
Stalinism begins at home.  }{  Tom Neff  }{  tneff@bfmny0.UU.NET

gwyn@smoke.BRL.MIL (Doug Gwyn) (11/28/89)

In article <17358@rpp386.cactus.org> jfh@rpp386.cactus.org (John F. Haugh II) writes:
>when an item is defined to be `implementation defined' the example
>should be considered to be an example of how ONE implementation
>may choose to define the behavior.

I don't recall ANY of the examples in the Standard attempting to
illustrate one interpretation of implementation-defined behavior.
Certainly the example under discussion was not one such.

gwyn@smoke.BRL.MIL (Doug Gwyn) (11/28/89)

In article <11193@riks.csl.sony.co.jp> diamond@ws.sony.junet (Norman Diamond) writes:
>In article <11685@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>Because the standard says certain things that do not make much sense,

Just because they don't make sense to YOU does not necessarily mean
that they don't make sense.

>For example, I have been watching for you to apologize and admit
>that the standard does not limit an identifier with external linkage
>to having only one initialization (when the identifier is never used
>in an expression).

Perhaps the reason you have to keep waiting is that I have not been
convinced that the Standard has a problem in this area.  Section 2.1.2
makes the initialization model pretty clear, and combined with the
obvious notion that a variable holds a single value at a time it
precludes multiple simultaneous initializers.

>Page 89, lines 14 to 17:  The method by which a sequence of
>preprocessing tokens between a < and a > preprocessing token pair or a
>pair of " characters is combined into a single header name preprocessing
>token is implementation-defined.

Which is irrelevant since it is the THIRD form of #include (involving
neither "" nor <> until after macro replacement is complete) that we
are concerned with.

>The method by which the '2' gets pasted to the '.' is implementation
>defined, according to the "words" of the standard.

NO, it is NOT.  Macro replacement and stringizing is well defined and
does not permit variation in this respect.
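
A minimal sketch of that third form, assuming the macro definitions
that accompany the example in the Standard (str, xstr and INCFILE are
reproduced here as assumptions, since this thread quotes only the
#include line itself; the two function names are made up for
illustration).  Because the argument of xstr is macro-replaced before
the # operator stringizes it, the resulting string can be inspected
directly rather than being handed to #include:

```c
#include <string.h>

#define str(s)      # s
#define xstr(s)     str(s)
#define INCFILE(n)  vers ## n

/* Hypothetical helper: yields the same string literal that
   "#include xstr(INCFILE(2).h)" would end up naming. */
const char *incfile_name(void)
{
    /* INCFILE(2) is replaced by vers2 first; only then does
       str() stringize the whole argument vers2.h */
    return xstr(INCFILE(2).h);
}

/* Returns 1 when the expansion is spelled exactly "vers2.h". */
int incfile_matches(void)
{
    return strcmp(incfile_name(), "vers2.h") == 0;
}
```

Nothing between a < and > or a pair of " characters is being combined
here; the quotes only come into existence as the result of #.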

>According to the example, all implementations must define this
>pasting in the same manner.

Right.  Also according to the words in the Standard.

>According to Doug Gwyn, "implementation-defined" does not mean
>implementation-defined.

No, that's according to you.  This is NOT NOT NOT implementation
defined, as I've said before.

>And you wonder why I'm hostile?

Probably has something to do with not listening to what others are
saying.

scjones@sdrc.UUCP (Larry Jones) (11/28/89)

In article <18672@watdragon.waterloo.edu>, afscian@violet.waterloo.edu (Anthony Scian) writes:
> What about the library prototypes that are coded "int foo( int x, int y )"
> when they should be "int foo( int __x, int __y )"?

Eh?  Why is the second any more correct than the first?  Since
the argument names in a prototype only have prototype scope, they
can't conflict with any other names in the program and therefore
do not need leading underscores.
----
Larry Jones                         UUCP: uunet!sdrc!scjones
SDRC                                      scjones@SDRC.UU.NET
2000 Eastman Dr.                    BIX:  ltl
Milford, OH  45150-2789             AT&T: (513) 576-2070
"You know how Einstein got bad grades as a kid?  Well MINE are even WORSE!"
-Calvin

chris@mimsy.umd.edu (Chris Torek) (11/28/89)

>In article <18672@watdragon.waterloo.edu> afscian@violet.waterloo.edu
>(Anthony Scian) writes:
>>What about the library prototypes that are coded "int foo( int x, int y )"
>>when they should be "int foo( int __x, int __y )"?

In article <970@sdrc.UUCP> scjones@sdrc.UUCP (Larry Jones) writes:
>Eh?  Why is the second any more correct than the first?  Since
>the argument names in a prototype only have prototype scope, they
>can't conflict with any other names in the program and therefore
>do not need leading underscores.

The following is (apparently---anyone who can supply text proving otherwise
is welcome to follow up) legal:

	#define x point.xpart
	#define	y point.ypart

	#include <math.h>

Thus, if <math.h> includes the line

	double pow(double x, double y);

the compiler will attempt to parse the expansion

	double pow(double point.xpart, double point.ypart);

which will give a syntax error.
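
The collision can be sketched in ordinary user code, with a made-up
function standing in for pow() and no real <math.h> involved (sum2,
point_sum and the layout of point are assumptions for illustration
only):

```c
struct { double xpart, ypart; } point = { 3.0, 4.0 };

#define x point.xpart
#define y point.ypart

/* With x and y defined as above, a prototype spelled
       double sum2(double x, double y);
   would be replaced by
       double sum2(double point.xpart, double point.ypart);
   and fail to parse.  Unnamed parameters give the macros
   nothing to rewrite. */
double sum2(double, double);

double sum2(double a, double b)
{
    return a + b;
}

/* The macros are harmless where an expression is expected. */
double point_sum(void)
{
    return sum2(x, y);  /* sum2(point.xpart, point.ypart) */
}

#undef x
#undef y
```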

However, the following is (apparently) illegal:

	#define _x point.x	/* this has file scope and is thus illegal */
	#define _y point.y	/* (at least, at this point) */

	#include <math.h>

hence <math.h> *could* include the line

	double pow(double _x, double _y);

Names such as `_a' (apparently) cannot exist with file scope% at the
time of a `#include some_standard_header', and names such as `_A'
(underscore followed by an uppercase letter or a second underscore)
are completely off-limits to users.

-----
% Maybe they can, provided they are not `#define' symbols, since
	static double _x() {
		return 3.14159265358979323/2.718281828459045235;
	}
	#include <math.h>
  seems unlikely to cause trouble.  But `#define x some,long,expr'
  is otherwise quite all right, but will discombobulate prototypes.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

rhg@cpsolv.UUCP (Richard H. Gumpertz) (11/28/89)

In article <11193@riks.csl.sony.co.jp> diamond@ws.sony.junet (Norman Diamond) writes:
>Page 93, line 17:  #include xstr(INCFILE(2).h)

Gee, my copy seems to say:
         Page 93, line 17:  #include xstr(INCFILE(2) .h)
which makes the case even more convincing that the space belongs in line
29.  Only on VERY careful examination of the text could I be convinced
that the extra space is not really there and that the ")." combination
just appears to have a space in it. 

Moby sigh.  Typography strikes again?

-- 
===============================================================================
| Richard H. Gumpertz rhg%cpsolv@uunet.uu.NET -or- ...uunet!amgraf!cpsolv!rhg |
| Computer Problem Solving, 8905 Mohawk Lane, Leawood, Kansas 66206-1749      |
===============================================================================

peter@ficc.uu.net (Peter da Silva) (11/29/89)

Doug, Norman. Chill out.

I can see problems on both sides of this argument. Norm is (for whatever
reason) having trouble making sense of parts of the standard. Doug is (for
whatever reason) having trouble seeing how Norm could be confused. The rest
of us see Norman asking what sound like silly questions, but not having access
to the latest version of the standard we can't say whether or not they're
really silly. Then Doug comes back and says "no, the standard is right". And
when he's posted relevant quotes he seems to be right. Unfortunately, he's
lately stopped doing that.

Doug: instead of flaming Norman, why not just post the relevant sections
and let the facts stand? We'd all gain from it. And maybe Norman has a
point... from my own experience with standards design, sometimes you need
to step back a ways to see holes in a document. After you've been too close
to it for too long, it becomes a lot more obvious than it really is.

I realise that after all this time any such holes should have come out in
public review, but can it \hurt/ to have another look?
-- 
`-_-' Peter da Silva <peter@ficc.uu.net> <peter@sugar.lonestar.org>.
 'U`  --------------  +1 713 274 5180.
"The basic notion underlying USENET is the flame."
	-- Chuq Von Rospach, chuq@Apple.COM

gwyn@smoke.BRL.MIL (Doug Gwyn) (11/29/89)

In article <20961@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>The following is (apparently---anyone who can supply text proving otherwise
>is welcome to follow up) legal:
>	#define x point.xpart
>	#define	y point.ypart
>	#include <math.h>
>Thus, if <math.h> includes the line
>	double pow(double x, double y);
>the compiler will attempt to parse the expansion
>	double pow(double point.xpart, double point.ypart);
>which will give a syntax error.

I believe that's a valid analysis, for implementations that process
<>-style #includes "the usual way".  However, conforming implementations
may implement standard headers by any of a number of techniques, some
of which would not involve applying externally-#defined macros (other
than NDEBUG) to the "contents" of the standard header.

For most implementations, however, it's a valid point to be taken into
consideration by the implementor.  There is an even subtler problem to
consider:

	#include <stdio.h>
	void foo() {
		int _iob;	/* not file scope, so not reserved */
		_iob = getchar();	/* getchar better not use _iob */
	}
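
The capture can be sketched without touching a real <stdio.h>.
Everything below is hypothetical: lib_state stands in for _iob and
LIB_NEXT() for a getchar implemented as a macro.  The names carry no
leading underscore so that the sketch itself stays out of the reserved
name space:

```c
/* Stand-in for a file-scope object declared by a header. */
int lib_state = 100;

/* Stand-in for a header macro that refers to that object by name. */
#define LIB_NEXT() (lib_state++)

int captured(void)
{
    int lib_state = 0;   /* block scope: hides the file-scope object */
    return LIB_NEXT();   /* expands to (lib_state++), the LOCAL one */
}

int not_captured(void)
{
    return LIB_NEXT();   /* no local declaration: file-scope object */
}
```

A macro of this kind silently operates on whatever the name resolves to
at the point of use, which is why an implementation's macros need names
that no block-scope declaration in a conforming program can hide.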

I think Sue Meloy of H-P called attention to this among many other
implementation notes in an issue of the Journal of C Language Translation.

bill@twwells.com (T. William Wells) (11/29/89)

In article <970@sdrc.UUCP> scjones@sdrc.UUCP (Larry Jones) writes:
: In article <18672@watdragon.waterloo.edu>, afscian@violet.waterloo.edu (Anthony Scian) writes:
: > What about the library prototypes that are coded "int foo( int x, int y )"
: > when they should be "int foo( int __x, int __y )"?
:
: Eh?  Why is the second any more correct than the first?  Since
: the argument names in a prototype only have prototype scope, they
: can't conflict with any other names in the program and therefore
: do not need leading underscores.

Consider macros. E.g.:

#define source "you're screwed"
#include <string.h>

and assume that string.h has things like:

extern char *strcpy(char *dest, const char *source);

You make it, instead,

extern char *strcpy(char *__dest, const char *__source);

and that way the user won't screw up your prototypes.

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

rhg@cpsolv.UUCP (Richard H. Gumpertz) (11/30/89)

Maybe the standard header files used in an implementation should read something
like the following:
	extern char *strcpy(char * /*dest*/, const char * /*source*/);
-- 
===============================================================================
| Richard H. Gumpertz rhg%cpsolv@uunet.uu.NET -or- ...uunet!amgraf!cpsolv!rhg |
| Computer Problem Solving, 8905 Mohawk Lane, Leawood, Kansas 66206-1749      |
===============================================================================

rex@aussie.UUCP (Rex Jaeschke) (12/01/89)

> In article <18672@watdragon.waterloo.edu>, afscian@violet.waterloo.edu (Anthony Scian) writes:
> > What about the library prototypes that are coded "int foo( int x, int y )"
> > when they should be "int foo( int __x, int __y )"?
> 
> Eh?  Why is the second any more correct than the first?  Since

I raised this in a paper at the Nashua, NH X3J11 meeting in 1988. It 
was agreed that the prototypes in the library section of the draft 
were NOT CONFORMING since they were in the user's namespace. The 
result was that I submitted words for the Rationale document 
explaining how they should really have been defined. See page 22 
bottom half of the Rationale, Nov '88.

For those of you without that document my examples contained something 
like the following (among other things):

	#define str *
	#include <string.h>

Consider that in string.h you have:

	size_t strlen(char *str);

After macro substitution, the prototype becomes:

	size_t strlen(char **);

and all your calls to strlen will fail because the types don't match. 
The bottom line is that you must not be able to rewrite a standard 
prototype from within a conforming program.
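
The rewrite can be observed in user code, with a made-up function
standing in for strlen (count_strings is an assumption for
illustration, not anything from <string.h>).  The definition below is
accepted only because the macro has already changed the declared
parameter type:

```c
#include <stddef.h>

#define str *

/* Written as:  size_t count_strings(char *str);
   After macro replacement the compiler sees:
                size_t count_strings(char * *);   */
size_t count_strings(char *str);

#undef str

/* Compatible with the prototype above only because the parameter
   type really did become char **.  Counts the entries of a
   NULL-terminated array of strings. */
size_t count_strings(char **list)
{
    size_t n = 0;

    while (list[n] != NULL)
        n++;
    return n;
}
```

Without the #define, the same two declarations would be rejected as
having conflicting types.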

Rex

----------------------------------------------------------------------------
Rex Jaeschke     |  Journal of C Language Translation  | C Users Journal
(703) 860-0091   |        2051 Swans Neck Way          | DEC PROFESSIONAL
uunet!aussie!rex |     Reston, Virginia 22091, USA     | Programmers Journal
----------------------------------------------------------------------------
Convener of the Numerical C Extensions Group (NCEG)
----------------------------------------------------------------------------

mcgrath@paris.Berkeley.EDU (Roland McGrath) (12/02/89)

In article <458@cpsolv.UUCP> rhg@cpsolv.UUCP (Richard H. Gumpertz) writes:

   Maybe the standard header files used in an implementation should read
   something like the following:
	   extern char *strcpy(char * /*dest*/, const char * /*source*/);

They can be:
	extern char *strcpy(char *__dest, const char *__source);
with no problems, since __foo is in the implementation's namespace.
--
	Roland McGrath
	Free Software Foundation, Inc.
roland@ai.mit.edu, uunet!ai.mit.edu!roland

steve@groucho.ucar.edu (Steve Emmerson) (12/02/89)

mcgrath@paris.Berkeley.EDU (Roland McGrath) writes:

>They can be:
>	extern char *strcpy(char *__dest, const char *__source);
>with no problems, since __foo is in the implementation's namespace.

True.  But what about the rest of us (i.e. non-implementors).  Should
we begin to "comment out" argument names in prototypes, or use
single leading underscores, or just hope for the best (with or without
documentation)?

--Steve Emmerson	steve@unidata.ucar.edu

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/02/89)

In article <5522@ncar.ucar.edu> steve@groucho.ucar.edu (Steve Emmerson) writes:
>True.  But what about the rest of us (i.e. non-implementors).  Should
>we begin to "comment out" argument names in prototypes, or use
>single leading underscores, or just hope for the best (with or without
>documentation)?

C application developers have to deal with name space conflicts SOMEhow.
You should NOT start using _names, because those are reserved for C
implementations.  You can use any names that you're sure will not be
#defined before your header is included.  What those names are depends
on the name space partitioning rules you have adopted.  If you use the
"package prefix" notion I've described in previous postings, then names
incorporating the header's package prefix would be wise.
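
A sketch of how that might look for an application header, assuming a
hypothetical package whose prefix is "geo" (the header contents,
geo_scale and client_answer are all made up for illustration):

```c
/* A client macro that happens to be defined before the header. */
#define value 42

/* --- contents of a hypothetical "geo.h" --- */
#ifndef GEO_H
#define GEO_H
/* Parameter names carry the package prefix, so a client's
   "value" or "factor" macro cannot rewrite them. */
double geo_scale(double geo_value, double geo_factor);
#endif

/* --- contents of a hypothetical "geo.c" --- */
double geo_scale(double geo_value, double geo_factor)
{
    return geo_value * geo_factor;
}

/* --- client code: the macro still works where it is meant to --- */
int client_answer(void)
{
    return (int)geo_scale(value, 0.5);  /* geo_scale(42, 0.5) */
}

#undef value
```

This only protects against clients who respect the partitioning rule;
a client that #defines geo_value itself has broken the convention.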