[comp.lang.c] Non-compiled source text?

minow@mountn.dec.com (Martin Minow) (12/15/90)

A compiler I occasionally use rejected the following source text:

	#if 0
	    this doesn't work
	#endif

complaining of an unterminated character constant.  My reading of the ANSI
standard would permit this (and the compilers I generally use have no
trouble with it).  According to a developer (of a "working" compiler)
I contacted, Section 3.8 says that text between #if and #endif must
consist of "preprocessor tokens."  The last item on the list of preprocessor
tokens in 3.1 is "each non-white-space character that cannot be one
of the above."

The intent of the #if ... #endif is to "skip sections of source files
conditionally" -- this is what I'm doing.  What is the real intent
of the standard?

Martin Minow
minow@bolt.enet.dec.com

steve@taumet.com (Stephen Clamage) (12/16/90)

minow@mountn.dec.com (Martin Minow) writes:

>A compiler I occasionally use rejected the following source text:

>	#if 0
>	    this doesn't work
>	#endif

>complaining of an unterminated character constant.  My reading of the ANSI
>standard would permit this (and the compilers I generally use have no
>trouble with it).

Read the standard again.  In sections 3.8 and 3.8.1 it is made explicit
that #ifdef and #ifndef must be followed by an identifier and then a
newline (other white space excepted).  "#if 0" is a syntax error.
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

henry@zoo.toronto.edu (Henry Spencer) (12/17/90)

In article <2061@mountn.dec.com> minow@bolt.enet.dec.com (Martin Minow) writes:
>... text between #if and #endif must
>consist of "preprocessor tokens."  The last item on the list of preprocessor
>tokens in 3.1 is "each non-white-space character that cannot be one
>of the above."

You have omitted an important fine point:  if ' or " occurs in such a
context that it can only fall under that last item, the effect is undefined.
So a compiler is entitled to reject such text.

>The intent of the #if ... #endif is to "skip sections of source files
>conditionally" -- this is what I'm doing.  What is the real intent
>of the standard?

To skip sections of *source* files conditionally.  There is no implication
that the skipped sections can contain arbitrary trash.  The C preprocessor
is not a general-purpose macro processor.
-- 
"The average pointer, statistically,    |Henry Spencer at U of Toronto Zoology
points somewhere in X." -Hugh Redelmeier| henry@zoo.toronto.edu   utzoo!henry
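
For readers who want free-form prose inside a skipped group without
leaning on undefined behavior, one option is to wrap the prose in a
comment.  Comments are recognized and replaced by a single space in
translation phase 3, so an apostrophe inside one never has to form a
preprocessing token.  The following is only a sketch; the function names
disabled_value and enabled_value are made up for illustration.

```c
#if 0
    /* this doesn't work: the apostrophe sits inside a comment, so the
     * skipped group still consists of well-formed preprocessing tokens */
    int disabled_value(void) { return 0; }
#endif

/* Only this definition survives preprocessing. */
int enabled_value(void) { return 42; }
```

Calling enabled_value() from any translation unit that sees this
definition returns 42; disabled_value is never declared.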

bhoughto@pima.intel.com (Blair P. Houghton) (12/17/90)

In article <542@taumet.com> steve@taumet.com (Stephen Clamage) writes:
>minow@mountn.dec.com (Martin Minow) writes:
>>A compiler I occasionally use rejected the following source text:
>>	#if 0
>>	    this doesn't work
>>	#endif
>>complaining of an unterminated character constant.  My reading of the ANSI
>>standard would permit this (and the compilers I generally use have no
>>trouble with it).
>
>Read the standard again.  In section 3.8 and 3.8.1 it is made explicit
>that #ifdef and #ifndef must be followed by an identifier and then a
       ^^^^^      ^^^^^^
These

>newline (other white space excepted).  "#if 0" is a syntax error.
					  ^^
			      are not this

The line

	#if 0

is just fine.

However, the line

	this doesn't work

contains a single-quote character beginning a
character constant that contains a newline,
in violation of the definition of a character constant.

The relevant citation is (ANSI X3.159-1989, section
3.1.3.4, p. 29, ll. 10-19).

Specifically, line 18 prohibits newlines.

Henry indicated that what's between the conditional
directives isn't necessarily allowed to be any more bogus
than what's before and after them; however, sec. 3.8.1, p.
88, ll. 20-24, seem to indicate that lines not to be
included are skipped.  But, sec. 3.8, p. 87, l. 12, uses
the phrase "The implementation can process and skip
sections of source files conditionally," which makes it all
ambiguous.  Can it process before skipping? Or skip without
processing? Or what?

				--Blair
				  "Can it, really?"

henry@zoo.toronto.edu (Henry Spencer) (12/18/90)

In article <1422@inews.intel.com> bhoughto@pima.intel.com (Blair P. Houghton) writes:
>Henry indicated that what's between the conditional
>directives isn't necessarily allowed to be any more bogus
>than what's before and after them; however, sec. 3.8.1, p.
>88, ll. 20-24, seem to indicate that lines not to be
>included are skipped...

However, before you even think about reading 3.8, you should read 2.1.1.2,
"Translation Phases".  The transformation from input text to preprocessor
tokens happens in phase 3, before anything in 3.8 is applicable... so it
really is required that the whole file be representable as pp-tokens.
It's not until phase 4 that #if and friends get executed.
-- 
"The average pointer, statistically,    |Henry Spencer at U of Toronto Zoology
points somewhere in X." -Hugh Redelmeier| henry@zoo.toronto.edu   utzoo!henry
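
The phase ordering described in this post can be illustrated with a
small scan that looks at a source line the way a phase-3 tokenizer
might, with no knowledge of any enclosing #if.  This is a rough sketch
under stated assumptions, not how any particular compiler works: the
helper name has_unterminated_quote is hypothetical, it handles one
logical line at a time, and it ignores header names and comments that
begin on an earlier line.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if `line` (one logical source line,
 * without its newline) opens a character constant or string literal
 * that is not closed before the end of the line, and 0 otherwise. */
int has_unterminated_quote(const char *line)
{
    size_t i = 0;

    while (line[i] != '\0') {
        if (line[i] == '/' && line[i + 1] == '*') {
            /* Skip a comment; quotes inside it are not tokens. */
            i += 2;
            while (line[i] != '\0' &&
                   !(line[i] == '*' && line[i + 1] == '/'))
                i++;
            if (line[i] == '\0')
                return 0;       /* comment continues on a later line */
            i += 2;
        } else if (line[i] == '\'' || line[i] == '"') {
            char quote = line[i++];

            while (line[i] != '\0' && line[i] != quote) {
                if (line[i] == '\\' && line[i + 1] != '\0')
                    i++;        /* step over the escaped character */
                i++;
            }
            if (line[i] == '\0')
                return 1;       /* end of line before the closing quote */
            i++;                /* step over the closing quote */
        } else {
            i++;
        }
    }
    return 0;
}
```

Under these assumptions, has_unterminated_quote("    this doesn't work")
returns 1 whether or not the line sits between #if 0 and #endif, which
is the point about phase 3 running before phase 4.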

thorinn@rimfaxe.diku.dk (Lars Henrik Mathiesen) (12/19/90)

bhoughto@pima.intel.com (Blair P. Houghton) writes:
>>minow@mountn.dec.com (Martin Minow) writes:
>>>	#if 0
>>>	    this doesn't work
>>>	#endif

>However, the line

>	this doesn't work

>contains a single-quote character beginning a
>character constant that contains a newline,
>in violation of the definition of a character constant.

>The relevant citation is (ANSI X3.159-1989, section
>3.1.3.4, p. 29, ll. 10-19).

>Specifically, line 18 prohibits newlines.

Agreed, character constants can't contain newlines. Therefore, the
single-quote properly falls within the category of preprocessing-token
mentioned by another poster: "Each non-white-space character that
cannot be one of the above" (``the above'' are header-name,
identifier, pp-number, character-constant, string-literal, operator
and punctuator). The behaviour when that happens to a single- or
double-quote is explicitly undefined.

Because the behaviour is undefined, a conforming implementation might
choose to allow newlines in character-constants. But since it violates
the syntax rule you quote, a diagnostic must be given anyway.

But a conforming implementation might also choose to make the
single-quote into a pp-token on its own. In the current example, it
will then be skipped in translation phase 4. If it was not,
translation phase 7 would attempt to convert it to a token, and the
constraint of section 3.1 would be violated, causing a diagnostic.
(This behaviour adds some complexity to the scanner, so I don't expect
to see many compilers implementing it.)

--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark      [uunet!]mcsun!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers.      thorinn@diku.dk
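
The second choice described in this post, making the stray quote a
pp-token on its own, might look something like the sketch below.  It is
one possible reading of that behaviour, not taken from any real
scanner; the names classify_quote, QUOTE_LITERAL and QUOTE_LONE are
invented for the example.

```c
#include <stddef.h>

enum quote_kind { QUOTE_LITERAL, QUOTE_LONE };

/* Hypothetical helper: `s` points at a ' or " character.  If a complete
 * character constant or string literal starts there and ends on the same
 * line, store its length in *len and return QUOTE_LITERAL.  Otherwise
 * treat the quote as a one-character pp-token of the "cannot be one of
 * the above" kind: store 1 in *len and return QUOTE_LONE. */
enum quote_kind classify_quote(const char *s, size_t *len)
{
    char quote = s[0];
    size_t i = 1;

    while (s[i] != '\0' && s[i] != '\n' && s[i] != quote) {
        if (s[i] == '\\' && s[i + 1] != '\0' && s[i + 1] != '\n')
            i++;                /* step over the escaped character */
        i++;
    }

    /* An empty '' is not a character constant, so it is excluded here. */
    if (s[i] == quote && !(quote == '\'' && i == 1)) {
        *len = i + 1;
        return QUOTE_LITERAL;
    }

    *len = 1;
    return QUOTE_LONE;
}
```

With this approach the text 't work yields a lone quote followed by
ordinary identifiers, all of which phase 4 can discard inside a skipped
group; only a lone quote that survives to phase 7 would need a
diagnostic.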