[comp.lang.c] 3B2 cpp #ifdef + #include

matt@iquery.UUCP (Matt Reedy) (03/22/88)

Has anyone else noticed this on the 3B2 (using C Programming Languages
Issue 4 Version 0)?

Code fragment:

#ifdef XXXXX
junk junk
#include junk
#endif

main ()
{
	return 0;
}

If it is compiled using 'cc -c -O junk.c', cpp says:

junk.c: 3: bad include syntax

Why is cpp trying to interpret the #include when the #ifdef is NOT true?
It doesn't complain about the 'junk junk' line, only the #include line!

-- 
Matthew Reedy				harvard!adelie!iquery!matt
Programmed Intelligence Corp.		(512) 822 8703
830 NE Loop 410, Suite 412		"just ONE MORE compile...."
San Antonio, TX  78209-1209

gandalf@csli.STANFORD.EDU (Juergen Wagner) (03/27/88)

The problem with your cpp is that all the lines starting with # have to
be valid syntax, i.e. "junk junk" is not relevant because it is just text
and cpp doesn't know if this is correct or not, but "# include junk" is
invalid because it violates cpp syntax.

BTW, I tried it on Suns, and there are no error messages.

-- 
Juergen Wagner,			           gandalf@csli.stanford.edu
Center for the Study of Language and Information (CSLI), Stanford CA

karl@haddock.ISC.COM (Karl Heuer) (03/30/88)

In article <3127@csli.STANFORD.EDU> gandalf@csli.stanford.edu (Juergen Wagner) writes:
>The problem with your cpp is that all the lines starting with # have to
>be valid syntax, i.e. "junk junk" is not relevant because it is just text
>and cpp doesn't know if this is correct or not, but "# include junk" is
>invalid because it violates cpp syntax.

"#include junk" is legal (in ANSI C, anyway) if "junk" is a macro that expands
into a quoted or embroketed header name.  Thus,
  #ifdef junk
  #include junk
  #endif
should be legal.  This is indeed a bug in certain preprocessors.  Similarly, I
believe that the warning produced by
  #ifdef vms
  #ifdef foo$bar /* a valid name, in VMS */
  #endif
  #endif
on non-VMS systems is inappropriate, though I haven't yet determined what the
latest dpANS says about this.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
Followups to comp.lang.c only.

davidsen@steinmetz.steinmetz.ge.com (William E. Davidsen Jr) (03/30/88)

In article <3210@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
| [...]
| should be legal.  This is indeed a bug in certain preprocessors.  Similarly, I
| believe that the warning produced by
|   #ifdef vms
|   #ifdef foo$bar /* a valid name, in VMS */
|   #endif
|   #endif
| on non-VMS systems is inappropriate, though I haven't yet determined what the
| latest dpANS says about this.

My notes from the Seattle X3J11 meeting at which identifiers were
defined state that "upper and lower case alphabetics, digits, and the
underscore character" may be used, and that "the first character may not
be a digit." Unless that idea has been changed, the $ is not a legal
character for an identifier. I believe that anything goes in a quoted
filename.

As a historical note, that section was written on the monorail in Seattle
after a working lunch...
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

karl@haddock.ISC.COM (Karl Heuer) (03/31/88)

In article <10171@steinmetz.steinmetz.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
>In article <3210@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>My notes from the Seattle X3J11 meeting at which identifiers were
>defined state that "upper and lower case alphabetics, digits, and the
>underscore character" may be used, and that "the first character may not
>be a digit." Unless that idea has been changed the $ is not a legal
>character for an identifier.

I've found the relevant statement in 3.8.1: "[when a group is skipped]
directives are processed only through the name that determines the directive
in order to keep track of the level of nested conditionals; the rest of the
directives' preprocessing tokens are ignored, as are the other preprocessing
tokens in the group."  I read this as saying that in my example,
   #ifdef __VMS
   #ifdef foo$bar
   #endif
   #endif
the non-VMS system is obliged to stop reading after the word "ifdef" in the
second line, and NOT complain about the "$".  This also justifies my previous
statement that "#include junk" should not have provoked an error.
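
A minimal sketch of what that reading of 3.8.1 implies, assuming XXXXX is
not defined anywhere: everything inside the skipped group, including the
malformed "#include junk" and the "$" identifier, should be passed over
without a diagnostic. The function name skipped_group_compiled is
hypothetical and exists only so there is something to call.

```c
/* XXXXX is assumed to be undefined, so the whole group below is skipped. */
#ifdef XXXXX
junk junk
#include junk
#ifdef foo$bar /* nested conditional: only the word "ifdef" should matter */
#endif
#endif

/* Hypothetical helper: reaching this definition means the preprocessor
 * got past the skipped group without rejecting the file. */
int skipped_group_compiled(void)
{
    return 1;
}
```

A preprocessor that follows the quoted wording should accept this file
as-is; one that behaves as described for the 3B2 would be expected to stop
at the "#include junk" line.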

(Btw, since "$" isn't even a legal character in a strictly conforming program,
the VMS implementation can allow it to appear in identifiers (this is a Common
Extension, see A.6.5.2) and still be conforming.  I think.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

rcvie@tuvie (ELIN Forsch.z.) (03/31/88)

In article <10171@steinmetz.steinmetz.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
>In article <3210@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>   #ifdef foo$bar /* a valid name, in VMS */
>
>My notes from the Seattle X3J11 meeting at which identifiers were
>defined state that "upper and lower case alphabetics, digits, and the
>underscore character" may be used, and that "the first character may not
>be a digit." Unless that idea has been changed, the $ is not a legal
>character for an identifier. I believe that anything goes in a quoted
>filename.

Dollar signs are valid parts of identifiers in VAXC, as many routines
in the run-time library use them (and the routines have to be accessible
from VAXC too). Thus these names are normally *hidden* by `#ifdef vms'
constructs if portability is needed (as are other system-dependent things).
No one would really expect cpp to complain about them.

K&R seem to have had this point of view, although without stating it clearly:

"If the checked condition is true then any lines between #else and #endif
 are ignored. If the checked condition is false then any lines between
 the test and an #else or, lacking an #else, the #endif, are ignored.
	These constructions may be nested."

ANSI, however, states explicitly:

"Each directive's condition is checked in order. If it evaluates to
 false (zero), the group that it controls is skipped: directives are
 processed only through the name that determines the directive to keep
 track of the level of nested conditionals; the rest of the directives'
 preprocessing tokens are ignored, as are the other preprocessing tokens
 in the group. ..."

		Dietmar Weickert,
			ALCATEL-ELIN Research Center, Vienna, Austria.

les@chinet.UUCP (Leslie Mikesell) (04/01/88)

>| believe that the warning produced by
>|   #ifdef vms
>|   #ifdef foo$bar /* a valid name, in VMS */
>|   #endif
>|   #endif


In the case of the 3B2 compiler, this will give a
 "bad include syntax"
error regardless of the contents of the filename.  Anything not
inside of quotes or <>'s will give this error (in spite of the 
fact that it is inside of a false #ifdef).  Does VMS require the
filename to be unquoted?
  Les Mikesell

karl@haddock.ISC.COM (Karl Heuer) (04/02/88)

In article <4419@chinet.UUCP> les@chinet.UUCP (Leslie Mikesell) writes:
>>|   #ifdef vms
>>|   #ifdef foo$bar /* a valid name, in VMS */
>>|   #endif
>>|   #endif
>
>In the case of the 3B2 compiler, this will give a "bad include syntax" error
>regardless of the contents of the filename.  Anything not inside of quotes or
><>'s will give this error (in spite of the fact that it is inside of a false
>#ifdef).  Does VMS require the filename to be unquoted?

As I indicated in a recent posting, the fact that the SVR3 cpp pays attention
within a false #ifdef must be considered a bug.  But you misread what you
quoted; that was an "#ifdef", not an "#include".  It was complaining about the
"$" in the identifier passed to "#ifdef".

There *is* another bug of the same nature, where SVR3 cpp complains about
"#include junk".  This is independent of VMS: it's quite legal (in ANSI C,
anyway) to say "#define junk <stdio.h>" and then "#include junk".
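
A small sketch of the macro-expanded form, under the assumption of an ANSI
preprocessor. The macro name HEADER_NAME and the function
header_was_included are made up for illustration, and <string.h> is used
in place of <stdio.h> only so that the result is easy to check.

```c
/* HEADER_NAME is a hypothetical macro; it expands to a bracketed header name. */
#define HEADER_NAME <string.h>
#include HEADER_NAME

/* Hypothetical helper: calls strlen() so the file only compiles cleanly
 * if the macro-expanded #include really pulled in <string.h>. */
unsigned long header_was_included(void)
{
    return (unsigned long)strlen("cpp");
}
```

If the directive is accepted, header_was_included() returns the length of
the string "cpp", which is 3.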

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

ekb@ho7cad.ATT.COM (Eric K. Bustad) (04/02/88)

In article <4419@chinet.UUCP>, les@chinet.UUCP (Leslie Mikesell) asks:
> Does VMS require the filename [in #include - ekb] to be unquoted?

According to my "Programming in VAX-11 C" manual, if you want it to search
a "text library", then yes!  I considered this pretty stupid.  When we ported
a large amount of C code from VMS to UNIX, one of the first things we had
to do was to change all of the "#include stdio"'s to "#include <stdio.h>"s.
This may have been fixed in later releases.

= ERic

henry@utzoo.uucp (Henry Spencer) (04/02/88)

> I've found the relevant statement in 3.8.1: "[when a group is skipped]
> directives are processed only through the name that determines the directive
> in order to keep track of the level of nested conditionals; the rest of the
> directives' preprocessing tokens are ignored...
> the non-VMS system is obliged to stop reading after the word "ifdef" in the
> second line, and NOT complain about the "$"...

Ah, but if the $ is not in the C character set the compiler is willing to
accept (outside comments and strings), then $ is arguably *not* a
preprocessing token at all!  Section 2.2.1 makes the compiler's action
in this case undefined, so the compiler is not wrong to complain.

(Note that I say "arguably"; there is some hairsplitting that could be
invoked here as a result of some fuzzy wording.)
-- 
"Noalias must go.  This is           |  Henry Spencer @ U of Toronto Zoology
non-negotiable."  --DMR              | {allegra,ihnp4,decvax,utai}!utzoo!henry

sarge@sham.Berkeley.EDU (Steven Sargent) (04/02/88)

In article <109@iquery.UUCP>, matt@iquery.UUCP (Matt Reedy) writes:
> Has anyone else noticed this on the 3B2 (using C Programming Languages
> Issue 4 Version 0)?
> 
> Code fragment:
> 
> #ifdef XXXXX
> junk junk
> #include junk
> #endif
> 
> main ()
> {
> 	return 0;
> }
> 
> If it is compiled using 'cc -c -O junk.c', cpp says:
> 
> junk.c: 3: bad include syntax
> 
> Why is cpp trying to interpret the #include when the #ifdef is NOT true?
> It doesn't complain about the 'junk junk' line, only the #include line!
> 
> -- 
> Matthew Reedy				harvard!adelie!iquery!matt
> Programmed Intelligence Corp.		(512) 822 8703
> 830 NE Loop 410, Suite 412		"just ONE MORE compile...."
> San Antonio, TX  78209-1209

Ain't just 3b2, y'know: anybody using C 4.0 gets the same "benefit."
I have to compile on UNIX and VMS, and so ran into this myself.
Apparently TPC is running a full-employment program, and rather than
building roads and dams, they keep busy by making annoying changes
to cpp... here's another one:

#ifdef vms
do$vms$stuff$with$lots$of$signs
#else vms
do something else
#endif vms

The new improved cpp prints warnings about "tokens after directive
(ignored)" after the #else and #endif lines.  Not fatal to the
compilation, but distracting to look at, and no help.  (Yes, I can
sed my source and fix the damn things -- certainly
#endif /* vms */
is an acceptable solution.  But why do they bother?)

S.
---
"I'm sorry... you must have me confused with some other plate-lipped
white girl named Irene."
				-- Good Girls #2

Steven Sargent
ARPA Internet: sarge@scam.berkeley.edu
MILnet: sarge%scam.berkeley.edu@ucbvax.berkeley.edu
TPCnet: {anywhere at all, really}!ucbvax!scam!sarge

matt@iquery.UUCP (Matt Reedy) (04/02/88)

>   #ifdef vms
>   #ifdef foo$bar /* a valid name, in VMS */
>   #endif
>   #endif
> on non-VMS systems is inappropriate, though I haven't yet determined what the
> latest dpANS says about this.
> 
This is exactly the problem which prompted my first posting.  I have some
code which I want to be portable across many systems - VMS being one of them.
So, the code looks like this:

	#ifdef VMS
	#include file
	#endif

where '#include file' is valid Vax-C syntax ('file' is the name of a module
in a standard text library).  The problem is that the 3B2 chokes on that
line, even though VMS is NOT defined.  Oh well...


-- 
Matthew Reedy				harvard!adelie!iquery!matt
Programmed Intelligence Corp.		(512) 822 8703
830 NE Loop 410, Suite 412		"just ONE MORE compile...."
San Antonio, TX  78209-1209

rbutterworth@watmath.waterloo.edu (Ray Butterworth) (04/03/88)

In article <2023@pasteur.Berkeley.Edu>, sarge@sham.Berkeley.EDU (Steven Sargent) writes:
> #else vms
> do something else
> #endif vms
> 
> The new improved cpp prints warnings about "tokens after directive
> (ignored)" after the #else and #endif lines.  Not fatal to the
> compilation, but distracting to look at, and no help.  (Yes, I can
> sed my source and fix the damn things -- certainly
> #endif /* vms */
> is an acceptable solution.  But why do they bother?)

They bother because it is nice to be warned if you entered something
that looks like it means something but doesn't.  There is nothing
worse (well not really) than software that silently ignores input that
it doesn't understand.

A better example would be:
    #ifdef vms vax
Did you really mean:
    #if defined(vms) && defined(vax)
or perhaps:
    #if defined(vms) || defined(vax)
In fact, most CPPs take it to mean:
    #ifdef vms /*vax*/

I for one certainly appreciate it when CPP warns me that it has
ignored something that I didn't explicitly mark as a comment.
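
The three readings of "#ifdef vms vax" can be told apart with a sketch like
the following. DEMO_VMS and DEMO_VAX are hypothetical stand-ins for vms and
vax, and the function names are invented for the example; only DEMO_VMS is
defined here.

```c
/* Hypothetical feature macros: DEMO_VMS is defined, DEMO_VAX is not. */
#define DEMO_VMS 1

/* Reading 1: both names must be defined. */
int both_defined(void)
{
#if defined(DEMO_VMS) && defined(DEMO_VAX)
    return 1;
#else
    return 0;
#endif
}

/* Reading 2: either name is enough. */
int either_defined(void)
{
#if defined(DEMO_VMS) || defined(DEMO_VAX)
    return 1;
#else
    return 0;
#endif
}

/* Reading 3: the second name is treated as if it were a comment. */
int first_only(void)
{
#ifdef DEMO_VMS /* DEMO_VAX */
    return 1;
#else
    return 0;
#endif
}
```

With only DEMO_VMS defined, both_defined() yields 0 while either_defined()
and first_only() yield 1, so a silently ignored second name can change
which code is compiled.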

Similarly, how would you like it if on some system
prompt-> mail jerome laurence maurice
only sent mail to the first name on the list and silently
ignored Larry and Moe?
Designing a program that only looks at the first argument
and silently ignores the rest would be really silly, right?


Also, as a matter of style,
   #ifdef vax       or #if defined(vax)
   #else /*vax*/
   #endif /*vax*/
isn't nearly as good as
   #ifndef vax      or #if !defined(vax)
   #else /*vax*/
   #endif /*vax*/
If the "if" and the "end" are far apart, the comments around the
   #else /*vax*/
   ... machine specific code ...
   #endif /*vax*/
certainly make it look to someone that has to maintain your code
that the enclosed section is the code intended for use when "vax"
IS defined, not when it ISN'T defined.
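
A sketch of the preferred layout, assuming a hypothetical macro DEMO_VAX in
place of vax and an invented function name. Because the test is #ifndef,
the code between the commented #else and #endif is the code used when the
macro IS defined, which is what the comments suggest to a reader.

```c
/* DEMO_VAX is a hypothetical macro, defined here so the #else branch is used. */
#define DEMO_VAX 1

/* Hypothetical helper: returns 1 from the machine-specific branch,
 * 0 from the generic one. */
int uses_vax_branch(void)
{
#ifndef DEMO_VAX
    return 0;   /* generic code */
#else  /* DEMO_VAX */
    return 1;   /* machine-specific code */
#endif /* DEMO_VAX */
}
```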

lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) (04/03/88)

In article <17983@watmath.waterloo.edu> rbutterworth@watmath.waterloo.edu (Ray Butterworth) writes:
>> 
>They bother because it is nice to be warned if you entered something
>that looks like it means something but doesn't.  There is nothing
>worse (well not really) than software that silently ignores input that
>it doesn't understand.

You are right of course.  This is such a widely used idiom
though that cpp should be changed to allow it.  I heard ATT-IS
(or whatever it's called these days) changed cpp to allow tokens
after the closing # directives as long as it (they?) was an
exact match with the corresponding opening # directive.  But,
incredibly, it would complain about comments after the # directives!
This is so stupid!  Maybe I should sell my AT&T stock.  Sigh.
Is anyone at ATT-IS listening that would care to admit to
this botch if it is true?
-- 
Larry Cipriani, AT&T Network Systems and Ohio State University
Domain: lvc@tut.cis.ohio-state.edu
Path: ...!cbosgd!osu-cis!tut.cis.ohio-state.edu!lvc (weird but right)

karl@haddock.ISC.COM (Karl Heuer) (04/05/88)

In article <125@iquery.UUCP> matt@iquery.UUCP (Matt Reedy) writes:
>'#include file' [without quotes or brokets] is valid Vax-C syntax ('file' is
>the name of a module in a standard text library).

Stupid of them.  There's no reason they couldn't use the same syntax as the
rest of the world (viz. #include <stdio.h>), and still have the "text library"
be part of the default search path.  Oh well, if they want to be ANSI, they'll
fix it.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint