[comp.lang.c] Preprocessor question

ray@ole.UUCP (Ray Berry) (01/19/89)

   Here's an easy question regarding preprocessor string-izing and subsequent
  rescanning...

#define VAL 3
#define STR(x) #x
   .
   .
  puts("val is " STR(VAL));
- - - - - - - -
   Turbo C produces "3" for STR(VAL); Microsoft produces "VAL" and 
leaves it at that.  
   Which behavior is correct?  Further, if Microsoft's is correct, how can
one parenthesize a #define'd value to turn it into a character string?

-- 
Ray Berry  KB7HT uucp: ...{uw-beaver|uiucuxc}tikal!ole!ray CS: 73407,3152 
Seattle Silicon Corp. 3075 112th Ave NE. Bellevue WA 98004 (206) 828 4422

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/19/89)

In article <531@ole.UUCP> ray@ole.UUCP (Ray Berry) writes:
>#define VAL 3
>#define STR(x) #x
>  puts("val is " STR(VAL));
>   Turbo C produces "3" for STR(VAL); Microsoft produces "VAL" and 
>leaves it at that.  
>   Which behavior is correct?

"Before being substituted, each argument's preprocessing tokens are
completely macro replaced as if they formed the rest of the
translation unit; no other preprocessing tokens are available."

So VAL is first expanded to 3 then the #3 is performed to produce
"3".

curtw@hpcllca.HP.COM (Curt Wohlgemuth) (01/20/89)

ray@ole.UUCP (Ray Berry) writes:
> 
>    Here's an easy question regarding preprocessor string-izing and subsequent
>   rescanning...
> 
> #define VAL 3
> #define STR(x) #x
>    .
>    .
>   puts("val is " STR(VAL));
> - - - - - - - -
>    Turbo C produces "3" for STR(VAL); Microsoft produces "VAL" and 
> leaves it at that.  
>    Which behavior is correct?  Further, if Microsoft's is correct, how can
> one parenthesize a #define'd value to turn it into a character string?

According to ANSI, 'VAL' is substituted into the 'STR' macro, gets quoted,
and then is NOT subject to macro replacement as a result.

If you want to "parenthesize a #define'd value to turn it into a 
character string", try this:


#include <stdio.h>

#define VAL 3
#define STR(x) #x
#define XSTR(x) STR(x)

int main(void)
{
   puts("val is " XSTR(VAL));
   return 0;
}


Now after VAL is substituted into XSTR, it has not yet been quoted, so
it is subject to macro replacement.  It appears that the Microsoft
compiler is ANSI-conforming here, whereas the Turbo C one is not.

scjones@sdrc.UUCP (Larry Jones) (01/20/89)

In article <531@ole.UUCP>, ray@ole.UUCP (Ray Berry) writes:
> 
> #define VAL 3
> #define STR(x) #x
> - - - - - - - -
>    Turbo C produces "3" for STR(VAL); Microsoft produces "VAL" and 
> leaves it at that.  
>    Which behavior is correct?  Further, if Microsoft's is correct, how can
> one parenthesize a #define'd value to turn it into a character string?

Microsoft is correct (Hooray!!!); the stringize operator
stringizes the actual argument.  To stringize the replacement you
just need one more level of indirection:

	#define VAL 3
	#define STR(x) #x
	#define REPSTR(x) STR(x)

Now when you write REPSTR(VAL) it gets replaced with STR(3) which
is rescanned and replaced with "3".

----
Larry Jones                         UUCP: uunet!sdrc!scjones
SDRC                                      scjones@sdrc.UU.NET
2000 Eastman Dr.                    BIX:  ltl
Milford, OH  45150                  AT&T: (513) 576-2070
"When all else fails, read the directions."

gsf@ulysses.homer.nj.att.com (Glenn Fowler[eww]) (01/20/89)

In article <9433@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
> In article <531@ole.UUCP> ray@ole.UUCP (Ray Berry) writes:
> >#define VAL 3
> >#define STR(x) #x
> >  puts("val is " STR(VAL));
> >   Turbo C produces "3" for STR(VAL); Microsoft produces "VAL" and 
> >leaves it at that.  
> >   Which behavior is correct?
> 
> "Before being substituted, each argument's preprocessing tokens are
> completely macro replaced as if they formed the rest of the
> translation unit; no other preprocessing tokens are available."
> 
> So VAL is first expanded to 3 then the #3 is performed to produce
> "3".
unless things changed from the 86 draft, macro arguments subject to the
# and ## operators are not expanded before the # and ## operations, so
the above eventually produces:

	puts ( "val is VAL" ) ;

to get VAL expanded add one more macro level:

	#define STR(x)	#x
	#define XSTR(x)	STR(x)
	puts(STR(VAL) " is " XSTR(VAL));

to eventually get:

	puts ( "VAL is 3" ) ;
-- 
Glenn Fowler    (201)-582-2195    AT&T Bell Laboratories, Murray Hill, NJ
uucp: {att,decvax,ucbvax}!ulysses!gsf       internet: gsf@ulysses.att.com

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/21/89)

>> In article <531@ole.UUCP> ray@ole.UUCP (Ray Berry) writes:
>> >#define VAL 3
>> >#define STR(x) #x
>> >  puts("val is " STR(VAL));
>In article <9433@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
>> "Before being substituted, each argument's preprocessing tokens are
>> completely macro replaced as if they formed the rest of the
>> translation unit; no other preprocessing tokens are available."
>> So VAL is first expanded to 3 then the #3 is performed to produce
>> "3".
In article <11125@ulysses.homer.nj.att.com> gsf@ulysses.homer.nj.att.com (Glenn Fowler[eww]) writes:
>unless things changed from the 86 draft, macro arguments subject to the
># and ## operators are not expanded before the # and ## operations, ...

The problem is that the paragraph constituting Section 3.8.3.1 in the
final Draft has unclear binding of qualifications and things qualified.
With creative interpretation, it can be read either way.  Here it is in
its entirety (sentence numbers in left margin for reference):

[1]	After the arguments for the invocation of a function-like
	macro have been identified, argument substitution takes place.
[2]	A parameter in the replacement list, unless preceded by a # or
	## preprocessing token or followed by a ## preprocessing token
	(see below), is replaced by the corresponding argument after
[3]	all macros contained therein have been expanded.  Before being
	substituted, each argument's preprocessing tokens are completely
	macro replaced as if they formed the rest of the translation
	unit; no other preprocessing tokens are available.

The "(see below)" in sentence [2] refers to the sections that describe
the behavior of the # and ## operators.  Particularly relevant here is

[4]	If, in the replacement list, a parameter is immediately preceded
	by a # preprocessing token, both are replaced by a single
	character string literal preprocessing token that contains the
	spelling of the preprocessing token sequence for the
	corresponding argument.

Now, my reading of Section 3.8.3.1 [1] and [3] is that IN ALL CASES the
argument tokens are macro replaced before substitution, and my reading
of [2] is that the # and ## situations call for actions other than simply
"plugging in" the result of the argument macro expansion.  Sentences [1]
and [3] do NOT indicate "Except as noted in sentence [2], ...".

I can see how you might read sentence [2] as saying that in # and ##
context the arguments are not first macro-expanded, and the use of
xstr(s) in the complex example in Section 3.8.3.5 tends to reinforce
that notion.  Unfortunately there is not an unambiguous example given
for a case like the one under discussion.  I'm sure the Redactor has an
opinion about what was intended here, but then so do the authors of
Turbo C and Microsoft C, and so far we've seen both interpretations.
If Bob Jervis and Dave Prosser both agree on how this is supposed to be
interpreted, then I'll bow to their opinion; they're pretty much the
X3J11 preprocessing gurus.  (If I've been misinterpreting this, then
I'd also recommend that ANSI C tutorials emphasize this item, because
it sure is easy to read the Standard the other way!)

johnny@edvvie.at (Johann Schweigl) (10/08/89)

How can I replace a text token by itself, along with some additional text?
Example: 
EXEC SQL select * from emp;     __curline = 23; EXEC SQL select * from emp;
before ^^^^^^^^^^^^^^^^^^^^     after ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

#define EXEC __curline = __LINE__; EXEC 
is not the solution, cpp would try to resolve EXEC recursively
I know, sed is the most natural solution, but I want to do all by including
a single file. ANSI is not spoken on my system.
Any help appreciated.
Thanks, johnny
-- 
This does not reflect the   | Johann  Schweigl | DOS?
opinions of my employer.    | johnny@edvvie.at | Kind of complicated
I am busy enough by talking |                  | bootstrap loader ...
about my own ...            |   EDVG  Vienna   | 

piet@cs.ruu.nl (Piet van Oostrum) (10/10/89)

In article <174@eliza.edvvie.at>, johnny@edvvie (Johann Schweigl) writes:
 `How can I replace a text token by itself, along with some additional text?
 `Example: 
 `EXEC SQL select * from emp;     __curline = 23; EXEC SQL select * from emp;
 `before ^^^^^^^^^^^^^^^^^^^^     after ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 `
 `#define EXEC __curline = __LINE__; EXEC 
 `is not the solution, cpp would try to resolve EXEC recursively

If your preprocessor is non-ANSI, you might try:

	#define EXEC __curline = __LINE__; EX/**/EC 

or something similar. Note that this is very unportable.
-- 
Piet van Oostrum, Dept of Computer Science, University of Utrecht
Padualaan 14, P.O. Box 80.089, 3508 TB Utrecht,  The Netherlands.
Telephone: +31-30-531806      Internet: piet@cs.ruu.nl
Telefax:   +31-30-513791      Uucp: uunet!mcsun!hp4nl!ruuinf!piet

ok@cs.mu.oz.au (Richard O'Keefe) (10/10/89)

In article <174@eliza.edvvie.at>, johnny@edvvie (Johann Schweigl) writes:
: How can I replace a text token by itself, along with some additional text?
: Example: 
: EXEC SQL select * from emp;     __curline = 23; EXEC SQL select * from emp;
: before ^^^^^^^^^^^^^^^^^^^^     after ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
: 
: #define EXEC __curline = __LINE__; EXEC 
: is not the solution, cpp would try to resolve EXEC recursively

Well, according to gcc, it _is_ the solution.  When I do
	% cat >zabbo.c <<end_of_file.
	#define EXEC __curline = __LINE__; EXEC
	EXEC SQL select * from emp;
	end_of_file.
	% gcc -ansi -pedantic -E zabbo.c
I get the output
	__curline = 2; EXEC  SQL select * from emp;
My understanding is that ANSI macro expansion explicitly checks for
recursion and blocks it, but my draft is over a year out of date and
in another country, so I'll just say "works in gcc".

dfp@cbnewsl.ATT.COM (david.f.prosser) (10/10/89)

In article <174@eliza.edvvie.at> johnny@edvvie.at (Johann Schweigl) writes:
 >How can I replace a text token by itself, along with some additional text?
 >Example: 
 >EXEC SQL select * from emp;     __curline = 23; EXEC SQL select * from emp;
 >before ^^^^^^^^^^^^^^^^^^^^     after ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 >
 >#define EXEC __curline = __LINE__; EXEC 
 >is not the solution, cpp would try to resolve EXEC recursively
 >I know, sed is the most natural solution, but I want to do all by including
 >a single file. ANSI is not spoken on my system.

With ANSI C, your example macro definition will work as you desire.

Dave Prosser	...not an official X3J11 answer...

schaefer@ogccse.ogc.edu (Barton E. Schaefer) (10/10/89)

In article <1679@ruuinf.cs.ruu.nl> piet@cs.ruu.nl (Piet van Oostrum) writes:
} In article <174@eliza.edvvie.at>, johnny@edvvie (Johann Schweigl) writes:
}  `How can I replace a text token by itself, along with some additional text?
}  `EXEC SQL select * from emp;     __curline = 23; EXEC SQL select * from emp;
}  `before ^^^^^^^^^^^^^^^^^^^^     after ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
} If your preprocessor is non-ANSI, you might try:
} 
} 	#define EXEC __curline = __LINE__; EX/**/EC 

This doesn't quite work; cpp apparently re-evaluates the token as soon
as it has stripped the /**/, so it becomes recursive.  To completely
fool cpp, you must use the horrible

	#define EXEC __curline = __LINE__; EX//**/**/**//EC 

} Note that this is very unportable.

No kidding.
-- 
Bart Schaefer      "A Yellowbeard is never so dangerous as when he's dead."
                                              -- Graham Chapman, 1941-1989
CSNET / Internet                schaefer@cse.ogc.edu
UUCP                            ...{sequent,tektronix,verdix}!ogccse!schaefer

leo@philmds.UUCP (Leo de Wit) (10/12/89)

In article <1679@ruuinf.cs.ruu.nl> piet@cs.ruu.nl (Piet van Oostrum) writes:
|In article <174@eliza.edvvie.at>, johnny@edvvie (Johann Schweigl) writes:
| `How can I replace a text token by itself, along with some additional text?
| `Example: 
| `EXEC SQL select * from emp;     __curline = 23; EXEC SQL select * from emp;
| `before ^^^^^^^^^^^^^^^^^^^^     after ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| `
| `#define EXEC __curline = __LINE__; EXEC 
| `is not the solution, cpp would try to resolve EXEC recursively
|
|If your preprocessor is non-ANSI, you might try:
|
|	#define EXEC __curline = __LINE__; EX/**/EC 
|
|or something similar. Note that this is very unportable.

Agreed; it does not work e.g. in Ultrix 2.x 8-) (recursive macro).

For the original poster: seems you want to know the current line of an
embedded SQL statement; if you happen to use Oracle as your DBMS, you
can include oraca.h and inspect the value of oraca.oraslnr (this is the
line number in the original source code, not the precompiled one).

    Leo.

barrett@jhunix.HCF.JHU.EDU (Dan Barrett) (02/27/91)

	GCC's preprocessor doesn't like this code:

		#include <ctype.h>
		#define ARGS		(x)

		main()
		{
			...	isalpha ARGS ...
		}

The linker complains that _isalpha is not known.  According to H&S and K&R,
you are allowed whitespace between the name of a macro and its argument list
in the invocation.

	GCC's documentation says that a macro invocation that is not
followed by a left parenthesis (ignoring whitespace) is not considered a
macro invocation.  That explains why GCC doesn't like the above program.
But is this the standard way an ANSI preprocessor should work?  I can see
advantages and disadvantages to this behavior.

                                                        Dan

 //////////////////////////////////////\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
| Dan Barrett, Department of Computer Science      Johns Hopkins University |
| INTERNET:   barrett@cs.jhu.edu           |                                |
| COMPUSERVE: >internet:barrett@cs.jhu.edu | UUCP:   barrett@jhunix.UUCP    |
 \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\/////////////////////////////////////

jik@athena.mit.edu (Jonathan I. Kamens) (02/27/91)

In article <7654@jhunix.HCF.JHU.EDU>, barrett@jhunix.HCF.JHU.EDU (Dan Barrett) writes:
|> 	GCC's preprocessor doesn't like this code:
|> 
|> 		#include <ctype.h>
|> 		#define ARGS		(x)
|> 
|> 		main()
|> 		{
|> 			...	isalpha ARGS ...
|> 		}

  I believe that GCC is correctly obeying the ANSI C standard here, although
it's hard to be sure.

  K&Rv2 says that an invocation of a macro with arguments must be the name of
the macro, followed by optional white space, followed by '('.  It doesn't say
that the '(' is allowed to be part of another macro that has been expanded.

  My guess is that the preprocessor is allowed to conclude as soon as it sees
a character that is not '(' after the name of the macro, that this isn't a
macro invocation.  If it couldn't do that, then every time macro substitution
happened, the preprocessor would have to check if the first non-whitespace
character of the substituted text was '(', and if so, go back and check if the
word before the '(' was the name of a macro with arguments.

  The relevant paragraph from K&Rv2 is the second paragraph on page 230.  You
should get yourself a copy :-).

-- 
Jonathan Kamens			              USnail:
MIT Project Athena				11 Ashford Terrace
jik@Athena.MIT.EDU				Allston, MA  02134
Office: 617-253-8085			      Home: 617-782-0710

gwyn@smoke.brl.mil (Doug Gwyn) (02/28/91)

In article <7654@jhunix.HCF.JHU.EDU> barrett@jhunix.HCF.JHU.EDU (Dan Barrett) writes:
>	GCC's preprocessor doesn't like this code:
>		#include <ctype.h>
>		#define ARGS		(x)
>		main()
>		{
>			...	isalpha ARGS ...
>		}
>The linker complains that _isalpha is not known.  According to H&S and K&R,
>you are allowed whitespace between the name of a macro and its argument list
>in the invocation.

Yes, but the preprocessing token after the identifier in the function-like
macro invocation must be a left parenthesis, not another identifier.  At
the point that "ARGS" has just been macro-replaced, "isalpha" has been
left in the dust, and will not be rescanned to attach the newly-appeared
left parenthesis to it.

>	GCC's documentation says that a macro invocation that is not
>followed by a left parenthesis (ignoring whitespace) is not considered a
>macro invocation.  That explains why GCC doesn't like the above program.
>But is this the standard way an ANSI preprocessor should work?  I can see
>advantages and disadvantages to this behavior.

If the identifier corresponds to a previously-defined object-like macro,
it would be macro-replaced.  In this case, however, the definition is for
a function-like macro, which is supposed to be processed as I described
above.

There IS an error in the implementation, however -- the standard C
library is required (for a conforming hosted implementation) to provide
a definition for the isalpha() function.  It may (and probably does)
also provide a macro definition for isalpha() in <ctype.h>, but the
function is also required, to support applications that choose not to
use the macro definition.

garry@ceco.ceco.com (Garry Garrett) (03/01/91)

In article <1991Feb27.155100.21972@athena.mit.edu>, jik@athena.mit.edu (Jonathan I. Kamens) writes:
> In article <7654@jhunix.HCF.JHU.EDU>, barrett@jhunix.HCF.JHU.EDU (Dan Barrett) writes:
* |> 	GCC's preprocessor doesn't like this code:
* |> 
* |> 		#include <ctype.h>
* |> 		#define ARGS		(x)
* |> 
* |> 		main()
* |> 		{
* |> 			...	isalpha ARGS ...
* |> 		}
* 
*   K&Rv2 says that an invocation of a macro with arguments must be the name of
* the macro, followed by optional white space, followed by '('.  It doesn't say
* that the '(' is allowed to be part of another macro that has been expanded.
* 

	Perhaps the problem is one where GCC does not follow the standard.
Check to see that 
                 #define ARGS            (x)

is not interpreted as a macro of the form

		#define ARGS(x) ...

I know that it shouldn't, but perhaps GCC is non-standard.