[comp.lang.c] Quote without comment on char constant expansion

gnu@hoptoad.uucp (John Gilmore) (04/12/88)

Date: Mon, 11 Apr 88 20:18:56 +0100
From: "Chuck Clanton" <sun!sunuk!pan!aratar!chac>
Subject: did you find this one?

we found an awful gcc problem in compiling (preprocessing actually)
a program that fooled around with ioctl settings with #defines
in <sys/ioctl.h>...the gcc version didn't work, the pcc version did.
we investigated further and simplified the problem to the following
difference in interpretation:

    $ cat foo.c
    #define blah(t) 't'
    
    main()
    {
        printf("should this be a t or an f? %c\n", blah(f));
    }
    $ gcc foo.c        # gnu c compiler
    $ a.out
    should this be a t or an f? t
    $ pcc foo.c        # ultrix 2.0 pcc compiler
    $ a.out
    should this be a t or an f? f
    $ vcc foo.c        # vax ultrix c compiler
    $ a.out
    should this be a t or an f? f

on an IBM PC-RT under AIX (system V)
    $ cc foo.c
    $ a.out
    should this be a t or an f? f
    $
on a big VAXEN under BSD 4.3
    $ cc foo.c
    $ a.out
    should this be a t or an f? f
    $

it would seem that gcc is outvoted, though i am sympathetic to
its interpretation.  ritchie says "Text inside a string or a
character constant is not subject to replacement" in "The C
Programming Language" paper.  k&r merely mentions the string
issue.  however, the ritchie reference would appear to me to
qualify what triggers the macro expansion, not what the macro
expansion does once triggered.  so the language hair-splitters
can argue about this for a long time, but the language users have
a problem right now.  i am not sure where ansi sits on this one.
vcc and gcc both claim to be ansi compatible.

did you find this one in your compile of all of the berkeley 
software distribution?  if not, tell me where it should be 
reported and i will send it in.


Date: Mon, 11 Apr 88 23:33:19 +0100
From: "Chuck Clanton" <sun!sunuk!pan!aratar!chac>
Subject: gcc is surely wrong

well, i just came upon this further piece of information
that in fact it is perfectly permissible to expand inside
strings as well as character constants INSIDE the macro.
ritchie's comment is undoubtedly meant ONLY with regard
to what triggers the macro expansion.

this is from the system 5 assert.h:

	#define assert(EX) if (EX) ; else _assert("EX", __FILE__, __LINE__)

EX is the expression being validated by the assert.  the effect 
of "EX" in the _assert call is to print out the expression itself 
that has failed the assertion!

so, i suspect gcc is quite wrong.

-- 
{pyramid,pacbell,amdahl,sun,ihnp4}!hoptoad!gnu			  gnu@toad.com
  I foresee a day when there are two kinds of C compilers: standard ones and 
  useful ones ... just like Pascal and Fortran.  Are we making progress yet?
	-- ASC:GUTHERY%slb-test.csnet

mike@turing.UNM.EDU (Michael I. Bushnell) (04/12/88)

In article <4418@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:

He says that cccp (the GNU C Compatible Compiler Preprocessor) doesn't
expand macros when the expected invocation occurs inside ' or ".

This is quite correct, and documented behavior.  In fact, gcc comes 
with directions on how to make an ioctl.h file that works.  The reason
for this behavior is the ANSI Proposed standard.  It dictates that 
macro expansion NOT occur within quote marks.  Instead, stringification
and concatenation preprocessor operators are provided.  Take a look
at the info file that comes with gcc.
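
A minimal sketch of the rule described above, for a compiler that follows
the proposed standard.  Only blah comes from the original report; STR, GLUE
and the three wrapper functions are names made up for illustration.

```c
/* From the original report.  Under the proposed ANSI rules the t between
   the quotes is part of a character constant, not a use of the
   parameter, so every call expands to 't'. */
#define blah(t) 't'

/* Hypothetical helpers showing the two new operators. */
#define STR(x) #x          /* stringification: STR(f) becomes "f" */
#define GLUE(a, b) a##b    /* concatenation: GLUE(foo, bar) becomes foobar */

char blah_result(void)
{
    return blah(f);        /* the argument f is simply discarded */
}

const char *str_result(void)
{
    return STR(f);
}

int glue_result(void)
{
    int GLUE(foo, bar) = 42;   /* declares a variable named foobar */
    return foobar;
}
```

Printing blah_result() with %c shows t under these rules, where a
Reiser-style preprocessor would have substituted the argument and shown f.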


                N u m q u a m   G l o r i a   D e o 

			Michael I. Bushnell
			HASA - "A" division
14308 Skyline Rd NE				Computer Science Dept.
Albuquerque, NM  87123		OR		Farris Engineering Ctr.
	OR					University of New Mexico
mike@turing.unm.edu				Albuquerque, NM  87131
{ucbvax,gatech}!unmvax!turing.unm.edu!mike

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/12/88)

In article <4418@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>this is from the system 5 assert.h:
>	#define assert(EX) if (EX) ; else _assert("EX", __FILE__, __LINE__)

AT&T of course will be fixing this when they release their ANSI-conforming
C language subsystem.  For now, they cheat just like everybody else who
uses a Reiser preprocessor, because they have no alternative.

>so, i suspect gcc is quite wrong.

No, GCC is correctly implementing this aspect of C, both K&R (as
interpreted by most C language lawyers) and ANSI.

karl@haddock.ISC.COM (Karl Heuer) (04/13/88)

In article <4418@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>[sun!sunuk!pan!aratar!chac (Chuck Clanton) writes:]
>>    #define blah(t) 't'
>>[gcc gives you 't', pcc and vcc and some others expand the argument]
>>it would seem that gcc is outvoted ... [k&r is unclear on this] ...
>>i am not sure where ansi sits on this one.  vcc and gcc both claim to be
>>ansi compatible.

gcc is right (from the ANSI view); vcc is not a conforming implementation.

>>this is from the system 5 assert.h:
>>	#define assert(EX) if (EX) ; else _assert("EX", __FILE__, __LINE__)

This will have to be fixed for the ANSI implementation of assert().
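
One way such a fix might look, as a sketch only: the `#' operator supplies
the expression text that "EX" used to supply.  The names my_assert,
my_assert_fail and EXPR_TEXT are hypothetical, chosen so as not to clash
with the real <assert.h>; this is not AT&T's actual header.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical failure handler, standing in for the library's _assert. */
void my_assert_fail(const char *expr, const char *file, int line)
{
    fprintf(stderr, "Assertion failed: %s, file %s, line %d\n",
            expr, file, line);
    abort();
}

/* #EX yields the argument's source text as a string literal, which is
   what "EX" produced under a Reiser preprocessor. */
#define my_assert(EX) \
    ((EX) ? (void)0 : my_assert_fail(#EX, __FILE__, __LINE__))

/* Hypothetical helper exposing just the stringified text. */
#define EXPR_TEXT(EX) #EX
```

Writing the macro as a conditional expression rather than an if/else
statement also lets it be used wherever an expression is allowed.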

The original problem showed up in BSD <sys/ioctl.h>.  I would recommend that
this header be fixed as shown by the example below.

old> #define _IO(x,y)	(IOC_VOID|('x'<<8)|y)
old> #define TIOCHPCL	_IO(t, 2)

new> #define _IO(x,y)	(IOC_VOID|(x<<8)|y)
new> #define TIOCHPCL	_IO('t', 2)
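
The `new' form can be tried on its own, as in the sketch below.  The X_
prefixed names are hypothetical stand-ins so that the example does not
collide with the real <sys/ioctl.h>, and the value given to X_IOC_VOID is
an assumption made only for illustration.  Extra parentheses are added
around the macro arguments.

```c
/* Hypothetical stand-ins for the <sys/ioctl.h> names.  The numeric value
   of X_IOC_VOID is assumed for this example only. */
#define X_IOC_VOID 0x20000000

/* The caller now passes a real character constant, so the macro body
   needs no substitution inside quotes. */
#define X_IO(x, y) (X_IOC_VOID | ((x) << 8) | (y))
#define X_TIOCHPCL X_IO('t', 2)

unsigned long x_tiochpcl(void)
{
    return X_TIOCHPCL;
}
```

This form gives the same value under both a Reiser preprocessor and an
ANSI one, which is why it can be applied before any compiler switch.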

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/14/88)

In article <3432@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>The original problem showed up in BSD <sys/ioctl.h>.  I would recommend that
>this header be fixed as shown by the example below.

The important thing is that this can be fixed NOW -- you don't have
to wait for an ANSI C compliant compiler to apply the suggested fix.
Then when 4BSD switches to an ANSI C compiler this wouldn't break.

chris@mimsy.UUCP (Chris Torek) (04/14/88)

[#define _IO(x,y) (IOC_VOID|('x'<<8)|y), a `Reiserism', has no counterpart
in the dpANS]

>In article <3432@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>The original problem showed up in BSD <sys/ioctl.h>.  I would recommend that
>>this header be fixed ....

In article <7677@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
>The important thing is that this can be fixed NOW -- you don't have
>to wait for an ANSI C compliant compiler to apply the suggested fix.
>Then when 4BSD switches to an ANSI C compiler this wouldn't break.

It not only can be fixed, it has been fixed.  But that is not the
`important thing'.  The `important thing' is that the Reiser
preprocessor offers one function---turning a macro argument into a
character constant---that cannot be performed by a preprocessor that
works as defined in the dpANS.  Without the introduction of the
so-called `stringize' operator `#', there would have been two such
functions, and so (the argument goes) perhaps there should be a
`charize' operator.

The argument against this, of course, is that somehow `stringize'
is useful while `charize' is not, or not enough so.  The argument
holds up to some extent when one considers another change from
existing practise: string concatenation.  A debug-variable-value
macro can now be written as follows:

	#ifdef DEBUG
	#define printint(x) (void) printf("DEBUG: " #x " = %d\n", x)
	#else
	#define printint(x)
	#endif
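
A small sketch of how that macro behaves once `#' and string concatenation
are both available.  FORMAT_INT and demo are hypothetical additions that
exist only to make the expansion easy to inspect; printint is the macro
from the post.

```c
#include <stdio.h>

#define DEBUG

#ifdef DEBUG
#define printint(x) (void) printf("DEBUG: " #x " = %d\n", x)
#else
#define printint(x)
#endif

/* Hypothetical variant that writes the same text into a buffer. */
#define FORMAT_INT(buf, size, x) \
    snprintf((buf), (size), "DEBUG: " #x " = %d\n", (x))

void demo(void)
{
    int count = 3;
    printint(count);        /* format string is "DEBUG: count = %d\n" */
    printint(count + 1);    /* format string is "DEBUG: count + 1 = %d\n" */
}
```

The three adjacent string literals are joined into one format string by
the compiler, so no run-time concatenation takes place.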

Personally, I never liked the Reiser behaviour; I found it rather
counterintuitive.  I do wonder, though, whether the loss of a
macro-argument-to-character-constant function will prove more
significant than it already has (in the `CTRL' macro---the ioctl
macros are much better confined and hence easier to alter).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/14/88)

In article <11056@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>perhaps there should be a `charize' operator.

It's been voted down (back when we hadn't settled on # and ## yet).
There has been a comment that &"..." should be allowed in constant
expressions.

tainter@ihlpg.ATT.COM (Tainter) (04/15/88)

In article <11056@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> The argument against this, of course, is that somehow `stringize'
> is useful while `charize' is not, or not enough so.

Is

    "ABCD"[0]
legal ANSI C?  If not, why not?  It certainly works in AT&T R&D UNIX(TM) C.

If so then

    (#x[0])

becomes a 'charize' expression equivalent to a direct charize operator for
all intents and purposes.  I can't seriously imagine a compiler not optimizing
this expression to the character constant.

> In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)


--j.a.tainter

flaps@dgp.toronto.edu (Alan J Rosenthal) (04/16/88)

I'm using the convention of quoting C statements with backquotes since
I need to use forward-quotes and double-quotes.

In discussing the fact that there is no way to write the (revolting, in
my opinion) CTRL macro, in which `CTRL(a)` expands to `('a' & 037)`,
tainter@ihlpg.ATT.COM first asks whether `"ABCD"[0]` is legal ANSI
C (it is), and then, assuming that it is, says that `(#x[0])`

>becomes a 'charize' expression equivalent to a direct charize operator
>for all intents and purposes.  I can't seriously imagine a compiler not
>optimizing this expression to the character constant.

The optimization is not the problem.  The problem is that this
is not a character constant, it is a character-valued expression.  So
where a constant expression is required this cannot be used.  The most
common example for the CTRL macro is a switch statement, in which the
cases must be constant expressions.
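
The distinction can be sketched as follows.  CHARIZE, CTRL_RUNTIME,
ctrl_value and classify are hypothetical names; the CTRL shown here is the
rewritten form that takes a character constant from the caller, not the
original Reiser-style macro.

```c
/* Hypothetical "charize" built from stringize plus indexing.  It has the
   right value, but it is not a constant expression. */
#define CHARIZE(x) (#x[0])
#define CTRL_RUNTIME(x) (CHARIZE(x) & 037)

/* Rewritten CTRL: the caller supplies the character constant, so the
   result is a constant expression. */
#define CTRL(c) ((c) & 037)

int ctrl_value(void)
{
    return CTRL_RUNTIME(a);    /* fine in an ordinary expression */
}

const char *classify(int ch)
{
    switch (ch) {
    case CTRL('a'):
        return "ctrl-a";
    case CTRL('d'):
        return "ctrl-d";
    /* case CTRL_RUNTIME(a): is not required to be accepted, since its
       operand is an array element rather than a constant. */
    default:
        return "other";
    }
}
```

Both macros yield the same number at run time; only the second is usable
as a case label on every conforming compiler.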

ajr

--
":= has got to be the most ugly, most bogus pile of sh*t ever invented,
 but that's my personal opinion."     -- Johnson Noise

cudcv@daisy.warwick.ac.uk (Rob McMahon) (04/16/88)

In article <7683@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <11056@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>perhaps there should be a `charize' operator.
>
>It's been voted down (back when we hadn't settled on # and ## yet).
>There has been a comment that &"..." should be allowed in constant
>expressions.

Maybe now that # & ## have been settled on, it's time for another vote ?
It seems it ought to be there just for consistency, and it would provide
a much cleaner solution than this #c[0] as a constant (is that right ?).
Out of interest, what were the arguments against it ?

Rob


-- 
UUCP:   ...!mcvax!ukc!warwick!cudcv	PHONE:  +44 203 523037
JANET:  cudcv@uk.ac.warwick.cu          ARPA:   cudcv@cu.warwick.ac.uk
Rob McMahon, Computing Services, Warwick University, Coventry CV4 7AL, England

rbutterworth@watmath.waterloo.edu (Ray Butterworth) (04/26/88)

In article <522@sol.warwick.ac.uk>, cudcv@daisy.warwick.ac.uk (Rob McMahon) writes:
> In article <7683@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
> >In article <11056@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
> >>perhaps there should be a `charize' operator.
> >
> >It's been voted down (back when we hadn't settled on # and ## yet).
> >There has been a comment that &"..." should be allowed in constant
> >expressions.
> 
> Maybe now that # & ## have been settled on, it's time for another vote ?
> It seems it ought to be there just for consistency, and it would provide
> a much cleaner solution than this #c[0] as a constant (is that right ?).
> Out of interest, what were the arguments against it ?

Better yet, if some future version of the Standard decides that
a "charize" operator would be a good thing, what on Earth are they
going to call it?  ### ?

But doesn't "x###y" already have a meaning as "x ## # y"?  i.e. x"y" ?
Yes, but ### is a longer string, so it is taken as charize just
as x+++++y is taken as the illegal "x++ ++ + y" rather than the
meaningful "x++ + ++y".
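
The longest-token rule mentioned above can be seen in a short sketch
(munch_demo is a hypothetical name):

```c
int munch_demo(void)
{
    int x = 1, y = 10;

    /* int bad = x+++++y;  would be split as x ++ ++ + y and rejected,
       because the result of x++ cannot itself be incremented. */

    return x++ + ++y;   /* with spaces: x++ yields 1, ++y yields 11 */
}
```

The tokenizer never backs up to try a shorter token, which is why the
spaces are needed.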

But that means that introducing this will break any code that
now thinks that ### means ## #.  And there might be some in
light of the new "L" string prefix.  e.g.
#define STRING(value) L###value
so that STRING(abc) generates L"abc".

And if they ever need another new operator, will it be #### ?
Good grief!

Why did the committee come up with such a limited and non-obvious
naming scheme for the preprocessor operators?

What would have been wrong with something like:
Current      Renamed        Resyntaxed      Meaning
#x           #string x      #string(x)      "x"
x##y         x #glue y      #glue(x,y)      xy
-----        #character x   #character(x)   'x'
defined(x)   #defined(x)    #defined(x)     preprocessor token x is defined

#define STRING(value) L #glue #string value
or
#define STRING(value) #glue(L,#string(value))


These new names (or better yet, the new syntax) are certainly
more obvious than the ## and # names, they don't step on
any name space other than the current preprocessor directives,
and they allow new preprocessor operators to be added with little
difficulty.

chris@mimsy.UUCP (Chris Torek) (04/28/88)

[various depths of quoting deleted]
In article <11056@mimsy.UUCP> I wrote
>perhaps there should be a `charize' operator.

(Please note that this is someone else's argument; I am not claiming
it as my own.  After all the trimming this became unclear.  For one
thing, I would probably not call it `charize': `stringize' is an ugly
word, and `charize' is worse.  Ah, aesthetics. :-) )

In article <18523@watmath.waterloo.edu> rbutterworth@watmath.waterloo.edu
(Ray Butterworth) writes:
>Better yet, if some future version of the Standard decides that
>a "charize" operator would be a good thing, what on Earth are they
>going to call it?  ### ?

>Why did the committee come up with such a limited and non-obvious
>naming scheme for the preprocessor operators?

In fact, there was a proposal for a more rational scheme, similar
to either your `renamed' or `resyntaxed' versions---I have forgotten
the details---but it got bogged down somehow, and eventually vanished.
A minor tragedy, to be sure, but, I think, a tragedy nonetheless.

>Current      Renamed        Resyntaxed      Meaning
>#x           #string x      #string(x)      "x"
>x##y         x #glue y      #glue(x,y)      xy
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris