[comp.lang.c] Concatenating with a compile-time definition in "ANSI" CPP

jacob@gore.com (Jacob Gore) (09/23/89)

Can the following be achieved with "ANSI" cpp:

	MtaInit()

expands to

	SomeNameInit()

where Mta=SomeName is specified at compile-time, WITHOUT enumerating every
possible value of Mta in an #ifdef?

I want to be able to, for example, use -DMta=Sendmail in cc's command line,
and have that result in "MtaInit()" replaced with "SendmailInit()" in the code.
Is there a way to do this?

Jacob
--
Jacob Gore	Jacob@Gore.Com		{boulder,nucsrl}!gore!jacob

henry@utzoo.uucp (Henry Spencer) (09/24/89)

In article <470004@gore.com> jacob@gore.com (Jacob Gore) writes:
>I want to be able to, for example, use -DMta=Sendmail in cc's command line,
>and have that result in "MtaInit()" replaced with "SendmailInit()" in the code.
>Is there a way to do this?

In a word, no.  (Well, not unless you commit vile acts with the obscure
token-concatenation operator, and set up your program to perform said vile
acts on every such identifier.)

You want a general text-manipulation tool like sed, not the C preprocessor.
Try "sed 's/Mta/Sendmail/g' file.proto >file.c".
-- 
"Where is D.D. Harriman now,   |     Henry Spencer at U of Toronto Zoology
when we really *need* him?"    | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

datanguay@watmath.waterloo.edu (David Adrien Tanguay) (09/24/89)

In article <470004@gore.com> jacob@gore.com (Jacob Gore) writes:
>Can the following be achieved with "ANSI" cpp:
>
>	MtaInit()
>
>expands to
>
>	SomeNameInit()
>
>where Mta=SomeName is specified at compile-time, WITHOUT enumerating every
>possible value of Mta in an #ifdef?

#define join(a,b)  a ## b
#define join2(a,b) join( a, b )
#define MtaInit    join2( Mta, Init )

Mta can then be defined with -DMta=something. Init must not be a macro.
The join2 is there so that the Mta will get expanded.
The standard's at work, I'm not; caveat emptor.
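
A minimal sketch of the two-level join above, assuming Mta is defined as
Sendmail (the #ifndef default stands in for -DMta=Sendmail on the command
line).  SendmailInit, the Open suffix, and the str/xstr helpers are
hypothetical names added for illustration; the stringified forms show what
each level of macro produces.

```c
#include <assert.h>
#include <string.h>

#ifndef Mta
#define Mta Sendmail            /* stand-in for cc -DMta=Sendmail */
#endif

#define join(a,b)  a ## b       /* pastes its arguments exactly as written */
#define join2(a,b) join(a, b)   /* expands a and b first, then pastes */
#define MtaInit    join2(Mta, Init)

#define str(x)  #x
#define xstr(x) str(x)          /* expand x, then turn it into a string */

/* Hypothetical back end selected by the definition of Mta. */
int SendmailInit(void) { return 42; }

/* join() alone never sees the value of Mta. */
const char *one_level(void) { return xstr(join(Mta, Open)); }

/* join2() substitutes Mta before the paste happens. */
const char *two_level(void) { return xstr(join2(Mta, Open)); }

/* The preprocessor rewrites MtaInit() to SendmailInit(). */
int call_init(void) { return MtaInit(); }

/* These hold when Mta is Sendmail, as assumed above. */
void check_join(void)
{
    assert(strcmp(one_level(), "MtaOpen") == 0);
    assert(strcmp(two_level(), "SendmailOpen") == 0);
    assert(call_init() == 42);
}
```

Calling check_join() from main() exercises all three cases.  The extra
level matters because operands of ## are not macro-expanded before pasting.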

David Tanguay

hunt@ernie.Berkeley.EDU (Jim Hunt) (10/03/89)

In article <29351@watmath.waterloo.edu> datanguay@watmath.waterloo.edu (David Adrien Tanguay) writes:
>In article <470004@gore.com> jacob@gore.com (Jacob Gore) writes:
>>Can the following be achieved with "ANSI" cpp:
>>	MtaInit()
>>expands to
>>	SomeNameInit()
>>where Mta=SomeName is specified at compile-time, WITHOUT enumerating every
>>possible value of Mta in an #ifdef?
>
>#define join(a,b)  a ## b
>#define join2(a,b) join( a, b )
>#define MtaInit    join2( Mta, Init )
>David Tanguay

You can do this in non ANSI environments (at least Sun) with
#define MyInit(arg)   arg/**/Init
which is an ugly hack on the preprocessor, that is also defined
in the ANSI C preprocessor!  I had no idea what the silly ##
operator was for until I came up with the following.  I had the
desire to be able to say p=NewNodetype1() or q=NewNodetype2()
without having to separately define all of the NewNodetype?()
functions.  My memory of this has faded, but....
(The NodeType* were defined elsewhere in some .h)
nodes.c contained many loops of:

#define magic(arg) arg/**/NodeType1 /* or use ## if ANSI */
#include "x.c"

along with suitable #undefs
and x.c contained

magic()* magic(LOAS) = (magic()*)0;
magic()* magic(New)() { if (magic(LOAS)==(magic()*)0) { ... }
/* work this through three times, different ways, before flaming! */
/* yes, lint complained about arg counts */

This allowed me to have one file of routines to handle the LOAS
(list of available space), New, and Free routines, that was
expanded to the names I wanted.  I did NOT want to have to call
from an array of functions, or have other weird ways to deal with
this.  The many News and Frees done this way did take up all the
space of many copies, but were perfect for a development
environment where the node types changed from day to day.  (I
would not consider releasing code done this way.)  They also
supported counters (magic(OUT) & magic(FREED)) so I could keep
track of node usage too.  The great part was that x.c was only 33
lines long, including the page-at-a-time allocation for garbage
collection support.  I sweated til I was absolutely SURE of those
33 lines, but then I used them many times.  I argue that it is
easier to be perfect for 33 lines than over many copies, or some
complicated array of functions, or generic routines that you end
up changing every other day, or ...
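
A rough sketch of the same idea in ANSI C.  It swaps the repeated
#include of x.c for a single macro that uses ## directly, so it is not the
original 33 lines; the node types, DEFINE_POOL, and the generated names are
all hypothetical, and the page-at-a-time allocation is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node types; each starts with a link used by the free list. */
typedef struct NodeType1 { struct NodeType1 *next; int value; } NodeType1;
typedef struct NodeType2 { struct NodeType2 *next; double weight; } NodeType2;

/* One macro stands in for the repeatedly included x.c. */
#define DEFINE_POOL(T)                                                  \
    T *LOAS##T = NULL;              /* list of available space */       \
    int OUT##T = 0;                 /* nodes currently handed out */    \
    T *New##T(void) {                                                   \
        T *p = LOAS##T;                                                 \
        if (p != NULL) LOAS##T = p->next;   /* reuse a freed node */    \
        else p = malloc(sizeof *p);         /* or get a fresh one */    \
        if (p != NULL) OUT##T++;                                        \
        return p;                                                       \
    }                                                                   \
    void Free##T(T *p) {                                                \
        assert(p != NULL);                                              \
        p->next = LOAS##T;          /* push onto the free list */       \
        LOAS##T = p;                                                    \
        OUT##T--;                                                       \
    }

DEFINE_POOL(NodeType1)   /* defines LOASNodeType1, NewNodeType1, FreeNodeType1 */
DEFINE_POOL(NodeType2)   /* defines LOASNodeType2, NewNodeType2, FreeNodeType2 */
```

With this in place, p = NewNodeType1() and FreeNodeType1(p) work without
writing either function by hand, and a freed node is handed back by the
next call to NewNodeType1().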

:-) if I really thought this was a good idea, would I be trying
so hard to defend it?!?  I used it, it was cute, but......

Is this what the ## operator is for???  If not, what IS it for?

jim hunt@ernie.Berkeley.EDU	H&H Enterprises
These ARE the bosses opinions, I AM the * boss!!!
grad UCB, MS EE/CS, May 90, resume on request.

gwyn@smoke.BRL.MIL (Doug Gwyn) (10/04/89)

In article <17936@pasteur.Berkeley.EDU> hunt@ernie.Berkeley.EDU.UUCP (Jim Hunt) writes:
>You can do this in non ANSI environments (at least Sun) with
>#define MyInit(arg)   arg/**/Init
>which is an ugly hack on the preprocessor, that is also defined
>in the ANSI C preprocessor!

What are you talking about?  That's two tokens, not one spliced one,
in both Standard C and K&R C.  It was the UNIX (Reiser) cpp that was
responsible for this misimplementation becoming widespread.

hunt@ernie.Berkeley.EDU (Jim Hunt) (10/04/89)

In article <11212@smoke.BRL.MIL> gwyn writes:
>In <17936@pasteur.Berkeley.EDU> hunt@ernie.Berkeley.EDU.UUCP (Jim Hunt) writes:
>>You can do this in non ANSI environments (at least Sun) with
>>#define MyInit(arg)   arg/**/Init
>What are you talking about?  That's two tokens, not one spliced one,
>in both Standard C and K&R C.  It was the UNIX (Reiser) cpp that was
>responsible for this misimplementation becoming widespread.

Exactly, and it still works.  I think this is what the ## was
INVENTED for.  To make the standard conform to the existing
bugs/flaws/features in most systems.  I mentioned that so those
who do NOT have ansi compilers, can have a method to do all the
things that ANSI is supposed to bring down from the Gods.

I have tried it on all machines I have access to, and I still
haven't found one where it fails.  I would rather do ##, but
until all non-ansi compilers are erased from the disks of the
world, this is an option to remember.

Question, what do ansi compilers do with that?  They don't HAVE
to have a pre-processor, but who doesn't, and if there is a PP,
which eliminates /*comments*/, what token does it put in to
identify where comments were?  Would you complicate your yacc by
allowing comment tokens everywhere?  NO!  I guess the solution
would be in the lex phase, but I still think you are fighting
hard to eliminate a rather harmless artifact of the fact that
you use a preprocessor.

jim hunt@ernie.Berkeley.EDU	H&H Enterprises (1 employee)
These ARE the bosses opinions, I AM the * boss!!!
grad UCB, MS EE/CS, May 90, resume on request.

ok@cs.mu.oz.au (Richard O'Keefe) (10/04/89)

In article <17936@pasteur.Berkeley.EDU>, hunt@ernie.Berkeley.EDU (Jim Hunt) writes:
> You can do this in non ANSI environments (at least Sun) with
> #define MyInit(arg)   arg/**/Init
> which is an ugly hack on the preprocessor, that is also defined
> in the ANSI C preprocessor!

Yes, the ANSI standard does define the behaviour of arg/**/Init, but
it defines it *NOT* to work.  In "Reiser" preprocessors, comments got
turned into nothing at all.  In ANSI preprocessors, comments get turned
into one blank.  So MyInit(Bogus) would turn into "BogusInit" in a
"Reiser" preprocessor, but into "Bogus Init" in an ANSI-conformant one.

I have found gcc extremely helpful for coming to grips with things like
this; I use "gcc -ansi -pedantic" all the time just to be safe.  If my
code works under both that and "gcc -traditional" and if "lint" likes it
I begin to feel that I may not have overlooked obvious non-portabilities.
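
The comment-to-blank rule described above can be checked with a small
sketch, assuming an ANSI-conformant preprocessor.  PasteInit and the
str/xstr helpers are hypothetical names added for illustration;
stringifying the expansion makes the blank visible.

```c
#include <assert.h>
#include <string.h>

#define MyInit(arg)    arg/**/Init   /* the empty comment becomes one blank */
#define PasteInit(arg) arg ## Init   /* ANSI token pasting */

#define str(x)  #x
#define xstr(x) str(x)               /* expand x, then turn it into a string */

const char *with_comment(void) { return xstr(MyInit(Bogus)); }
const char *with_paste(void)   { return xstr(PasteInit(Bogus)); }

/* Under an ANSI preprocessor only the ## form yields a single token. */
void check_paste(void)
{
    assert(strcmp(with_comment(), "Bogus Init") == 0);
    assert(strcmp(with_paste(), "BogusInit") == 0);
}
```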

fredex@cg-atla.UUCP (Fred Smith) (10/04/89)

In article <17975@pasteur.Berkeley.EDU> hunt@ernie.Berkeley.EDU.UUCP (Jim Hunt) writes:
>In article <11212@smoke.BRL.MIL> gwyn writes:
>>In <17936@pasteur.Berkeley.EDU> hunt@ernie.Berkeley.EDU.UUCP (Jim Hunt) writes:
>>>You can do this in non ANSI environments (at least Sun) with
>>>#define MyInit(arg)   arg/**/Init
>  <stuff deleted>
>Question, what do ansi compilers do with that?  They don't HAVE
>to have a pre-processor, but who doesn't, and if there is a PP,
>which eliminates /*comments*/, what token does it put in to
>identify where comments were?  Would you complicate your yacc by
>allowing comment tokens everywhere?  NO!  I guess the solution
>would be in the lex phase, but I still think you are fighting
>hard to eliminate a rather harmless artifact of the fact that
>you use a preprocessor.
>



Well, Microsoft C may not be a flawless ANSI implementation, but it is somewhere
in the ballpark, and when you give it that construct it thinks you are talking
about two different items named arg and Init. This is reasonable, and I think
correct, behavior. If one reads K&R, or the ANSI standard, one will see that they both
state that comments in the source are replaced with WHITE SPACE in the preprocessor
phase of translation. Clearly, if comments become white space, then this ugly
hack (i.e., "/**/", which is, after all, a comment!!) will not result in arg and
Init being pasted together!!!!

This usage, then, is clearly an aberration from what the language was originally
intended and documented to be. Not to say that it isn't useful, though, which is,
I am sure, why the standard committee chose to specify an official, blessed
way of doing it.

Fred Smith

ndjc@capone.UUCP (Nick Crossley) (10/05/89)

In article <11212@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <17936@pasteur.Berkeley.EDU> hunt@ernie.Berkeley.EDU.UUCP (Jim Hunt) writes:
>>You can do this in non ANSI environments (at least Sun) with
>>#define MyInit(arg)   arg/**/Init
>>which is an ugly hack on the preprocessor, that is also defined
>>in the ANSI C preprocessor!
>
>What are you talking about?  That's two tokens, not one spliced one,
>in both Standard C and K&R C.  It was the UNIX (Reiser) cpp that was
>responsible for this misimplementation becoming widespread.

He might be referring to the preprocessor that is part of the AT&T
Unix V.4 compilation system.  In that, if the 'transition mode' flag
-Xt is given to cc, most if not all of the "Reiserisms", including
parameter substitution in strings and token pasting with comments,
are allowed, with a warning.  Since -Xt is the default at the moment,
it appears to the user that an 'ANSI C compiler' supports this
behaviour.  Of course, AT&T said at the V.4 developers conferences
that the default mode will not be -Xt in the next release, so you
should obviously start converting your code to the correct ANSI C
preprocessor forms, and certainly not write new code using the old
forms (at least, not without appropriate #ifs).
-- 

<<< standard disclaimers >>>
Nick Crossley, ICL NA, 9801 Muirlands, Irvine, CA 92718-2521, USA 714-458-7282
uunet!ccicpg!ndjc  /  ndjc@ccicpg.UUCP

karl@haddock.ima.isc.com (Karl Heuer) (10/05/89)

In article <17975@pasteur.Berkeley.EDU> hunt@ernie.Berkeley.EDU (Jim Hunt) writes:
JH>#define MyInit(arg)   arg/**/Init

DG>[This is an accident of a bug in the Reiser cpp]

JH>Exactly, and it still works.

Yes, if you happen to be using a non-ANSI compiler with the Reiser cpp, it
works.

JH>I think this is what the ## was INVENTED for.  To make the standard conform
JH>to the existing bugs/flaws/features in most systems.

That's a strange way of putting it, since the Committee explicitly did NOT
bless the existing bug.  It recognized the need for token-pasting, and,
finding no acceptable method in current practice, invented one.

JH>[I'd prefer ##, but not all compilers are ANSI]

I'll mention once again that the Reiserism does NOT work on ANSI compilers.
Thus the correct way to do this, if you're willing to assume that you'll never
have to port to a non-ANSI non-Reiser compiler (and that __STDC__ is not set
by non-ANSI compilers!), is with something like
	#if __STDC__
	#define glue(x,y) x ## y
	#else
	#define glue(x,y) x/**/y
	#endif
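
	A small sketch of the conditional glue() in use, assuming an
	ANSI compiler with __STDC__ set.  The Node, Count and Add names
	are hypothetical and exist only to show which identifiers the
	macro produces.

```c
#include <assert.h>

#if defined(__STDC__) && __STDC__
#define glue(x,y) x ## y
#else
#define glue(x,y) x/**/y        /* only splices under a Reiser-style cpp */
#endif

/* glue(Node, Count) is the single identifier NodeCount. */
int glue(Node, Count) = 0;

/* glue(Node, Add) names the function NodeAdd. */
void glue(Node, Add)(int n) { glue(Node, Count) += n; }

/* The pasted names can be used directly once they exist. */
int node_count_after_add(void)
{
    NodeCount = 0;
    NodeAdd(3);
    return NodeCount;
}

void check_glue(void) { assert(node_count_after_add() == 3); }
```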

>Question, what do ansi compilers do with [a comment]?

They replace it with a blank.  Pretty simple, eh?

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/05/89)

In article <264@capone.UUCP>, ndjc@capone.UUCP (Nick Crossley) writes:

|  He might be referring to the preprocessor that is part of the AT&T
|  Unix V.4 compilation system.  In that, if the 'transition mode' flag
|  -Xt is given to cc, most if not all of the "Reiserisms", including
|  parameter substitution in strings and token pasting with comments,
|  are allowed, with a warning.  

  What is the level of ANSI compliance without -Xt (or with whatever
option is needed)? Are there any major new features which are missing in
whatever passes for ANSI mode? Can ANSI be made default?
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon

dfp@cbnewsl.ATT.COM (david.f.prosser) (10/07/89)

In article <834@crdos1.crd.ge.COM> davidsen@crdos1.UUCP (bill davidsen) writes:
>  What is the level of ANSI compliance [in the AT&T compiler] without -Xt (or
>with whatever option is needed). Are there any major new features which are
>missing in whatever passes for ANSI mode? Can ANSI be made default?

No major new features are missing in any mode.  A shell script can be used
as a front end to cc(1) if a different default mode is desired, for example.

There are three modes specifying different levels of conformance.
 -Xt  (Transition)  This is the initial default mode.
 -Xa  (ANSI C)      This will become the default in a subsequent release.
 -Xc  (Conforming)  This won't ever be the default.

All three modes include as many features of ANSI C as can fit.  The major
difference between -Xt and -Xa is the integral promotion rules.  (These
have been discussed many times already in this group and comp.std.c.)  All
expressions that might behave differently depending on the different
promotion rules are warned about in all modes.  A simple use of a cast is
all that is needed to eliminate these warnings.
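
A sketch of the kind of expression involved, assuming the usual description
of the two rules: older compilers promoted unsigned char and unsigned short
to unsigned int ("unsigned preserving"), while ANSI C promotes them to int
when int can hold every value ("value preserving").  The function names are
hypothetical.

```c
#include <assert.h>

/* Under ANSI rules a and b promote to int, so the difference can be
   negative; under unsigned-preserving rules it would wrap around instead. */
int less_implicit(unsigned char a, unsigned char b) { return a - b < 0; }

/* Explicit casts give the same answer under either rule. */
int less_cast(unsigned char a, unsigned char b) { return (int)a - (int)b < 0; }

/* Both forms agree on an ANSI compiler. */
void check_promotion(void)
{
    assert(less_implicit(1, 2) == 1);
    assert(less_cast(1, 2) == 1);
    assert(less_cast(2, 1) == 0);
}
```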

There are only two differences between -Xa and -Xc.  The important one is
that -Xc restricts the name space to that required for ANSI C conformance.
The other is that all the required diagnostics are issued as required in
-Xc.

__STDC__ is predefined in all modes.  It is replaced by 0 in -Xt and -Xa.
(This is viewed as inappropriate by some people.)  Its value is 1 only in
-Xc, thus specifying true conformance.
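
Code that cares about the distinction can test the value of __STDC__ and
not just whether it is defined.  A minimal sketch; the STDC_VALUE macro,
the stdc_mode function and the label strings are hypothetical, and the mode
names in the comments are only the ones listed in this post.

```c
#include <assert.h>
#include <string.h>

#ifdef __STDC__
#define STDC_VALUE __STDC__
#else
#define STDC_VALUE (-1)         /* macro not defined at all */
#endif

/* Report which of the three cases the compiler is in. */
const char *stdc_mode(void)
{
    if (STDC_VALUE == 1)
        return "conforming";            /* e.g. -Xc */
    if (STDC_VALUE == 0)
        return "ANSI features";         /* e.g. -Xt and -Xa */
    return "not defined";
}

/* Every case yields a non-empty label. */
void check_mode(void) { assert(strlen(stdc_mode()) > 0); }
```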

Wherever possible, those features of AT&T's previous C compilers and
preprocessors that did not produce warnings and that are not compatible
with ANSI C are available in -Xt, but will produce warnings.  If the
feature is simply incompatible with ANSI C, it is not available in -Xa
or -Xc.  If a feature is a compatible extension, it will exist in all
modes (although some do require a diagnostic in -Xc).

If you want more information send me mail.
Dave Prosser

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/19/89)

In article <17975@pasteur.Berkeley.EDU>, hunt@ernie.Berkeley.EDU (Jim Hunt) writes:

|  Question, what do ansi compilers do with that?  They don't HAVE
|  to have a pre-processor, but who doesn't, and if there is a PP,
|  which eliminates /*comments*/, what token does it put in to
|  identify where comments were?  

  Comments are replaced with one blank. I think consecutive whitespace is
stripped, too, but my standard isn't handy.

#define foo(a,b) a/**/b

doesn't work for concat in an ANSI compiler for just this reason, but
your suggestion that it works on pcc is correct (in most cases). A
conditional definition of the macro will probably help.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon