[comp.lang.c] Portability and /**/

gardner@prls.UUCP (Robert Gardner) (07/16/87)

We recently received a large amount of code written for our
Unix VAX in C.  Its device driver routines make extensive use
of an interesting construct -- a rather large number of #defines
are of the form:
#define a(n) a/**/n
so that, for instance, a(2) expands to a2.  (Someone tried to tell
me that /**/ was a special c concatenation operator!:)  This somehow
made it easier to write code for many devices -- you choose the
device you have, include its header, and compile.
I seem to remember, though I can't find it, that K&R says that comments
can occur anywhere white space is allowed, which suggests that this
construct should not work.  In pursuing this (out of curiosity), I
found that the Ultrix VAX c compiler compiles the program
    main()
    {
       int n2=2;
       printf("%d\n",n/*comment*/2);
    }
and prints out '2' when executed.  This suggests that the pre-processor
simply removes comments, rather than replacing them with white space.
My Lightspeed C compiler at home will not compile the above program.
Any comments on portability?

Robert Gardner
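
For comparison, here is a minimal sketch of what a tokenizing preprocessor
does with the comment, assuming a compiler that follows the ANSI draft rule
that each comment is replaced by one space.  STR, comment_as_text and
comment_splits_tokens are made-up names for illustration; the # (stringizing)
operator is from the ANSI draft and is not available in the Reiser
preprocessor.

```c
#include <string.h>

/* STR is a hypothetical helper.  The ANSI draft's # operator turns the
   macro argument into a string literal, with any white space between
   tokens reduced to a single space. */
#define STR(x) #x

/* Spelling of "a, empty comment, b" after the comment has been replaced
   by white space. */
const char *comment_as_text(void)
{
    return STR(a/**/b);
}

/* Returns 1 when the comment separated a and b into two tokens. */
int comment_splits_tokens(void)
{
    return strcmp(comment_as_text(), "a b") == 0;
}
```

On such a compiler the comment separates a and b, so n/*comment*/2 is the two
tokens n and 2 and the test program is a syntax error, which matches the
Lightspeed C behaviour described above.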

guy%gorodish@Sun.COM (Guy Harris) (07/16/87)

> Any comments on portability?

Yup.  The "/**/" hack is NOT portable.  Plenty of other hacks
supported by the "Reiser preprocessor" supplied with most UNIX
versions are non-portable as well.  The ANSI C draft proposes a
different mechanism for doing token concatenation.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com

mpl@sfsup.UUCP (M.P.Lindner) (07/17/87)

In article <4983@prls.UUCP>, gardner@prls.UUCP writes:
> We recently received a large amount of code written for our
> Unix VAX in C.  Its device driver routines make extensive use
> of an interesting construct -- a rather large number of #defines
> are of the form:
> #define a(n) a/**/n

The C++ compiler generates things of this form to concatenate names for the
same reason - you avoid naming conflicts by concatenating the name with some
type information or something.  Apparently, this is making use of an obscure
BUG in the C preprocessor as implemented under UNIX(R) SYS V.  Before you send
flames that this is NOT a bug, realize that, as the poster of the original
article pointed out, by not replacing the comment with white space, the
preprocessor is changing the semantics of the code!  I know, it's SUPPOSED to
change the code, but this is abominable!  K&R specify that comments act as white
space and may be used anywhere white space may be used.  This implies that
	a/**/b == a b
which is NOT the case!

I therefore claim the preprocessor is broken and if you want to concatenate
symbols, perhaps you should design a feature in the preprocessor to do so,
rather than exploit a bug!  Oh, by the way, BSD systems use

#define blah(a,b)	a\
b

to do the same thing (concatenate a and b).  This seems more reasonable,
since K&R state that a '\' and an immediately following newline are ignored
(at least in strings).

BOTH of these methods have the disadvantage that white space in the arg list
will break the intention of the macro, so they don't preserve the "free format"
concepts of C.  For example:
	blah(abc, def)
will NOT work, while
	blah(abc,def)
will.  This took me 3 days to debug when I was learning C++.
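
A rough sketch of how the white space problem looks with the ## operator
proposed in the ANSI C draft, assuming a compiler that implements it.  Here
blah is redefined with ## purely for illustration (this is not the definition
shipped with any C++ release), and abcdef, with_space and without_space are
hypothetical names.

```c
/* Assumption: an ANSI-draft preprocessor.  White space around a macro
   argument is not part of the argument, so both calls paste the same
   two tokens. */
#define blah(a,b) a##b

int abcdef = 7;   /* hypothetical variable for the pasted name to refer to */

int with_space(void)
{
    return blah(abc, def);    /* space after the comma */
}

int without_space(void)
{
    return blah(abc,def);     /* no space */
}
```

Both functions refer to abcdef, so the "free format" property comes back once
the pasting is done on tokens rather than on text.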

alastair@geovision.UUCP (Alastair Mayer) (07/28/87)

In article <4983@prls.UUCP> gardner@prls.UUCP (Robert Gardner) writes:
>We recently received a large amount of code written for our
>Unix VAX in C.  Its device driver routines make extensive use
>of an interesting construct -- a rather large number of #defines
>are of the form:
>#define a(n) a/**/n
>so that, for instance, a(2) expands to a2. 
> [...]  In pursuing this (out of curiosity), I
>found that the Ultrix VAX c compiler compiles the program
>    main()
>    {
>       int n2=2;
>       printf("%d\n",n/*comment*/2);
>    }
>and prints out '2' when executed.  This suggests that the pre-processor
>simply removes comments, rather than replacing them with white space.
>My Lightspeed C compiler at home will not compile the above program.
>Any comments on portability?
>
>Robert Gardner

You are right, the C pre-processor does (normally) strip out the comments.
However, the exact semantics depend on the particular version (BSD or
SYSV) of 'cpp'.  There *are* uses for this -- you note one.  The C++
compiler (which also uses 'cpp') uses a similar mechanism to allow definition
of generic classes.  The file <generic.h> has code something like:

    #ifdef BSD  /* BSD way */
    #define name2(a,b) a\
    b
    #else       /* System V way */
    #define name2(a,b) a/**/b
    #endif

You might try the former method with your Lightspeed C (never having used
it, I dunno if it'll work either).

As to portability, if you *must* catenate tokens like this, then #define
a macro to do it such as the above.
-- 
 Alastair JW Mayer     BIX: al
                      UUCP: ...!utzoo!dciem!nrcaer!cognos!geovision!alastair

(Why do they call it a signature file if I can't actually *sign* anything?)

hunt@spar.SPAR.SLB.COM (Neil Hunt) (07/31/87)

alastair@geovision.UUCP (Alastair Mayer) writes:
>    #ifdef BSD  /* BSD way */
>    #define name2(a,b) a\
>    b
>    #else       /* System V way */
>    #define name2(a,b) a/**/b
>    #endif

I have seen a different construction, which looks more portable to me:

#define Same(token)		token
#define	Join(left, right)	Join(left)right

Neil/.

guy%gorodish@Sun.COM (Guy Harris) (07/31/87)

> You are right, the C pre-processor does (normally) strip out the comments.
> However, the exact semantics depend on the particular version (BSD or
> SYSV) of 'cpp'.  There *are* uses for this -- you note one.  The C++
> compiler (which also uses 'cpp') uses a similar mechanism to allow definition
> of generic classes.  The file <generic.h> has code something like:
> 
>     #ifdef BSD  /* BSD way */
>     #define name2(a,b) a\
>     b
>     #else       /* System V way */
>     #define name2(a,b) a/**/b
>     #endif

I don't understand this.  I have seen this several times, and I still
don't understand this.  Why is the second way labeled "System V
way", and the first way labeled "BSD way"?

	gorodish$ rlogin sunvax
	Password:
	Last login: Sat Jul 18 00:53:39 from gorodish
	4.3 BSD UNIX #1: Mon Jul  7 11:46:47 PDT 1986

	*** N.B.: Running straight 4.3bsd. ***
	% ed foo.c
	?foo.c
	a
	#define name2(a,b)      a/**/b

	name2(foo,bar)
	.
	w
	42
	q
	% cc -E foo.c
	# 1 "foo.c"


	        foobar

The second way, mislabeled "System V way", works just fine on 4.3BSD;
the C preprocessor hasn't changed much at all between 4.1BSD and
4.3BSD, other than having the "-M" flag added.  The "System V way" is
used in several places by 4BSD Makefiles.

If you're using a Reiser preprocessor, the "System V way" will almost
certainly work.  If you're not using a Reiser preprocessor, *neither*
of the ways is guaranteed to work.  (If you're using an ANSI C
preprocessor, there is, of course, a standard way of doing this that
*is* guaranteed to work, unless they change that part of the standard
before it becomes final.)

Note, by the way, that due to the Reiser preprocessor's Principle of
Least Surprise-violating way of handling white space in macro calls,
the following call of "name2" will not work with *either* of the two
definitions above:

	name2(foo, bar)

It expands to "foo bar", because the second argument is " bar", not
"bar".
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com
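
One way the choice could be keyed to the compiler rather than to "BSD" versus
"System V" is sketched below.  It assumes that __STDC__ is defined only by
compilers implementing the ANSI draft, which is not guaranteed for every
pre-ANSI compiler, and the fallback branch still works only with a
Reiser-style preprocessor.  The names foobar and read_foobar are hypothetical.

```c
/* Assumption: __STDC__ identifies an ANSI-draft compiler. */
#ifdef __STDC__
#define name2(a,b) a##b
#else
/* Fallback for Reiser-style preprocessors only; not portable. */
#define name2(a,b) a/**/b
#endif

int foobar = 1;   /* hypothetical variable named by the pasted token */

int read_foobar(void)
{
    return name2(foo,bar);
}
```

Callers should still avoid white space in the argument list, since the
fallback branch has the problem described above.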

gwyn@brl-smoke.ARPA (Doug Gwyn) (08/02/87)

In article <167@spar.SPAR.SLB.COM> hunt@spar.UUCP (Neil Hunt) writes:
>#define Same(token)		token
>#define	Join(left, right)	Join(left)right

Presumably the last line should have read
	#define	Join(left,right)	Same(left)right
but that still doesn't work in general, since the token resulting
from the macro expansion should not be spliced onto the token to
its right, at least not during macro scanning and replacement.

I think the outcome of the X3J11 work on this is that preprocessing
will be forced to tokenize, and splicing will require explicit use
of the new ## operator.  Meanwhile, the above trick (with the bug
fixed) is probably the most universal approach.  Be careful not to
include whitespace in the macro invocation argument list.
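
A small sketch of ## in the device-number setting from the original article,
assuming a compiler that implements the operator as the draft describes it.
An argument that is an operand of ## is pasted as written, without being
macro-expanded first, so a second macro level is needed when the argument is
itself a macro.  PASTE, XPASTE, UNIT, reg2 and read_unit_reg are hypothetical
names.

```c
#define PASTE(a,b)  a##b         /* pastes the arguments exactly as written */
#define XPASTE(a,b) PASTE(a,b)   /* arguments are expanded here, then pasted */

#define UNIT 2                   /* hypothetical device number */

int reg2 = 20;                   /* hypothetical per-device variable */

int read_unit_reg(void)
{
    /* UNIT is expanded to 2 before PASTE sees it, giving reg2. */
    return XPASTE(reg, UNIT);
}
```

Calling PASTE(reg, UNIT) directly would produce the single token regUNIT
instead, because UNIT is not expanded before the paste.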