ms@security.UUCP (04/15/87)
Sorry if this has been discussed before in reference to X3J11 discussions, or other C problems encountered, but I haven't been reading this newsgroup very long.

I have a question concerning the expansion of a #define by cpp on a Sun 3/160 running UNIX 4.2BSD version 3.2 from Sun Microsystems. Consider the following macro (this is not the actual code used, but will serve as a typical example):

#define MACRO(first,last) (\
first\
_\
last)

The intent of the developers was that when cpp expands the code, the two macro arguments would be concatenated together into one token for the compiler, i.e: MACRO(holy,cow) yields (holy_cow). However our cpp under 3.2 insists on replacing the escaped newlines in the macro with spaces, i.e: (holy _ cow), which the compiler then spits out. I tried this same example on Sun 3/160 running 3.0, and a Vax 11/780 running 4.2BSD, and both expanded the macro without the spaces. A quick glance through K&R did not yield any insight into which expansion is correct, but I may have overlooked something.

So, my questions are:

Which is the correct expansion, or is it left to the cpp implementors?
Is there a problem with the (our?) Sun version 3.2 cpp?
What (if anything) does the new standard say about this?

Many thanks for any assistance you can provide.

Jay W. Davison (Mistress Account)
decvax!linus!security!ms
jwd@mitre-bedford.arpa
gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/16/87)
In article <2857@linus.UUCP> ms@security.uucp (Mistress Account) writes:
>The intent of the developers was that when cpp expands the code, the two
>macro arguments would be concatenated together into one token for the
>compiler, i.e: MACRO(holy,cow) yields (holy_cow).

Unfortunately C a la K&R did not provide any way to do such "token pasting". People using the Reiser CPP (found on most UNIX systems) typically resorted to the following trick:

    #define GLUE(a,b) a/**/b

However, this is not guaranteed to work and, indeed, is guaranteed NOT to work in X3J11-compliant compilers. The X3J11 invention for token pasting is:

    #define GLUE(a,b) a ## b

You will probably not find many compilers implementing this yet, since this part of the spec kept changing.
gemini@homxb.UUCP (04/17/87)
In article <5764@brl-smoke.ARPA>, gwyn@brl-smoke.UUCP writes:
> Unfortunately C a la K&R did not provide any way to do such
> "token pasting". People using the Reiser CPP (found on most
> UNIX systems) typically resorted to the following trick:
> #define GLUE(a,b) a/**/b
> However, this is not guaranteed to work and, indeed, is
> guaranteed NOT to work in X3J11-compliant compilers. The
> X3J11 invention for token pasting is:
> #define GLUE(a,b) a ## b
> You will probably not find many compilers implementing this
> yet, since this part of the spec kept changing.

And we of course have a third method here at AT&T. The preprocessor distributed with 4th generation make uses:

    #define GLUE(a,b) a\+b

Can you say non-compliant? I knew you could. Maybe X3J11 should consider BOTH latter methods, especially since I'm in a love/hate relationship with 4th generation make, and have GLUE(me,==nonportable)

Rick Richardson, PC Research, Inc: (201) 922-1134 ..!ihnp4!castor!pcrat!rick
when at AT&T-CPL: (201) 834-1378 ..!ihnp4!castor!polux!rer
guy%gorodish@Sun.COM (Guy Harris) (04/17/87)
>#define MACRO(first,last) (\
>first\
>_\
>last)
>
>Which is the correct expansion, or is it left to the cpp implementors?

The ANSI C standard indicates that backslash-newline should be completely stripped from source code fairly early in the translation process. This means that inserting blanks is incorrect; however, it also indicates that substituting for "first" and "last" is incorrect, because this macro definition should be treated identically to

    #define MACRO(first,last) (first_last)

>Is there a problem with the (our?) Sun version 3.2 cpp?

There is a problem with the System V "cpp", from which the 3.2 "cpp" is derived. There is a technique more likely to work on various versions of UNIX:

    #define MACRO(first,last) (first/**/_/**/last)

However, *this* is not guaranteed to work on all C implementations, either. Also note that *neither* technique will work with the version of the preprocessor used in many UNIX C implementations if you call the macro as

    MACRO(foo, bar)

since the blank in front of "bar" is considered part of the argument. Both of these facts argue against widespread use of this technique, since it isn't guaranteed to work and since it breaks if you make changes to the source code that one would think safe.

>What (if anything) does the new standard say about this?

It says you should write the macro like:

    #define MACRO(first,last) (first##_##last)

which will cause the "first", the "_", and the "last" to be glued together into one token. It also says (see the "debug" macro in the example on pages 80 and 81 of the October 1, 1986 draft) that blanks in the argument list should not be considered part of the argument. Of course, this is a draft standard, and is subject to change.
herndon@umn-cs.UUCP (Robert Herndon) (04/17/87)
In article <2857@linus.UUCP>, ms@security.uucp (Mistress Account) writes:
> #define MACRO(first,last) (\
> first\
> _\
> last)
> The intent of the developers was that when cpp expands the code, the two
> macro arguments would be concatenated together into one token for the
> compiler, i.e: MACRO(holy,cow) yields (holy_cow).

Try:

    #define CONCAT(first,last) first/**/_/**/last

I think this has worked for me on various suns -- the preprocessor does the expansion, stripping out the comments in the definition, leaving no spaces.

Robert Herndon
jbuck@epimass.UUCP (04/18/87)
In article <5764@brl-smoke.ARPA>, gwyn@brl-smoke.UUCP writes:
>> "token pasting". People using the Reiser CPP (found on most
>> UNIX systems) typically resorted to the following trick:
>> #define GLUE(a,b) a/**/b
>> However, this is not guaranteed to work and, indeed, is
>> guaranteed NOT to work in X3J11-compliant compilers. The
>> X3J11 invention for token pasting is:
>> #define GLUE(a,b) a ## b
>> You will probably not find many compilers implementing this
>> yet, since this part of the spec kept changing.

In article <229@homxb.UUCP> gemini@homxb.UUCP (Rick Richardson) writes:
>And we of course have a third method here at AT&T. The
>preprocessor distributed with 4th generation make uses:
> #define GLUE(a,b) a\+b
>Can you say non-compliant? I knew you could.

I use

    #define QUOTE(x) x
    #define GLUE(x,y) QUOTE(x)y

Well, it works for me, and it's prettier than a/**/b. Can't tell whether ANSI breaks this: I suppose a conforming compiler could "tokenize" things before the preprocessor is invoked (or there may be no separate preprocessor pass) -- but I like it.

--
- Joe Buck {hplabs,ihnp4,sun,ames}!oliveb!epimass!jbuck
seismo!epiwrl!epimass!jbuck
{pesnta,tymix,apple}!epimass!jbuck
gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/18/87)
In article <1062@epimass.UUCP> jbuck@epimass.UUCP (Joe Buck) writes:
>#define QUOTE(x) x
>#define GLUE(x,y) QUOTE(x)y
>
>Well, it works for me, and it's prettier than a/**/b. Can't tell
>whether ANSI breaks this...

You're assuming that preprocessing is done at a character level rather than a token level, but in fact it is more natural to do it at a token level since that's how macro names and arguments are treated anyway. A tokenizing preprocessor would have no reason to glue together adjacent tokens; that's why X3J11 invented an explicit operator for specifying token pasting. There are a lot of tokenizing C preprocessors already in existence.

By the way, I object to the frequent use of "ANSI breaks this". X3J11 is trying to establish standard meanings for previously underspecified parts of the language that have caused portability problems in the past. ANY assumption you have been making that is not true of ALL C implementations will cause your code to "break" when it's moved to another environment that had different interpretations of the rules. This is not X3J11's doing! It is unavoidable that the eventual ANSI specified C environment will differ in some way from virtually all existing implementations, since they have all come up with mutually incompatible flavors of the language. The hope is that in the long run the ANSI C environment will be both powerful enough and flexible enough to be provided by almost all C implementors and used by almost all C programmers (perhaps as a subset of a wider, system-dependent environment such as POSIX).
henry@utzoo.UUCP (Henry Spencer) (04/19/87)
> Which is the correct expansion, or is it left to the cpp implementors?

This is one of the (numerous) areas where K&R and such just were not quite explicit enough to provide solid guidance. The interface between cpp and the rest of the compiler is a real minefield, since historically cpp was a separate pass with very limited understanding of C syntax.

> Is there a problem with the (our?) Sun version 3.2 cpp?

No, its behavior is legitimate. The programmer who assumed that cpp would remove the backslashed newlines entirely and then combine things into one token was relying on an undocumented property of one particular cpp. There is *no* portable pre-ANSI way to get this effect.

> What (if anything) does the new standard say about this?

It may be kosher in X3J11, given their slightly-expanded view of the meaning of a backslashed newline, but I'd have to study the draft standard very carefully to be sure. I suspect that the backslashed newlines drop out at once, so the thing becomes one token *before* macro substitution, so it still doesn't work. They have defined a way to get the desired effect, but with different and more explicit syntax.

--
"If you want PL/I, you know       Henry Spencer @ U of Toronto Zoology
where to find it." -- DMR         {allegra,ihnp4,decvax,pyramid}!utzoo!henry
neville@ads.arpa (04/21/87)
There is a potentially non-portable preprocessor feature that most of the GLUE(a,b) macros that have been suggested suffer from. Most such macros that people use look like

    #define PASTE_IT(left,right) left/**/right

What some people may not realize is that comments are sort of "doubly-defined" in that they are defined as part of the C language itself, but most C preprocessors go ahead and strip comments themselves. There is no reason that I can see to expect that all preprocessors will do this. If a C comment gets passed through to the *compiler*, what happens? You just can't count on hacks like this.

-neville