gwyn@Brl-Vld.ARPA (VLD/VMB) (02/22/85)
I got my copy of the Feb. 1985 "preliminary draft proposed ANS" for C yesterday and quickly scanned it for interesting changes. It is really in pretty good shape now, and many of the points that have been debated in INFO-C have been dealt with.

One new feature of general interest is the way strings are handled by the preprocessor; instead of quoting rules I'll just post the example from section C.8.2:

    #define debug(s, t) printf("x" # s "= %d, x" # t "= %s", \
                x ## s, x ## t)
    debug(1, 2)

results in ...

    printf("x1= %d, x2= %s", x1, x2)

Space around the # and ## tokens in the macro definition is optional.

Another interesting addition is trigraphs for characters not in the ISO 646 Invariant Code Set. Example: ??< for {.

The <stdarg.h> [like <varargs.h>] macros have been changed, for the better. And there are many other improvements both to the language and to the specification document.
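For anyone who wants to try the draft's example, here is a minimal sketch, assuming a compiler that implements # and ## the way they were later standardized. snprintf stands in for printf so the result can be inspected; the extra macro parameters buf and n, the helper format_debug, and the values given to x1 and x2 are additions for illustration and are not part of the draft's text.

```c
#include <stdio.h>

/* The draft's debug macro, writing into a caller-supplied buffer.
   "# s" turns the argument into a string literal; "x ## s" pastes
   the argument onto x to form a single identifier. */
#define debug(buf, n, s, t) snprintf(buf, n, "x" # s "= %d, x" # t "= %s", \
                                     x ## s, x ## t)

/* Hypothetical helper: formats the variables x1 and x2 through the macro. */
int format_debug(char *buf, size_t n)
{
    int x1 = 42;
    const char *x2 = "hello";

    /* Expands to snprintf(buf, n, "x1= %d, x2= %s", x1, x2) once the
       adjacent string literals have been concatenated. */
    return debug(buf, n, 1, 2);
}
```

Note that the s in "= %s" is left alone even though the macro has a parameter named s; only the explicit # and ## operators touch the arguments.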
arnold@ucsfcgl.UUCP (Ken Arnold%CGL) (02/27/85)
In article <8436@brl-tgr.ARPA> gwyn@Brl-Vld.ARPA (VLD/VMB) writes:
>One new feature of general interest is
>the way strings are handled by the preprocessor; instead of quoting
>rules I'll just post the example from section C.8.2:
>
>    #define debug(s, t) printf("x" # s "= %d, x" # t "= %s", \
>                x ## s, x ## t)
>    debug(1, 2)
>
>results in ...
>
>    printf("x1= %d, x2= %s", x1, x2)

Has no one on the committee yet gotten the point that changing the way this is done will require rewrites of a large amount of existing, working C code? Since the C preprocessors on 4.x BSD, System V, DECUS C, and many other systems have scanned strings in macros for parameter substitution since parameter macros were invented, dropping this feature B R E A K S   E X I S T I N G   C O D E. Lots of it. The standard committee should look at how C is used, and it is used with this feature in many places.

Look, I'm sure we all think certain existing things in C are ugly. But how many pieces of code would you like to rewrite which, say, relied on the fact that 'case's fall through unless told otherwise? As far as I can tell, the C language proper has not been broken on purpose this way. Why is it okay to break the preprocessor?

-- DON'T BREAK WORKING CODE --

--
Ken Arnold
=================================================================
Of COURSE we can implement your algorithm. We've got this Turing machine emulator...
gwyn@Brl-Vld.ARPA (VLD/VMB) (02/28/85)
People who are using undocumented behavior of the specific (Reiser) implementation of the C preprocessor are the only ones who will have a problem with a fully tokenized CPP. Their code is ALREADY broken, even though it appears to work on some systems. It should not be the purpose of the C standards committee to legitimatize existing illegal or nonportable use of C, but rather to define what usage IS portable. There are C preprocessors already in wide use (I once used one) that are built into the lexical analysis phase of the compiler. There is no good way to support this without defining CPP behavior in terms of tokens. The ugly kludges people have been using to date do not fit this model, and what we need is a SOLUTION. Asking everybody to code like people at Berkeley did is not a good solution!
henry@utzoo.UUCP (Henry Spencer) (03/03/85)
I fear your argument [suggesting that the undocumented features of the Reiser cpp should be standardized, because not doing so will break working code] is weak. If the register-assignment policy in the VAX C compiler ever changes seriously, a great deal of Berkeley code will break because of in-line assembler [retch]. Are you seriously suggesting that changes to the VAX compiler which affect register assignment should therefore be forbidden, and that every VAX C compiler from now until eternity should be required to do it the same (undocumented) way? Or worse, that every C compiler in the world should be required to interpret in-line VAX assembler the same way the VAX does, giving the same output? The fix for brain-damaged implementation-dependent code is to fix the code. -- Henry Spencer @ U of Toronto Zoology {allegra,ihnp4,linus,decvax}!utzoo!henry
minow@decvax.UUCP (Martin Minow) (03/04/85)
Gwyn@brl-tgr.arpa points out that some compilers perform the "cpp" functions as part of the general lexical analysis, and therefore need a tokenizing preprocessor. Thus the string replacement and concatenation techniques in the Reiser cpp don't work. Actually, they can be made to work fairly easily. Just consider the string as a series of tokens: "foo bar" becomes "foo" " " "bar" and do further processing on any token that looks like a formal parameter. Token concatenation using /**/ is just about as easy. On the other hand, the draft standard is probably more understandable and a bit easier to implement. Martin Minow decvax!minow
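The splitting described above can be sketched roughly as follows. This is only an illustration of the idea, not the Reiser algorithm or any shipped preprocessor: the function name subst_in_string, its signature, and the single formal/actual pair are assumptions made for the example. It walks the source text of a string body, treats each identifier-shaped run as a token, and replaces runs that match the formal parameter.

```c
#include <ctype.h>
#include <string.h>

static int is_ident_start(int c) { return isalpha(c) || c == '_'; }
static int is_ident_char(int c)  { return isalnum(c) || c == '_'; }

/* Copy the string body to out, replacing every identifier-shaped token
   equal to formal with actual. Returns 0 on success, -1 if out is too
   small to hold the result and its terminating NUL. */
int subst_in_string(const char *body, const char *formal,
                    const char *actual, char *out, size_t outsz)
{
    size_t flen = strlen(formal);
    size_t alen = strlen(actual);
    size_t used = 0;

    if (outsz == 0)
        return -1;

    while (*body != '\0') {
        size_t tok = 1;          /* length of the piece being examined */
        const char *src = body;  /* text to emit for that piece */
        size_t n = 1;            /* number of characters to emit */

        if (is_ident_start((unsigned char)*body)) {
            /* Extend the piece to cover one whole identifier. */
            while (is_ident_char((unsigned char)body[tok]))
                tok++;
            n = tok;
            if (tok == flen && strncmp(body, formal, flen) == 0) {
                src = actual;    /* the token is the formal parameter */
                n = alen;
            }
        }
        if (n >= outsz - used)   /* always leave room for the NUL */
            return -1;
        memcpy(out + used, src, n);
        used += n;
        body += tok;
    }
    out[used] = '\0';
    return 0;
}
```

Given the source text %d\n with formal d and actual x, this yields %x\n. The same walk also treats the n of \n as an identifier, so a formal parameter named n would be substituted there too, which is one of the hazards of doing replacement inside strings.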
arnold@ucsfcgl.UUCP (Ken Arnold%CGL) (03/05/85)
In article <8768@brl-tgr.ARPA> gwyn@Brl-Vld.ARPA (VLD/VMB) writes:
>People who are using undocumented behavior of the specific (Reiser)
>implementation of the C preprocessor are the only ones who will have
>a problem with a fully tokenized CPP. Their code is ALREADY broken,
>even though it appears to work on some systems. It should not be the
>purpose of the C standards committee to legitimatize existing illegal
>or nonportable use of C, but rather to define what usage IS portable.

Let us say we have a preprocessor command

    # define FOO(d,e) printf("%d\n", e)

Now, there are two ways this can be handled on current implementations

    (1) FOO(x,10) becomes printf("%x\n", 10)
    (2) FOO(x,10) becomes printf("%d\n", 10)

There are two questions to be asked. (A) which way is more common, and (B) which way is more useful. I will concede that, since there are ways that (B) can be answered without using method (1), it is not relevant to this question, so I will just deal with (A).

Most people who have objected to style (1) have stated that it is undocumented and/or illegal because it is not in K&R, and therefore should not be codified by the standards committee. But this is, quite simply, preposterous. Consider the following equivalent statement:

    "enum"s are not in K&R. They must, therefore, be an illegal extension, and should not be codified by the standards committee.

Now, who out there subscribes to that? I know several implementations of C which don't have enums, just as gwyn knows a preprocessor that doesn't use style (1) (the Reiser style). But why should that make a difference?

    MOST implementations use enums. MOST implementations use them identically. enums came from Bell Labs, and have become de facto standard, and therefore SHOULD be codified by the committee.

Now, reread that statement thusly:

    MOST implementations use style (1). MOST implementations use it identically. Reiser's preprocessor came from Bell Labs, and has become de facto standard, and therefore SHOULD be codified by the committee.

If this substitution makes no sense to you, please tell me why. It seems quite logical to me. I have yet to see a reason for excluding the Reiser style which does not exclude enums.

--
Ken Arnold
=================================================================
Of COURSE we can implement your algorithm. We've got this Turing machine emulator...
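The two outcomes for FOO can be reproduced on a tokenizing preprocessor with a short sketch like the one below. It assumes a compiler with the # operator from the draft; snprintf replaces printf so the output can be checked, and the names FOO_SUB, foo_plain, foo_subst, buf and len are hypothetical additions for the example.

```c
#include <stdio.h>

/* Tokenized preprocessing: the d inside the string literal is not a
   token, so the parameter d goes unused and the format stays "%d\n".
   This is outcome (2). */
#define FOO(buf, len, d, e)      snprintf(buf, len, "%d\n", e)

/* Outcome (1) has to be requested explicitly: # d turns the argument
   into a string literal that is concatenated into the format. */
#define FOO_SUB(buf, len, d, e)  snprintf(buf, len, "%" # d "\n", e)

/* FOO(x,10): the x is dropped, the value is printed in decimal. */
int foo_plain(char *buf, size_t len)
{
    return FOO(buf, len, x, 10);
}

/* FOO_SUB(x,10u): the format becomes "%x\n", the value is printed in hex.
   The argument is written 10u because %x expects an unsigned int. */
int foo_subst(char *buf, size_t len)
{
    return FOO_SUB(buf, len, x, 10u);
}
```

With these definitions foo_plain produces "10\n" and foo_subst produces "a\n", so both behaviours remain available; the difference is that substitution into a string no longer happens implicitly.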
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (03/06/85)
Ken Arnold appears to have missed my point, which I perhaps did not sufficiently emphasize: The ANSI C standard wants to support TOKENIZED preprocessing. The character-string based kludges possible with the Reiser CPP do not fit into this concept.
donn@utah-gr.UUCP (Donn Seeley) (03/07/85)
Henry, I fear your argument by analogy is weak. Did it ever occur to you that implementations of Berkeley Unix exist on many computers other than the VAX? (Sun, NBI, NSC, Tektronix, Pyramid, Convex, ...) The Reiser preprocessor undoubtedly followed most if not all of these ports with little alteration; the inline assembler code did not. (If inline assembler is protected by preprocessor conditional compilation directives, as it really should be, porting is no problem either, although the generic code will often run slower than customized assembler code. Register assignment conventions such as the ones Henry discusses have not perceptibly impeded the porting of Berkeley Unix, although I agree they are probably unnecessary.) The C preprocessor supplied with System V is also Reiser-based and appears to contain the same extensions (our sole System V machine is dead with disk problems but a quick study of the sources supports my conclusion). The Reiser preprocessor is part of a portable software environment that is used in many places; in fact it would not surprise me to hear that the vast majority of C programmers use the Reiser preprocessor, although I have no statistics on this. The Reiser preprocessor's string substitution behavior is clearly one of the most popular C extensions, and for that reason alone deserves careful consideration. I find it more than a little odd that it is given such short shrift by some participants in this discussion when rather more radical proposals such as case ranges are being casually bandied about. My own feeling is that the ANSI C standard is not the place to make C perfect; C (including its common extensions) has enough imperfections that it seems pointless to break existing code just to fix a tiny fraction of them. I suggest that people concentrate their corrective impulses on a new language, such as C++. 
Try to guess how I feel about Fortran 77, Donn Seeley University of Utah CS Dept donn@utah-cs.arpa 40 46' 6"N 111 50' 34"W (801) 581-5668 decvax!utah-cs!donn
henry@utzoo.UUCP (Henry Spencer) (03/07/85)
> Let us say we have a preprocessor command
>
>     # define FOO(d,e) printf("%d\n", e)
>
> Now, there are two ways this can be handled on current implementations
>
>     (1) FOO(x,10) becomes printf("%x\n", 10)
>     (2) FOO(x,10) becomes printf("%d\n", 10)
>
> MOST implementations use style (1). MOST implementations use it
> identically....

Please cite your justification for your use of the word "MOST" in this connection. What you say was probably true five years ago. It isn't now. The majority of current C implementations probably are *not* derived from Bell code, and therefore do not incorporate its eccentricities. C is no longer the near-exclusive property of Unix users, and the Unix implementations are now (I think) in the minority. Not because there aren't a lot more Unix implementations now, but because there are a LOT more non-Unix implementations. MANY, perhaps most, implementations of C do NOT use the Reiser cpp.

--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
Purtill@MIT-MULTICS.ARPA (Mark Purtill) (03/08/85)
<fnord>
>The ANSI C standard wants to support TOKENIZED preprocessing.
>The character-string based kludges possible with the Reiser CPP
>do not fit into this concept.

I don't see what the problem is. You have to deal with strings specially anyway to handle \n, et al., right? So, at the same time, divide it up into tokens and see if any match the macro parameters. Note that you only have to do this if the string is in a parametered macro definition, so the overhead shouldn't be too bad.

Anyway, it's not clear that the standard should not include something only because it might be hard to implement in a certain manner.

Mark
arnold@ucsfcgl.UUCP (Ken Arnold%CGL) (03/10/85)
In article <8986@brl-tgr.ARPA> gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) writes:
>The ANSI C standard wants to support TOKENIZED preprocessing.
>The character-string based kludges possible with the Reiser CPP
>do not fit into this concept.

They can want anything they choose to, but C currently exists. My arguments are based, generally, on the precept "thou shalt not break existing code". I, too, WANT C to work in some ways differently than currently, but I'd fight anyone trying to put the changes I want into the standard.

--
Ken Arnold
=================================================================
Of COURSE we can implement your algorithm. We've got this Turing machine emulator...
henry@utzoo.UUCP (Henry Spencer) (03/10/85)
> >The ANSI C standard wants to support TOKENIZED preprocessing.
> >The character-string based kludges possible with the Reiser CPP
> >do not fit into this concept.
>
> I don't see what the problem is. You have to deal with strings
> specially anyway to handle \n, et al., right? So, at the same time,
> divide it up into tokens and see if any match the macro parameters.
> Note that you only have to do this if the string is in a parametered
> macro definition, so the overhead shouldn't be too bad.

The trouble is that these two operations (escape processing and macro handling) often take place at quite different times. I know of at least one compiler (admittedly, an obscure one) which doesn't really deal with string escapes until code-generation time. Others deal with them right away, as the string is being scanned, well before anyone knows that it's part of a macro. Nobody's claiming that the Reiser things are impossible, just that they are awkward to do, unclean in concept, undocumented, and not widely accepted in non-Unix implementations.

> Anyway, it's not clear that the standard should not include something
> only because it might be hard to implement in a certain manner.

<begin cynical, not-entirely-serious tone>
Why not? That's the way C has developed so far. Why change a winning way?
<end tone>

To my mind, the "undocumented", "unclean in concept", and "not widely accepted in non-Unix implementations" parts are rather more significant.

--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
henry@utzoo.UUCP (Henry Spencer) (03/10/85)
> ... your argument by analogy is weak. Did it ever occur to
> you that implementations of Berkeley Unix exist on many computers other
> than the VAX? (Sun, NBI, NSC, Tektronix, Pyramid, Convex, ...) The
> Reiser preprocessor undoubtedly followed most if not all of these ports
> with little alteration; the inline assembler code did not...
> ... Register assignment conventions such as the ones Henry
> discusses have not perceptibly impeded the porting of Berkeley Unix,
> although I agree they are probably unnecessary.

This misses my point somewhat. My contention was that the existence of code that exploits obscure undocumented features ("obscure undocumented features are effectively bugs", remember) is not a valid reason for insisting that said features be cast in concrete, and that arguments to the contrary can be applied to VAX asm() just as easily as to Reiser-style preprocessing.

In fact, your comments strengthen my argument: I strongly suspect that unprotected inline assembler is more common in 4BSD than use of Reiser features. If the former has not seriously impeded portability to machines where it doesn't work, why should the preservation of the latter be such a concern?

> ... The Reiser
> preprocessor is part of a portable software environment that is used in
> many places; in fact it would not surprise me to hear that the vast
> majority of C programmers use the Reiser preprocessor, although I have
> no statistics on this.

Statistics on this wouldn't really be relevant. The relevant question is, how many programmers *care* whether their preprocessor has the Reiser features or not? I suspect that the number is modest, especially since the Reiser features are undocumented.

> The Reiser preprocessor's string substitution behavior is clearly one
> of the most popular C extensions...

"Clearly?" Surely it should be clear to me too, then. It's not. Certainly string substitution is not popular among the C programmers I know, partly because it is shunned as an unportable accident of a particular implementation.

> and for that reason alone deserves
> careful consideration. I find it more than a little odd that it is
> given such short shrift by some participants in this discussion when
> rather more radical proposals such as case ranges are being casually
> bandied about.

First, how do you know it wasn't given careful consideration? My impression is that the discussion about the Reiser features consumed a significant fraction of the language subcommittee's time for months.

Second, there is a lot of casual bandying-about of silly features whenever a language is being standardized; one of the crosses that standards committees have to bear is that they have to shoot down all the whacko ideas that crawl out of the woodwork. Case ranges are by no means the worst of it. And by the way, case ranges have not survived the latest revision of the C draft (11 Feb 1985).

> My own feeling is that the ANSI C standard is not the place to make C
> perfect; C (including its common extensions) has enough imperfections
> that it seems pointless to break existing code just to fix a tiny
> fraction of them. I suggest that people concentrate their corrective
> impulses on a new language, such as C++.

This is actually a fairly good statement of one of the committee's goals. The question is not whether the Reiser features are imperfect, but whether they are part of C at all. They occur in only a few of the numerous C implementations, although admittedly one of those is very influential and has been widely ported. They do not appear in K&R, the closest thing we had to a written standard before. (In fact, a strict reading of K&R makes the Reiser features illegal.) Are they part of the language, or just an aberrant feature of one implementation that was copied by one or two others?

This is most definitely the sort of question that is the business of a standards committee. I'm sure that at least half of the committee members wish, by now, that it wasn't.

--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
donn@utah-gr.UUCP (Donn Seeley) (03/15/85)
Re Henry Spencer's latest comments... Henry, I'm afraid your response misses my point somewhat. Programs that use Reiser preprocessor features will run under most major species of Unix; programs that use inline assembler are lucky to run anywhere. As for documentation, postings in this group this week have demonstrated that the string substitution feature was documented at AT&T/Bell. And of course there are legions of programmers who learned C by imitation and picked up features not in K&R that were used by more sophisticated (less scrupulous?) programmers. To insist that programmers, and more importantly, programs, somehow forget this is fighting the tide. I'm not particularly persuaded by your claims of innocence with regard to the string substitution feature. I have never used the feature either, but it's not my code I'm worried about, it's the other guy's. Pureness of heart will not save you from having to maintain 10 megabytes of source for some utility which YOU DIDN'T WRITE! I used to be arrogant too (and still am if I can get away with it) -- if I saw code that wasn't written adhering to my high standards of style, onto the heap it went. Perhaps it's my old age, but I've mellowed considerably and no longer look forward to months of beating someone else's code into shape and getting it to work right again once I've torn it apart and rebuilt it. My experience with attempting to maintain part of the Unix Fortran compiler has led me to take the position that, as well-intentioned and as thorough as the ANSI Fortran committee was, Fortran 77 is a disaster. The standard broke and continues to break programs. Writing a program that will run under all of the various existing compilers is a nightmare, because the committee failed to include some of the most common extensions to the language. 
The only program I have ever seen that even attempts to be truly portable is the XTAL suite from the University of Maryland, and I have great admiration for the implementors and I'm very happy that I didn't have to do what they did (which is write everything in RATFOR, supplying a complete RATFOR compiler with every distribution, and parameterizing absolutely everything with macros). Just this last week I got mail from someone who wants to be able to assign character strings to non-character variables. This extension is present in most Fortrans but is ruled out by the standard. This person has a huge set of utilities which would be extremely difficult to modify to suit full Fortran 77, and if I don't make the change, either (a) someone else will do it or (b) they will simply forget about porting their utility to Unix. What right have I to refuse? We may not personally like or use Reiser preprocessor extensions, but what right have we to break programs that use them? (Maybe I should rephrase that -- why should we who have never used or needed features like token replacement in strings dictate to those who do?) I'm afraid Henry has misread my comments about the discussion of the preprocessor -- I was referring to the net when I said that some participants were not treating the issue seriously, not to the committee. I did read Larry Rosler's comments on the committee proceedings and I thought they were very interesting. I'm very worried that the C standard is going to turn into something bizarre, despite Larry's best efforts. We have already heard Dennis Ritchie's comments on the issue of strong typing in C and (I may be mistaken, I don't have a copy of the current draft) it appears that they are being ignored. The C preprocessor extensions strike me as being just as ugly and bogus as the Reiser extensions, with the quibble that the Reiser extensions at least appear in a set of widely used C implementations. What in the world are we going to end up with? 
As long as we're considering case ranges, how about COMMON statements,

Donn Seeley University of Utah CS Dept donn@utah-cs.arpa
40 46' 6"N 111 50' 34"W (801) 581-5668 decvax!utah-cs!donn

PS -- I used to have Alan Watt's wonderful 'TTI C formatting standard' article pinned to my wall, but the ink faded out. Guess I'll have to print another copy... Maybe I'll have it framed this time.
barmar@mit-eddie.UUCP (Barry Margolin) (03/19/85)
In article <1380@utah-gr.UUCP> donn@utah-gr.UUCP (Donn Seeley) writes:
>We may not personally like or use Reiser preprocessor extensions, but
>what right have we to break programs that use them? (Maybe I should
>rephrase that -- why should we who have never used or needed features
>like token replacement in strings dictate to those who do?)

On the other hand, what right do we have to break programs that AREN'T expecting these incompatibilities? The big problem with this controversial issue is that there is no way to standardize it such that it is compatible for everyone. Several posters have already included simple examples that do not use the in-string replacement feature and will break if compiled with this feature. I think it goes something like

    #define MACRO(d) printf("%d", d)

I think that the standard committee made the right choice in their compromise; it provides the facility, but in an upward-compatible fashion.

--
Barry Margolin
ARPA: barmar@MIT-Multics
UUCP: ..!genrad!mit-eddie!barmar
henry@utzoo.UUCP (Henry Spencer) (03/21/85)
I'll try to be brief, I'm really sick of this issue...

> ... Programs
> that use Reiser preprocessor features will run under most major species
> of Unix...

Of course, this has little to do (nowadays) with what fraction of C compilers they will run under. This is net.lang.c, not net.unix, remember?

> As for documentation, postings in this group this week have
> demonstrated that the string substitution feature was documented at
> AT&T/Bell.

But nowhere else. This is net.lang.c, not net.unix or att.lang.c...

> I'm not particularly persuaded by your claims of innocence with regard
> to the string substitution feature. I have never used the feature
> either, but it's not my code I'm worried about, it's the other guy's.
> Pureness of heart will not save you from having to maintain 10
> megabytes of source for some utility which YOU DIDN'T WRITE!

I would sympathize with this more if the problem were hard. It's not; detection of Reiserisms, and conversion to the ANSI-draft primitives, can be purely mechanical. This is not like (say) multiple external definitions, where there is no simple mechanical way to convert.

> We may not personally like or use Reiser preprocessor extensions, but
> what right have we to break programs that use them? (Maybe I should
> rephrase that -- why should we who have never used or needed features
> like token replacement in strings dictate to those who do?)

Speaking as someone who intends to implement this swill in a compiler, I think I do have a right to object to it.

> The C preprocessor extensions strike me as being just as ugly and bogus
> as the Reiser extensions, with the quibble that the Reiser extensions
> at least appear in a set of widely used C implementations. What in the
> world are we going to end up with?

I agree about ugliness and bogusness, and my #1 preference would be to see token concatenation and stringizing eliminated completely, as a regrettable aberration of one particular implementation. Alas, it is not to be... At least the argument is merely over how to do something that's already (ugh) accepted as part of C; this isn't quite the same as gratuitous addition of new goodies, however well-intended.

--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
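As a rough illustration of what such a mechanical conversion might look like, the sketch below keeps two old-style idioms under #if 0 for comparison and gives draft-style replacements. The macro names CAT and SHOW, the helper show_counter, and the use of snprintf with buf and len are assumptions made for the example; they are not taken from any of the code discussed in this thread.

```c
#include <stdio.h>

#if 0   /* old idioms, kept for comparison only and never compiled */
#define CAT(a, b) a/**/b                  /* relied on the comment vanishing */
#define SHOW(v)   printf("v = %d\n", v)   /* relied on in-string replacement */
#endif

/* Draft-style equivalents: ## pastes tokens, # makes a string literal. */
#define CAT(a, b)          a ## b
#define SHOW(buf, len, v)  snprintf(buf, len, #v " = %d\n", v)

/* Hypothetical helper exercising both macros. */
int show_counter(char *buf, size_t len)
{
    int counter1 = 7;

    CAT(counter, 1) += 1;             /* pastes to the identifier counter1 */
    return SHOW(buf, len, counter1);  /* format becomes "counter1 = %d\n" */
}
```

Each old form maps onto exactly one new operator, which is what makes a search-and-rewrite pass plausible; whether every real-world use is this regular is a separate question.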
donn@utah-cs.UUCP (Donn Seeley) (03/26/85)
I agree with Henry that this issue is thoroughly beaten to death, but I am still so irritated and exasperated by his posting that I felt some small obligation to follow it up. Rather than repeat the points I made in my previous article (or amend the text in the response where I was quoted out of context) I will simply reprint an old article of Henry's and let the net speculate as to why Henry's attitude about supporting ugly extensions to C seems to differ so wildly from one article to the next. ------------------------------------------------------------------------ Date: 30 Jan 85 20:02:38 CST (Wed) From: cbosgd!ihnp4!utzoo!henry Subject: union initialization and paper proposals To: ihnp4!cbosgd!std-c I think the people responding to Larry Rosler's comments on union initialization are missing an important point. As far as I know, nobody contends that the first-member rule for union initialization is beautiful. The key point is: there is real live experience with this rule, with a real compiler and real customers. In other words, it is known to work in practice, not just theory. This is of considerable importance when something is to be enshrined in a standard. I'm not aware of comparable real-life use of any of the various other proposals. Careful contemplation is *not* the same thing. Standards committees have good reason for taking field-proven proposals much more seriously than untried ones. Avoiding subtle disasters is more important, for a standard, than maximizing beauty. This is not to say that I like first-member union initialization. Personally, I think it should have been "done right" (although I'm not sure just how to do that) or ignored. If I'd been a committee member, in the absence of field-proven "done right" solutions, I think I'd have voted for ignoring the whole issue. Oh well. 
Henry Spencer @ U of Toronto Zoology {allegra,ihnp4,linus,decvax}!utzoo!henry ------------------------------------------------------------------------ Need I say that I agree completely with Henry's reasoning in this article? I promise not to drag this out any further, Donn Seeley University of Utah CS Dept donn@utah-cs.arpa 40 46' 6"N 111 50' 34"W (801) 581-5668 decvax!utah-cs!donn
henry@utzoo.UUCP (Henry Spencer) (03/29/85)
> ... I will simply reprint an old article of Henry's
> and let the net speculate as to why Henry's attitude about supporting
> ugly extensions to C seems to differ so wildly from one article to the
> next.
> [There follows my article saying "first-member union initialization
> is preferable to 'doing it right' because there is implementation
> experience with first-member and none with any of the various 'doing
> it right' schemes".]

I find it curious that Donn should cite this as a contradiction, or a wild difference, from my articles saying "the Reiser preprocessor extensions are vile and unportable, and should not be considered part of C for ANSI purposes". There is no contradiction at all, although just why depends on how you resolve an ambiguity in Donn's comment:

1. "Henry supports first-member and not Reiser". If you read the article carefully, you will see that in the (alas) absence of field-proven done-right solutions to union initialization, my preference would have been to leave it alone, i.e. nonexistent. I.e., I don't support first-member, although I understand why ANSI chose it rather than a paper proposal.

2. "Henry opposes 'done right' union initialization because there is no experience, and supports ANSI preprocessor features despite lack of experience". The second part of that sentence is dead wrong, since I think the ANSI versions of token concatenation and in-string substitution only slightly less vile than the old Reiserisms, and think both should be stamped out like cockroaches. I have said so on the net. I support neither.

Either way, no contradiction except in Donn's imagination.

I apologize for bothering the net with this, but public slander requires public response. Further debate on this should go by private mail.

--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry