mjv@objects.mv.com (Michael J. Vilot) (11/22/90)
Henry Spencer ``hoisted a storm warning'' about the dangers of standards committees who invent language features, particularly when they ignore the experience gained through ``prior art.'' I agree with him about the potential danger. However, as Bjarne pointed out, there is little evidence of it in the current membership of the X3J16 committee (OK, there is some, but they're in the minority ;-).

I'd like to contribute a couple of thoughts.

First, the example of the `noalias' episode of X3J11 was indelibly impressed upon the members of X3J16 who attended the March meeting. The sentiment to avoid a repetition is high on the list of reasons I've heard cited for resisting gratuitous inventions. On the other hand, there seems to be sincere desire to do better than trigraphs as a way to satisfy the legitimate needs of national character sets.

Second, we face a difficult situation when specifying the components of the C++ standard library. Most of us, as C++ users, have used AT&T's `cfront' or a derivative. That means that `streams' (1.2), `iostreams' (2.0), `complex' and (in some cases) `tasks' constitute the bulk of ``prior art'' in the area of a standard library for C++ (I consider InterViews, NIHCL, libg++, and others as libraries distinct from the standard -- which might be an interesting thread).

The availability of templates and exceptions has a substantial impact on how I design libraries in C++. I would hope that the library portion of the C++ standard would make the best use of the language. Yet we have little ``prior art'' in libraries using these features -- particularly the I/O classes. On the other hand, we should not gratuitously invalidate the existing C++ code using streams.

It's a difficult design challenge -- I hope you will contribute your thoughts.

--
Mike Vilot, ObjectWare Inc, Nashua NH
mjv@objects.mv.com (UUCP: ...!decvax!zinn!objects!mjv)
henry@zoo.toronto.edu (Henry Spencer) (11/24/90)
In article <1016@zinn.MV.COM> mjv@objects.mv.com (Michael J. Vilot) writes:
>...there seems to be sincere desire to do better than trigraphs as a way to
>satisfy the legitimate needs of national character sets.

I would have hoped that X3J16 would not be re-hashing all the dumb ideas that X3J11 carefully considered and carefully rejected for good reasons. However, given that Bjarne was one of the handful of people pushing this specific dumb idea, I suppose I should have expected it...

The right answer to national character sets is ISO Latin 1 or equivalent, not ridiculous contortions in language syntax that *every* compiler *everywhere* then has to be able to parse. Trigraphs were a mistake.

Remind me to submit a proposal to X3J16 to change C++ so that it can be typed using only the intersection of a Model 26 keypunch and an ASR-33.

>The availability of templates and exceptions has a substantial impact on how I
>design libraries in C++. I would hope that the library portion of the C++
>standard would make the best use of the language. Yet we have little ``prior
>art'' in libraries using these features ...

Hmm. Now that is a sticky problem. I fear the obvious answer is to try to produce upward-compatible extensions, so that existing code works but newer code can take advantage of the new facilities. Awkward. See what you get when you start adding language features? :-) :-) :-)

This sort of thing actually did come up a little bit in the C library, for example in the type of the parameter to ctime(). X3J11 opted not to mess with historical practice. But they weren't facing a problem anywhere near the size of this one.

--
"I'm not sure it's possible | Henry Spencer at U of Toronto Zoology
to explain how X works."    | henry@zoo.toronto.edu utzoo!henry
domo@tsa.co.uk (Dominic Dunlop) (11/26/90)
In article <1990Nov23.211727.2802@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
> In article <1016@zinn.MV.COM> mjv@objects.mv.com (Michael J. Vilot) writes:
> >...there seems to be sincere desire to do better than trigraphs as a way to
> >satisfy the legitimate needs of national character sets.
>
> I would have hoped that X3J16 would not be re-hashing all the dumb ideas
> that X3J11 carefully considered and carefully rejected for good reasons.
> However, given that Bjarne was one of the handful of people pushing this
> specific dumb idea, I suppose I should have expected it...
>
> The right answer to national character sets is ISO Latin 1 or equivalent,
> not ridiculous contortions in language syntax that *every* compiler
> *everywhere* then has to be able to parse. Trigraphs were a mistake.

Yes. Strange, isn't it, that the Danes are so in love with seven-bit character sets? We've seen it in C, we're seeing it in POSIX. Looks as though it's making trouble in C++.

I'm sorry if that sounds like an ethnic slur, but, as Henry says, equipment which talks using an 8-bit character set such as ISO Latin 1 is an obvious (minimum) requirement for program development. Any software shop which shackles its staff to inadequate hardware deserves every bit of productivity that it fails to get out of them.

Those who suggest that the standards community should spend its time on grandfathering in support for coded character sets already superseded because of their clear inadequacies stand in grave danger of fashioning standards for the past, not for the future. While there might have been a case for doing this with C, a language which, because of the time of its development, had a number of dependencies on a particular seven-bit coded character set (ASCII), it seems to me to be counter-productive to expend much effort on providing support for variants of that old character set in a new language -- C++.

I hope that was clear. Now let me muddy it a bit.
While I can see no reason for the development of C++ software to be carried out on inadequate hardware, it may be that the resulting programs have to support an installed base of inadequate hardware. Such is life.

I'm talking about cross-development tools running on new hardware, but churning out binary code for old. No reason why the problem shouldn't be solved that way. Hell, it's not a new idea. (Whereas trigraphs were -- and a poor one, at that.)

How long has COBOL had an environment division? Not that anybody uses it much, I'll grant you. But then, like trigraphs, maybe it just seemed a good idea at the time...

--
Dominic Dunlop
tom@ssd.csd.harris.com (Tom Horsley) (11/27/90)
domo> I'm sorry if that sounds like an ethnic slur, but, as Henry says,
domo> equipment which talks using an 8-bit character set such as ISO Latin 1
domo> is an obvious (minimum) requirement for program development. Any
domo> software shop which shackles its staff to inadequate hardware deserves
domo> every bit of productivity that it fails to get out of them.
I dunno... Maybe trigraphs were a good idea, they are not too hard to
implement in a compiler, but they are absolutely *miserable* to use. Maybe
the idea was to make using them hurt so much that people would upgrade their
obsolete systems? :-):-):-)
--
======================================================================
domain: tahorsley@csd.harris.com USMail: Tom Horsley
uucp: ...!uunet!hcx1!tahorsley 511 Kingbird Circle
Delray Beach, FL 33444
+==== Censorship is the only form of Obscenity ======================+
| (Wait, I forgot government tobacco subsidies...) |
+====================================================================+
steve@taumet.com (Stephen Clamage) (12/02/90)
tom@ssd.csd.harris.com (Tom Horsley) writes:
>Maybe trigraphs were a good idea, they are not too hard to
>implement in a compiler, but they are absolutely *miserable* to use.

Trigraphs are not all that easy to implement efficiently, either; they really do slow down the compiler. The scanner, which dominates compiler front-end time, requires 3-character lookahead, not to mention complicating the interpretation of end-of-line for finding the ends of macros.

Our original straightforward implementation of trigraphs caused a 15% slowdown of the compiler front end. We spent quite a bit of time finding an efficient way to handle them, and reduced the overhead to about 5%. Please note this affects every program ever compiled, even ones which contain no trigraphs.

--
Steve Clamage, TauMetric Corp, steve@taumet.com
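For readers who have not met the mechanism being argued over, a ``straightforward'' trigraph pass of the kind Steve alludes to might look roughly like the sketch below. It is not TauMetric's code. The function names trigraph_value and replace_trigraphs are invented for illustration; the nine trigraph pairs are those defined by ANSI C. The point to notice is that every input character pays for the '?' comparison, whether or not the file contains any trigraphs. The code avoids writing two adjacent question marks so that a compiler with trigraphs enabled does not rewrite the example itself.

```c
#include <stddef.h>

/* Map the third character of a candidate trigraph to its replacement,
   or return 0 if it does not complete one of the nine ANSI C trigraphs.
   (Hypothetical helper name.) */
static char trigraph_value(char third)
{
    switch (third) {
    case '=':  return '#';
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Naive pass: copy the NUL-terminated src into dst, replacing each
   trigraph with its single-character value. dst must have room for
   strlen(src) + 1 bytes. Returns the length written, excluding the NUL. */
size_t replace_trigraphs(const char *src, char *dst)
{
    size_t out = 0;

    while (*src != '\0') {
        char r;
        /* Look ahead two more characters whenever we see a '?'. */
        if (src[0] == '?' && src[1] == '?' &&
            (r = trigraph_value(src[2])) != 0) {
            dst[out++] = r;
            src += 3;
        } else {
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
    return out;
}
```

Run as a separate pass like this, the whole source is copied once more before tokenizing even begins, which is one plausible source of the kind of overhead described above.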
tom@ssd.csd.harris.com (Tom Horsley) (12/02/90)
>>>>> Regarding Re: design by committee (was: templates and exceptions in g++?); steve@taumet.com (Stephen Clamage) adds:

steve> Our original straightforward implementation of trigraphs
steve> caused a 15% slowdown of the compiler front end. We spent quite a bit
steve> of time finding an efficient way to handle them, and reduced the
steve> overhead to about 5%. Please note this affects every program ever
steve> compiled, even ones which contain no trigraphs.

I don't want to sound too insulting here, but I would say you have a seriously flawed design. I worked on an ANSI C scanner as a sort of academic exercise while trying to fully understand the way the macro processor works, and my scanner has no additional overhead to speak of even if you do use trigraphs.

The key to making this work fast is recognizing that you have to examine each character in the buffer to classify it as you go along anyway. I used a <ctype.h>-like array that marked "interesting" characters and embedded the check in a getc()-like macro. The macro normally returns the next character using inline code, but if an interesting character shows up it calls a subroutine to do additional processing.

A '\0' character is interesting because I might have to re-fill the buffer, and a '\\' character is interesting because it might be followed by a newline and both of them will have to be squeezed out (remember that a backslash followed by a newline has always been a special sequence you had to check for even before question-mark question-mark came along - the overhead for trigraphs is no worse than this).

With trigraphs, '?' is now also an interesting character. Sticking an extra check for the ?? trigraph sequence in the subroutine that is only invoked when an interesting character comes along does not cost that much extra (unless you have a LOT of question marks in your source code).
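The scheme described above can be sketched roughly as follows. This is an illustration written under assumptions, not the scanner from the post: the names Scanner, scanner_init, decode_trigraph, slow_next and NEXT_CHAR are all hypothetical, the whole input is assumed to sit in one NUL-terminated buffer, and the refill step is left out (end of buffer simply yields -1). The fast path is a single table lookup per character; '?' and backslash handling live entirely in the slow path.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical scanner state: the whole input in one NUL-terminated buffer. */
typedef struct {
    const char *cur;                 /* next unread character */
} Scanner;

/* Nonzero for characters that need the slow path. */
static unsigned char interesting[UCHAR_MAX + 1];

void scanner_init(Scanner *s, const char *text)
{
    interesting[0] = 1;                      /* end of buffer (refill point) */
    interesting[(unsigned char)'\\'] = 1;    /* possible backslash-newline   */
    interesting[(unsigned char)'?'] = 1;     /* possible trigraph            */
    s->cur = text;
}

/* Third trigraph character -> replacement, or 0 if not a trigraph. */
static int decode_trigraph(int third)
{
    switch (third) {
    case '=':  return '#';
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Slow path, reached only when the table flags the current character.
   Returns the next logical character, or -1 at end of input. */
int slow_next(Scanner *s)
{
    for (;;) {
        const char *p = s->cur;
        int c = (unsigned char)p[0];

        if (c == '\0')
            return -1;               /* a real scanner would refill here */

        if (c == '?' && p[1] == '?') {
            int r = decode_trigraph((unsigned char)p[2]);
            if (r != 0) {
                if (r == '\\' && p[3] == '\n') {
                    s->cur = p + 4;  /* trigraph backslash + newline: splice */
                    continue;
                }
                s->cur = p + 3;
                return r;
            }
        }
        if (c == '\\' && p[1] == '\n') {
            s->cur = p + 2;          /* ordinary line splice */
            continue;
        }
        s->cur = p + 1;              /* ordinary character after all */
        return c;
    }
}

/* Fast path: one table lookup, then an inline fetch for ordinary text. */
#define NEXT_CHAR(s) \
    (interesting[(unsigned char)*(s)->cur] \
        ? slow_next(s) \
        : (unsigned char)*(s)->cur++)
```

With this shape, a source file with no question marks and no backslashes never leaves the inline branch of NEXT_CHAR until the buffer runs out, which matches the claim that the added cost of trigraphs is proportional to how many '?' characters actually appear.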
The tricky part is making sure you go ahead and fill the buffer if you are within 4 characters of the end and handling the case of a line terminated by ??/ followed by a newline.

When I do find something like a trigraph or a \ newline, I squeeze them out and replace them with what really belongs there. The routine knows where the current token starts in the buffer, so it just shifts it right to take up the slack, then it returns the proper character and scanning continues normally.

This allows me to handle the phases of translation which process trigraphs and backslash newlines transparently in the GetNextCharacter macro while I am also busting up the source into tokens. I can also leave the tokens in the input buffer without wasting the time copying them around unless I have to do something like squeeze out a trigraph.
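The ``shift it right to take up the slack'' step might look something like the sketch below. The function name squeeze and its parameters are hypothetical, and the buffer layout (token start and current position as indices into one char array) is an assumption; the post does not give these details. Only the part of the current token already scanned is moved, so the cost is bounded by the token length rather than the buffer length.

```c
#include <stddef.h>
#include <string.h>

/* Replace the width-byte sequence starting at buf[pos] with the single
   character repl, or with nothing when repl is 0 (a line splice), by
   sliding the already-scanned token prefix buf[tok_start .. pos) to the
   right. Returns the new index of the token start.

   When repl is nonzero it ends up at index pos + width - 1, directly
   after the moved prefix; in both cases scanning resumes at pos + width. */
size_t squeeze(char *buf, size_t tok_start, size_t pos,
               size_t width, char repl)
{
    size_t keep = (repl != 0) ? 1 : 0;
    size_t slack = width - keep;     /* bytes that disappear */

    /* Regions overlap, so memmove rather than memcpy. */
    memmove(buf + tok_start + slack, buf + tok_start, pos - tok_start);
    if (keep)
        buf[pos + slack] = repl;
    return tok_start + slack;
}
```

Because the text after the squeezed sequence never moves, the token stays contiguous in the input buffer and can be handed on by pointer and length without a separate copy.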