rbutterworth@watmath.waterloo.edu (Ray Butterworth) (06/07/88)
Let's consider the various combinations of compilers and terminals. Commonly, either of these can be US-ASCII, 7-bit French-ASCII (or some other national character set), or 8-bit ISO-ASCII.
1) if I am using a US-ASCII terminal, I have the full C source character set at my fingertips and all three types of compilers must accept these characters according to the way they appear on my screen. Thus, I have no need for trigraphs.
2) similarly, if I am using an ISO-ASCII terminal, the keyboard will contain the full C source character set, and all three types of compilers must accept these characters. Thus, I still have no need for trigraphs.
3) finally, if I am using a 7-bit French-ASCII terminal, the situation is a little more complicated.
3a) if the compiler only knows about US-ASCII I have a choice of entering "\" either as "??/" or as "cedilla-c".
3b) if the compiler uses ISO-ASCII, then again I must enter "\" either as "??/" or as "cedilla-c".
3c) and finally, if the compiler knows about French-ASCII, then I would think that I must enter "\" as "??/", since the compiler will treat "cedilla-c" as a real letter. But if I try to define
static char language??(??) = "FranCais";
where the "C" is actually the cedilla-c character, then strange things will happen since the standard says that the character set must include the "\" character, and so the string will actually contain "Fran\ais", which is "Fran<beep>is". Thus again I still have the choice of entering "\" as either "??/" or as "cedilla-c".
So, putting this all together, regardless of what the compiler's character set is, it is only the French-ASCII terminal that has any need of the trigraphs. Now, on such a terminal I cannot use the cedilla-c character as anything but a back-slash since all three types of compilers must interpret this as a back-slash, and not as a cedilla.
So, the only case that needs trigraphs is the French-ASCII terminal, and such a terminal will have nine keys that I am better off not using since they appear to give me something that I don't really get. People using French have three choices. Use the trigraphs and avoid those 9 keys; use those 9 keys, remembering their special meanings and forget about trigraphs; or get a different terminal and forget about trigraphs. That reduces the cases that need trigraphs to those that have French-ASCII terminals and that also prefer to avoid using the national keys. From what I can gather, there are not many people still buying French-ASCII terminals and those that have such terminals seem to prefer using the funny characters to using the trigraphs. Consider that at the moment trigraphs don't even exist outside the minds of the X3J11 Committee, and decide how many people that now use the funny characters are going to switch to using trigraphs. The number of people that would actually use trigraphs must be amazingly small. For what it is costing the Committee in time, the publishers in paper, the net in shipping articles denouncing trigraphs, and the readers in time to read these articles, I'm sure it would be cheaper if we all chipped in and bought new terminals for those few individuals and then completely dropped the concept of trigraphs from the Standard.
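The "FranCais" example in the post above can be checked with a small sketch. It assumes, as the post describes, that the cedilla-c key on a 7-bit French terminal sends the same code that a C compiler reads as a backslash. The names `language_as_compiled` and `first_alert` are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What the compiler sees when "Fran<c-cedilla>ais" is typed on a terminal
   whose c-cedilla key sends the backslash code: the escape \a (alert). */
static const char language[] = "Fran\ais";

/* Hypothetical accessor for the string as the compiler stored it. */
const char *language_as_compiled(void)
{
    return language;
}

/* Return the index of the first alert (BEL) character in s, or -1. */
long first_alert(const char *s)
{
    const char *p;

    assert(s != NULL);
    p = strchr(s, '\a');
    return p != NULL ? (long)(p - s) : -1L;
}
```

The stored string is seven characters long, not eight, and `first_alert(language_as_compiled())` reports the alert at index 4, which is the "Fran<beep>is" the post predicts.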
daveb@geac.UUCP (David Collier-Brown) (06/08/88)
In article <19345@watmath.waterloo.edu> rbutterworth@watmath.waterloo.edu (Ray Butterworth) writes:
| Let's consider the various combinations of compilers and terminals.
| Commonly, either of these can be US-ASCII, 7-bit French-ASCII (or
| some other national character set), or 8-bit ISO-ASCII.
[ case analysis elided ]
| That reduces the cases that need trigraphs to those that have
| French-ASCII terminals and that also prefer to avoid using the
| national keys.
| Consider that at the moment trigraphs don't even exist outside
| the minds of the X3J11 Committee, and decide how many people
| that now use the funny characters are going to switch to
| using trigraphs.
Ok, can someone quote the approximate reasoning behind the consideration of trigraphs? As Ray has made a good case against the problem's existence, I therefore wonder 1) if some "outside" body has dictated that the standard committee "solve" it[1] or 2) if the committee merely misestimated the significance of the problem.
--dave c-b
[1] Suggested without proof earlier in the discussion, source not recorded.
--
David Collier-Brown. {mnetor yunexus utgpu}!geac!daveb
Geac Computers Ltd., | "His Majesty made you a major
350 Steelcase Road, | because he believed you would
Markham, Ontario. | know when not to obey his orders"
swarbric@tramp.Colorado.EDU (Frank Swarbrick) (06/09/88)
I'm curious, does the ISO-ASCII standard have foreign characters such as cedilla-c, the characters with umlauts, accents, etc.? I know that IBM-PC's have a way for you to get characters such as these (by using Alt and the numeric keypad), but Apples, Commodores, many terminals, etc. don't allow them at all. I think it would be great if all computers/terminals could generate these in some way or another, but I guess it's more than a little too late for that... s-set, anyone?
Frank Swarbrick (and, yes, the net.cat) swarbric@tramp.Colorado.EDU ...!{ncar|nbires}!boulder!tramp!swarbric
"...This spells out freedom, it means nothing to me, as long as there's a PMRC"
guido@cwi.nl (Guido van Rossum) (06/09/88)
In article <...> rbutterworth@watmath.waterloo.edu (Ray Butterworth) writes:
>From what I can gather, there are not many people still buying
>French-ASCII terminals and those that have such terminals seem
>to prefer using the funny characters to using the trigraphs.
>Consider that at the moment trigraphs don't even exist outside
>the minds of the X3J11 Committee, and decide how many people
>that now use the funny characters are going to switch to
>using trigraphs.
Although I would love to see that Ray is right, there is one unproven premise here: "not many people are still buying French-ASCII terminals". Here at CWI in Holland we usually have to fight to get US style keyboards on our equipment instead of Dutch national keyboards. I have the feeling that this might be the same or worse in other European countries, perhaps more so than in Canada (an "international" standard requires agreement from more countries than Canada and the US :-). I can't believe that in France, for instance, with a large autonomous computer industry, many US style keyboards are sold.
Especially since the number of keyboards used for data entry will always outnumber those used for programming (unless the software crisis really gets a hold of us :-), I'm not so sure US-ASCII keyboards will win. Would a company with lots of data typists and some programmers buy special keyboards for them? Those programmers will then have to get used to both keyboard styles used in their organization (if they are involved in any form of user support).
A different solution of the problem would be a tendency for keyboards to comprise both national and US-ASCII characters, in an ISO-ASCII set. If this is the development Ray is referring to, I just hope he's right. Neither the VaxStations 2000 nor the Sun 3s we have here have anything but US-ASCII (and zillions of unused function keys).
-- Guido van Rossum, Centre for Mathematics and Computer Science (CWI), Amsterdam guido@piring.cwi.nl or mcvax!piring!guido or guido%piring.cwi.nl@uunet.uu.net
alex@umbc3.UUCP (06/09/88)
In article <19345@watmath.waterloo.edu> rbutterworth@watmath.waterloo.edu (Ray Butterworth) writes:
>Let's consider the various combinations of compilers and terminals.
>Commonly, either of these can be US-ASCII, 7-bit French-ASCII (or
>some other national character set), or 8-bit ISO-ASCII.
I thought that tri-graphs were invented for IBM (EBCDIC) terminals, and that IBM deserved them.
--
:alex. nerwin!alex@umbc3.umd.edu alex@umbc3.umd.edu
daveh@marob.MASA.COM (Dave Hammond) (06/13/88)
In article <19345@watmath.waterloo.edu> rbutterworth@watmath.waterloo.edu (Ray Butterworth) writes:
>So, putting this all together, regardless of what the compiler's
>character set is, it is only the French-ASCII terminal that has
>any need of the trigraphs....
>..........I'm sure it would be cheaper if we all chipped in
>and bought new terminals for those few individuals and then
>completely dropped the concept of trigraphs from the Standard.
I have been following the trigraphs discussion in comp.unix.wizards closely to try and determine what exactly a trigraph is and why it has caused so much discussion. Not being a wizard, I have no intention of interrupting an ongoing discussion just to say 'hey folks, what is this thing?'.
If all this brouhaha is over a method of representing non-standard characters on non-standard terminals in minority situations, then I (like rbutterworth) feel there has been far too much ado over a trivial problem (not meaning to trivialize the French terminals, mind you). If there is a farther-reaching concept which I am not grasping, please e-mail a definition of trigraphs.
Dave Hammond UUCP: ...!marob!daveh
--------------------------------
daniels@teklds.TEK.COM (Scott Daniels) (06/17/88)
I was at the meeting in which trigraphs first came in, and the reason I voted for them is fairly simple. They were not presented as a means for people who actually wrote C source to deal with missing characters, but as a means for mechanical translators to pass un-encodable characters.
The setup I imagine actually being used is: Programmers in ??land whose national character set uses { for the all-important qz ligature (and who write comments using this a lot) happen to have a graphic on the $ character which looks just fine as an open brace. The programmers code away in this format locally, and (having hacked their C compiler) everything works out. When they decide to port their code to another country, they can mechanically translate those chars to the proper trigraph, and thus (1) mail source code, and (2) rely on the destination to use their best guess for those characters.
It was considered a great advantage by many that the trigraphs chosen were ugly: this meant that nobody would be tempted to write with them, they were only for mechanical translations (a sort of least-common-denominator format).
Scott Daniels (I was only briefly on the committee, another startup died) -daniels@teklds.TEK.COM (or @teklds.UUCP)
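The mechanical translation described in the post above can be sketched as a short filter. This is only an illustration of the idea, not a tool anyone in the thread mentions: the function name `to_trigraphs` and its buffer interface are assumptions, and the only facts taken as given are the nine trigraph spellings in the draft standard (`??=` `??(` `??/` `??)` `??'` `??<` `??!` `??>` `??-` for `#` `[` `\` `]` `^` `{` `|` `}` `~`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The nine characters that have trigraph spellings, and the third
   character of each spelling, in matching order. */
static const char plain[] = "#[\\]^{|}~";
static const char third[] = "=(/)'<!>-";

/* Copy src to dst, replacing each of the nine characters with its
   trigraph. Returns the number of characters written (not counting the
   terminating NUL), or (size_t)-1 if dst is too small. */
size_t to_trigraphs(const char *src, char *dst, size_t dstsize)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; ++src) {
        const char *hit = strchr(plain, *src);
        size_t need = (hit != NULL) ? 3 : 1;

        if (n + need >= dstsize)        /* leave room for the NUL */
            return (size_t)-1;
        if (hit != NULL) {
            dst[n++] = '?';
            dst[n++] = '?';
            dst[n++] = third[hit - plain];
        } else {
            dst[n++] = *src;
        }
    }
    if (n >= dstsize)
        return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```

Fed the line `a[0] = {1};`, this produces `a??(0??) = ??<1??>;`, which survives any channel that can carry letters, digits, and basic punctuation. One limitation of the sketch: text that already contains two question marks followed by one of the nine third characters is passed through unchanged, so it would be read as a trigraph at the destination.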
scs@athena.mit.edu (Steve Summit) (06/17/88)
Here's what I don't understand about trigraphs in character strings (the only kind I'm worried about): of what possible utility are they? As I understand it, trigraphs let you utter characters, which you need in C, which your local terminal doesn't understand. However, the thing you usually do with strings is print them out (usually on your local terminal) so if your local terminal can't handle the character, why is it important to have a special way to encode it within a string? If I am overlooking some obvious or oft-discussed fact, or if I am repeating Ray Butterworth's argument, please respond by mail or not at all; the net has had about enough trigraph articles. Steve Summit scs@adam.pika.mit.edu
karl@haddock.ima.isc.com (Karl Heuer) (04/19/89)
In article <10159@socslgw.csl.sony.JUNET> diamond@ (Norman Diamond) writes:
>In article <12629@haddock.ima.isc.com> karl@haddock (Karl Heuer) writes:
>>`printf("??=")' will output `#', not `??='.
>
>Yes indeed, such conversions take place even in strings. I wonder how
>such programs execute in environments that don't have a '#' character.
The C constant expression ('#'), or its alternate spelling ('??='), even though it might not correspond to any printable glyph, must have a value distinct from any other character. Since the implementation is allowed to apply a fairly arbitrary mapping when writing to a text stream, it's entirely possible for this to be written to the device as a digraph `$=', for example, where `$' is any convenient unused value. (It needn't even be printable, though this would be convenient for a terminal device.) Given such a mapping, and its inverse on input streams, it will appear to any conforming C program *as if* the execution environment really did have a `#' character. In particular, an editor or compiler written in C would automatically do the right thing.
This is why I consider trigraphs to be unnecessary: a transparent mapping is already guaranteed to exist. Alas, the feature was too deeply entrenched in the Draft by the time I realized this.
Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
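Why `printf("??=")` prints `#` follows from where the replacement happens: trigraphs are replaced in the source text before the compiler recognizes string literals, so the stored string already holds a single `#`. The sketch below imitates that replacement step on an ordinary character buffer. The function name `decode_trigraphs` is invented for this example, and the code writes its question marks with `\?` escapes in the test strings so that it behaves the same whether or not the compiler itself has trigraphs enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The nine characters with trigraph spellings, and the third character
   of each spelling, in matching order. */
static const char plain[] = "#[\\]^{|}~";
static const char third[] = "=(/)'<!>-";

/* Replace every trigraph in s, in place, with the single character it
   stands for. Returns the new length of s. */
size_t decode_trigraphs(char *s)
{
    char *in = s;
    char *out = s;

    assert(s != NULL);
    while (*in != '\0') {
        const char *hit = NULL;

        /* A trigraph is two question marks plus one of nine characters. */
        if (in[0] == '?' && in[1] == '?' && in[2] != '\0')
            hit = strchr(third, in[2]);
        if (hit != NULL) {
            *out++ = plain[hit - third];
            in += 3;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

Applied to a buffer holding the three characters `??=`, the function leaves a one-character string `#`, the same value the compiler would have stored for the literal in the quoted `printf` call. Any mapping to a local glyph on output, as described in the post above, happens later and separately.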
daniels@ogccse.ogc.edu (Scott David Daniels) (04/27/89)
One fact that seems not to have come out yet is that trigraphs as added by X3J11 were an existing scheme that was proposed for adoption, not a new design that simply seemed to be a good idea. I suspect that is part of the problem with the Danish proposal: the trigraphs we voted on were something that had been in use for a couple of years, whereas the Danish scheme sounded like "this would be an even better idea:...", with very little evidence there would be no nasty surprises later.
-Scott Daniels (short-term X3J11 member many moons ago)
karl@haddock.ima.isc.com (Karl Heuer) (04/28/89)
In article <2469@ogccse.ogc.edu> daniels@ogccse.UUCP (Scott David Daniels) writes:
>One fact that seems not to have come out yet is that trigraphs as added
>by X3J11 were an existing scheme that was proposed for adoption, not a
>new design that simply seemed to be a good idea.
They were? I was under the impression that X3J11 had invented them. (In which case they made a good argument for why the X3J11 charter generally forbade such inventions.)
>... whereas the Danish scheme sounded like "this would be an even better...
What exactly was the Danish proposal, and in what ways is it alleged to be better than the pANS trigraphs?
Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
minow@mountn.dec.com (Martin Minow) (04/28/89)
It was suggested that trigraphs were an established practice before ANSI added them to the Draft Standard language definition. Could someone post a reference to a widely-distributed compiler that supported trigraphs before 1984? As far as I know, none of pcc, Berkeley Unix, Decus C, Vax-C (VMS), or Think (Lightspeed) C for the Macintosh support/supported trigraphs.
I had argued against them on comp.std.c and during all public comment periods (though I've never actually received a written reply directly from the committee). My argument is that ISO 646 is a dead standard, having been supplemented by ISO 8859 (Latin-x).
In the first public review responses, a half-dozen writers, including at least one from Sweden and one from Canada, suggested removing trigraphs. The committee response -- in full -- was "The Committee discussed alternatives to trigraphs on a number of occasions, but always decided that they fill a need. C must support a wide variety of terminals and keyboards, many of which lack the full C character set."
While I understand the issues and sympathize with the problems the USASCII-specific characters pose for implementors (I am bilingual Swedish-English and have worked as a programmer in Sweden), trigraphs pose unsolvable problems for implementors and are as necessary today as a modified C for upper-case only terminals was in 1978 (when the VT05 and ASR33 were still in wide use).
Martin Minow minow%thundr.dec@decwrl.dec.com
The above does not represent the position of Digital Equipment Corporation.
mcdonald@uxe.cso.uiuc.edu (04/28/89)
>One fact that seems not to have come out yet is that trigraphs as added
>by X3J11 were an existing scheme that was proposed for adoption, not a
>new design that simply seemed to be a good idea. I suspect that is part
What do you mean by "existing scheme"? What significant compiler (i.e. one selling more than 10000 copies per year) implemented them? They seem to be probably the worst misfeature of ANSI C: one that actually breaks working code. I have some code, based on K&R C that uses the sequences ??(, ??), and ??! as delimiters in a text file format - they are used in jillions of string constants. Wouldn't trigraphs break such schemes?
Doug McDonald
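For the delimiter problem raised above, one possible repair is sketched here. Under the draft standard those three sequences inside a string constant would indeed be replaced (by `[`, `]`, and `|`), but the draft also adds the escape `\?`, and adjacent string literals are joined only after trigraph replacement, so either device keeps two question marks from pairing up. The table `delims` and the function `delim_at` are hypothetical names for illustration, not taken from the poster's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Three spellings that keep a question-mark pair from forming a trigraph:
   escape both marks, split the literal in two, or escape just one. */
static const char *const delims[] = {
    "\?\?(",
    "?" "?)",
    "?\?!"
};

/* Return the index in delims of the delimiter that text starts with,
   or -1 if text does not start with any of them. */
int delim_at(const char *text)
{
    size_t i;

    assert(text != NULL);
    for (i = 0; i < sizeof delims / sizeof delims[0]; ++i)
        if (strncmp(text, delims[i], 3) == 0)
            return (int)i;
    return -1;
}
```

Each entry is still a three-character string beginning with two question marks, whether or not the compiler performs trigraph replacement. The cost is that every one of the existing string constants has to be edited, which is the breakage the post complains about.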
nevin1@ihlpb.ATT.COM (Liber) (04/29/89)
In article <2469@ogccse.ogc.edu> daniels@ogccse.UUCP (Scott David Daniels) writes: >One fact that seems not to have come out yet is that trigraphs as added >by X3J11 were an existing scheme that was proposed for adoption, not a >new design that simply seemed to be a good idea. Just wondering: where exactly had they been used for C before?? (Is this a trigraph sequence? ^^^ :-)) The Rationale implies that the Committee came up with this solution on their own. -- _ __ NEVIN ":-)" LIBER nevin1@ihlpb.ATT.COM (312) 979-4751 IH 4F-410 ' ) ) "I will not be pushed, filed, stamped, indexed, / / _ , __o ____ briefed, debriefed or numbered! My life is my own!" / (_</_\/ <__/ / <_ As far as I know, these are NOT the opinions of AT&T.
gwyn@smoke.BRL.MIL (Doug Gwyn) (04/29/89)
In article <12840@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
>I was under the impression that X3J11 had invented them.
Yes, they were an X3J11 invention. Similar (but not identical) schemes have been used for a variety of purposes for a long time, of course.
>which case they made a good argument for why the X3J11 charter generally
>forbade such inventions.)
I don't think they're too bad for their intended purpose of C source file transmission to sites having only limited character set support (ISO-646). They would certainly be awful to have to use while writing programs, but that's not the intention.
I personally don't think the C Standard needed to address this particular issue at all, and certainly not when so much public confusion and criticism resulted. Since the introduction of trigraphs, there has been further ISO code set standardization that may have obviated the need for trigraphs, but in case there are limited code-set environments still in use somewhere trigraphs may yet be of use.
daveb@gonzo.UUCP (Dave Brower) (04/30/89)
In article <229900002@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes:
>What do you mean by "existing scheme"? What significant compiler
>(i.e. one selling more than 10000 copies per year)
Just to be pedantic, at the time X3J11 formed, there were (and are today) quite a number of "significant" compilers that ship in substantially smaller numbers. Like in the tens and hundreds.
-dB
--
"An expert is someone who's right 75% of the time"
{sun,mtxinu,amdahl,hoptoad}!rtech!gonzo!daveb daveb@gonzo.uucp
kemnitz@mitisft.Convergent.COM (Gregory Kemnitz) (05/04/89)
I just started reading this newsgroup (and don't have access to a lot of the standards committee stuff at this time). What is a trigraph?? How is one used?? Greg Kemnitz