cline@suntan.ece.clarkson.edu (Marshall Cline) (07/01/89)
In article <1989Jun27.164758.1379@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>In article <2029@dataio.Data-IO.COM> bright@dataio.Data-IO.COM (Walter Bright) writes:
>> 1. Trigraph support significantly slows down the scanner, which is
>>    the most time-consuming part of a compiler. Trigraphs are useless,
>>    and so are left out of the Useful C mode.
>It's not necessary for trigraphs to be in the scanner at all, provided the
>implementation supports them *somehow* (a sed script is what I'd use) for
>official conformance.

Ah TriGraphs. Henry's comment about using "sed" is interesting. But is it
true that trigraphs change the contents of string literals??.

Example: "Is this a trigraph --> ??."

What is printed by:
	printf("Foo ??. bar ??; baz ??? barf ??$");
(I don't even know if the ".;?$" are valid endings for trigraphs, but you
get the idea...)

If these nasty little fellers are gonna chomp down on my existing C code and
munge my string literals, I'd like to know about it!

But the real point of my posting is: "sed" is _only_ appropriate if trigraphs
are expanded _WHEREVER_ they appear (including inside strings, in char
literals, etc, etc). Otherwise the regular expression support in sed isn't
powerful enough to parse a Context Free Grammar such as the BNF _syntax_ for
ANSI-C. Recall that parsing a CFG requires a Push-Down Automaton, which is
strictly more powerful than any Finite Automaton. (_Semantic_ aspects such as
whether variable names are declared and/or are of compatible types are issues
which can't even be resolved by a PDA; they require at least a Context
Sensitive Grammar, and probably a full Turing Machine).

Marshall
--
________________________________________________________________
Marshall P. Cline       ARPA:   cline@sun.soe.clarkson.edu
ECE Department          UseNet: uunet!sun.soe.clarkson.edu!cline
Clarkson University     BitNet: BH0W@CLUTX
Potsdam, NY 13676       AT&T:   315-268-6591
gwyn@smoke.BRL.MIL (Doug Gwyn) (07/01/89)
In article <CLINE.89Jun30154443@suntan.ece.clarkson.edu> cline@sun.soe.clarkson.edu (Marshall Cline) writes:
>Ah TriGraphs. Henry's comment about using "sed" is interesting. But is it
>true that trigraphs change the contents of string literals??.

If you think about what trigraphs are intended for, the answer is obvious.
Yes, trigraph replacement is the first thing done after mapping the physical
source file characters to the internal source character set.

Note that the physical-to-internal mapping provides another opportunity for
handling local character set problems, and indeed is where I recommend that
it be done whenever possible.
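
To make the "replaced everywhere, before anything else" behaviour concrete, here is a minimal sketch in C of a context-free replacement pass of the kind a sed script could also perform. The function names `trigraph_char` and `replace_trigraphs` are hypothetical, chosen for illustration only, and the sketch assumes the nine trigraph sequences defined by ANSI C (`??=`, `??(`, `??/`, `??)`, `??'`, `??<`, `??!`, `??>`, `??-`). It is not taken from any actual compiler.

```c
/* Map the third character of a "??x" sequence to its replacement.
   Returns 0 when "??x" is not one of the nine ANSI C trigraphs.
   (Hypothetical helper, for illustration only.) */
char trigraph_char(char c)
{
    switch (c) {
    case '=':  return '#';
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Replace trigraphs in the NUL-terminated buffer s, in place.
   The pass has no notion of string or character literals: every
   trigraph is replaced wherever it appears. The result is never
   longer than the input, so no extra storage is needed. */
void replace_trigraphs(char *s)
{
    char *out = s;

    while (*s != '\0') {
        char r;

        if (s[0] == '?' && s[1] == '?' && (r = trigraph_char(s[2])) != 0) {
            *out++ = r;     /* emit the single replacement character */
            s += 3;         /* skip the whole three-character sequence */
        } else {
            *out++ = *s++;  /* ordinary character: copy unchanged */
        }
    }
    *out = '\0';
}
```

Under these assumptions, the printf argument from the earlier post passes through untouched, since none of `??.`, `??;`, `???` or `??$` is in the list of nine, while a literal containing `??/n` would have that sequence turned into `\n` before the tokenizer ever sees the string. Because the pass needs no knowledge of literal boundaries, a purely textual tool is enough to implement it.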