johnl@esegue.segue.boston.ma.us (John R. Levine) (07/31/90)
In article <58372@lanl.gov> jlg@lanl.gov (Jim Giles) writes: >> One of the most important aspects of ALGOL is the grammar on which >> is was based. Context free grammars have since been almost universal >> for high level languages, ... >Context free means something different than you are using it for here. >What you are talking about (aparently) is free-form syntax (which is >a mixed bag - at least not all of the aspects of it as ALGOL defined >them are a good idea). Context free has to do with the formal specification >of the syntax - Fortran is context free (in fact, it's LR(k) - it _would_ >be LR(1) if blanks had been significant). Algol 60 was the first language to be defined with a formal syntax, specified in what has come to be called BNF. The syntax of Fortran was given informally in the manual. Parsing Fortran is quite context dependent (I know, because I've written some real Fortran parsers.) Since you have to ignore blanks, at least until F90 becomes widely accepted, the token boundaries depend entirely on the context, e.g. 10E5 is sometimes a floating point number, and sometimes not, as in these: DO 10E5 = 10E5 DO 10E5 = 1, 10 The first is an assignment to DO10E5, and the second is a DO loop using E5 as a variable name. If your Fortran allows long variable names, this is usually the first line of a function: REAL FUNCTION X(N) unless it is preceded by a PARAMETER statement that defines N, in which case it's a declaration of the array FUNCTIONX. Even if you made spaces significant and token boundaries were obvious, Fortran appears to me not to be LR(k) for any finite k since you can write this sort of thing: FORMAT(I1,I2,I3,I4,I5) = 3 and until you see the = sign, you don't know whether it's a statement function or a format statement. (A limit on K is provided by the maximum length of a statement, but that's a pretty cheesy out.) My goal here is not to dump on Fortran, its syntactic peculiarities are well known and entirely tractable, albeit not by conventional tools like lex and yacc without a tremendous amount of kludgery. The point is that formally specifying Algol was a big step forward, both since formal specification languages are the cornerstone of every modern compiler writing system, and because it is my observation that context free languages make it a lot harder accidentally to write a program in which small typos make it mean something entirely different from what it looks like it means. (Of course, just because Algol has a context-free easy-to-tokenize syntax doesn't mean that actual implementations all do. There was IBM's Algol F, which ignored blanks and required you to put quotes around your keywords, proving once again that Real programmers can write Fortran programs in any language. -- John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 864 9650 johnl@esegue.segue.boston.ma.us, {ima|lotus|spdcc}!esegue!johnl Marlon Brando and Doris Day were born on the same day.
tom@stl.stc.co.uk (Tom Thomson) (07/31/90)
In article <1990Jul30.174035.26412@esegue.segue.boston.ma.us> johnl@esegue.segue.boston.ma.us (John R. Levine) writes: >(Of course, just because Algol has a context-free easy-to-tokenize syntax >doesn't mean that actual implementations all do. There was IBM's Algol F, >which ignored blanks and required you to put quotes around your keywords, >proving once again that Real programmers can write Fortran programs in any >language. What's been forgotten here is that ALL the early implementations of Algol had this problem, it's NOT an IBM Algol F problem. The Algol report used BOLD TYPE to distinguish keyword tokens from identifiers, so asking ffor quotes around keywords (or "@" symbols, or "_" symbols, or "%" symbols, as in some versions) was just asking the programmer to indicate which bits were supposed to be in bold type. Also, the only reason you can claim the syntax is context free is to choose to assert that programs that use undeclared identifiers are syntactically correct. Probably the best way of handling the syntax is to put hooks into the parser that adds identifiers to the appropriate (real, integer, boolean) class as they are declared and not recognise arbitrary identifier-like strings as identifiers as valid identifiers unless they have been declared. This results in a parser that behaves nothing like a context free one. Tom Thomson [tom @ nw.stl.stc.co.uk
boebert@sctc.com (Earl Boebert) (08/01/90)
tom@stl.stc.co.uk (Tom Thomson) writes: >In article <1990Jul30.174035.26412@esegue.segue.boston.ma.us> johnl@esegue.segue.boston.ma.us (John R. Levine) writes: >>(Of course, just because Algol has a context-free easy-to-tokenize syntax >>doesn't mean that actual implementations all do. There was IBM's Algol F, >>which ignored blanks and required you to put quotes around your keywords, >>proving once again that Real programmers can write Fortran programs in any >>language. >What's been forgotten here is that ALL the early implementations of Algol >had this problem, it's NOT an IBM Algol F problem. > [Stuff deleted ...] >Tom Thomson [tom @ nw.stl.stc.co.uk I don't recall GEIR Algol as having special delimeters for reserved words; will check my old class notes from Naur's course to be sure and report later ... Earl
boebert@sctc.com (Earl Boebert) (08/01/90)
boebert@sctc.com (Earl Boebert) writes: >tom@stl.stc.co.uk (Tom Thomson) writes: >>In article <1990Jul30.174035.26412@esegue.segue.boston.ma.us> johnl@esegue.segue.boston.ma.us (John R. Levine) writes: >>>(Of course, just because Algol has a context-free easy-to-tokenize syntax >>>doesn't mean that actual implementations all do. There was IBM's Algol F, >>>which ignored blanks and required you to put quotes around your keywords, >>>proving once again that Real programmers can write Fortran programs in any >>>language. >>What's been forgotten here is that ALL the early implementations of Algol >>had this problem, it's NOT an IBM Algol F problem. >> > >[Stuff deleted ...] >>Tom Thomson [tom @ nw.stl.stc.co.uk >I don't recall GEIR Algol as having special delimeters for reserved words; >will check my old class notes from Naur's course to be sure and report >later ... >Earl Well I misremembered, including the spelling of the machine's name ... GIER Algol *underlined* reserved words, an elegance made possible because I/O on the machine was through a Flexowriter. This also gave them upper and lower case; we 026-limited Yankees were pretty envious of that! For the historically minded, the project was started in January of 1962 and a running compiler was available in February of 1963. GIER was a Danish machine: 1024 42-bit words of core w/8.8 microseconds access and 12,800 words on drum with 20 milliseconds to transfer a 40-word track; rather a speedly little devil for its time. The compiler was, to say the least, a beautiful piece of work, and all us budding compiler-writers went out of our way to sit at Naur's feet and hear how he did it.
kc@oz.rci.dk (Knud Christensen) (08/02/90)
boebert@sctc.com (Earl Boebert) writes: >tom@stl.stc.co.uk (Tom Thomson) writes: >>In article <1990Jul30.174035.26412@esegue.segue.boston.ma.us> johnl@esegue.segue.boston.ma.us (John R. Levine) writes: >>>(Of course, just because Algol has a context-free easy-to-tokenize syntax >>>doesn't mean that actual implementations all do. There was IBM's Algol F, >>>which ignored blanks and required you to put quotes around your keywords, >>>proving once again that Real programmers can write Fortran programs in any >>>language. >>What's been forgotten here is that ALL the early implementations of Algol >>had this problem, it's NOT an IBM Algol F problem. >> > >[Stuff deleted ...] >>Tom Thomson [tom @ nw.stl.stc.co.uk >I don't recall GEIR Algol as having special delimeters for reserved words; >will check my old class notes from Naur's course to be sure and report >later ... >Earl GIER Algol used underscored keywords to distinguish them from identifiers. This was possible because the GIER computer used the flexowriter character set (the underscore character was non spacing). Knud Christensen kc@oz.rci.dk Inside every large program is a small program struggling to get out. - Hoare -