seibel@cgl.ucsf.edu (George Seibel) (11/05/88)
In article <2848@ima.ima.isc.com> johnl@ima.UUCP (John R. Levine) writes:
>Actually, section 3.2.1 of the F77 standard says that "Comment lines may
>appear between the initial line and its first continuation line or between
>two continuation lines." It also says that a line which is blank in the
>first 72 columns is a comment line, which is a minor pain for us compiler
>writers.

And in another article, ssd@sugar.uu.net (Scott Denham) writes:
>just wake up one morning and decide it would be nice to add say, a new
>loop construct (DO WHILE ?) to their compiler. These extensions cost the
>developers a significant amount of money. [...]

The common thread between these two articles is the reference to apparently simple Fortran extensions/features as being either difficult or (equivalently) expensive to implement. I'm not a compiler writer, but for the life of me I can't see how features like blank line == comment, or "!" == inline comment, or the implementation of a DO WHILE are all that tough! Wouldn't these all be handled in a modern compiler's front end, i.e. the "easy" part? Am I just being naive about the complexity of these problems, or what? If some of you compiler writers out there would fill me in on this, I'd appreciate it.

And while you're at it, could anyone tell me if there are any important compilers out there that actually insist on identifiers of no more than six characters? I'm thinking of allowing myself eight, but I rather like portable code.

George Seibel, UCSF
smryan@garth.UUCP (Steven Ryan) (11/07/88)
> The common thread between these two articles is the reference
>compiler writer, but for the life of me I can't see how some features
>like blank line == comment, or "!" == inline comment, or the

Lexical analysis for Fortran tends to be mystical. Blank line = comment is hard because you have to send in some poor soul who doesn't know better (preferably a new hire just out of school, so that if you find his body strung up on some gallows feeding the crows you won't have wasted an investment) to fight it out with the witches of no-it's-never-quite-just-an-fsm.

>implementation of a DO WHILE are all that tough! Wouldn't these

Adding new syntax tends to result in a multiplicative rather than additive increase in compiler size, partly because of bad compilers, partly because Fortran is not (even close to being) orthogonal.

--
s m ryan
+--------------------------------------------------------+---------------------+
| `Lisa.' He swallowed hard. `Lisa take my hand.'        | Enough of this      |
| She looked up into his eyes and slowly took his hand.  | stuffy-stuff.       |
+--------------------------------------------------------+---------------------+
ssd@sugar.uu.net (Scott Denham) (11/07/88)
In article <11222@cgl.ucsf.EDU>, seibel@cgl.ucsf.edu (George Seibel) writes:
>
> And in another article, ssd@sugar.uu.net (Scott Denham) writes:
>
> >just wake up one morning and decide it would be nice to add say, a new
> >loop construct (DO WHILE ?) to their compiler. These extensions cost the
> >developers a significant amount of money. [...]
>
> ... I'm not a
> compiler writer, but for the life of me I can't see how some features
> like blank line == comment, or "!" == inline comment, or the
> implementation of a DO WHILE are all that tough! Wouldn't these
> George Seibel, UCSF

I for one, George, have no idea how difficult these things are. I chose DO WHILE only as an example, and only because it happened to be floating by on some neuron at the exact moment my fingers reached that point in the sentence. A better feature to use would be one that doesn't exist but that everyone would understand.

One thing I have heard argued (and I've seen some pretty good examples) is that it's not the individual features that cause the compiler writers all the headaches, but the sheer number of possible combinations. A couple of years ago somebody at IBM "cleaned up" the logic for handling implied DO loops in WRITE statements. I was astonished at how long it took to get the "cleaned up" code cleaned up, and at the obscure and strange things that popped out of the old "dusty decks" when this happened.

Scott Denham
Western Atlas International
steve@oakhill.UUCP (steve) (11/08/88)
In article <11222@cgl.ucsf.EDU>, seibel@cgl.ucsf.edu (George Seibel) writes:
> In article <2848@ima.ima.isc.com> johnl@ima.UUCP (John R. Levine) writes:
> to apparently simple fortran extensions/features as being either
> difficult or (equivalently) expensive to implement. I'm not a
> compiler writer, but for the life of me I can't see how some features
> like blank line == comment, or "!" == inline comment, or the
> implementation of a DO WHILE are all that tough! Wouldn't these
> all be performed in a modern compiler's front end, i.e. the "easy" part?
> Am I just being naive about the complexity of these problems or
> what? If some of you compiler writers out there would fill me in on
> this, I'd appreciate it.
> And while you're at it, could anyone tell me if there are any
> important compilers out there that actually insist on no more than
> six character identifiers? I'm thinking of allowing myself eight,
> but I rather much like portable code.
>
> George Seibel, UCSF

Having written the entire front end of a FORTRAN compiler from scratch, I think I can claim some experience with the problem.

The first problem - blank lines as comments - is a pain. The real pain is actually the archaic method of lexical analysis required in a FORTRAN compiler. Lexical analysis tends to have two parts: reading in a complete statement, and deciding what that statement is. Blank-line comments affect only the first of these (the second is by far the harder, but can be done quickly with some very nasty tricks). The real problem in the first part is deciding when you have reached the end of a statement. Since you do not know whether you have a continuation without reading the first six characters of the next line, and since comments can occur between continuations, you must read all of the comments that follow a line in order to decide whether you have finished the statement. Then you must store the first six characters of the next line if they are significant.
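The lookahead described above can be sketched roughly as follows. This is only an illustration in Python, not code from the compiler under discussion; the function names are made up for the example. It assumes F77 fixed-form rules ("C" or "*" in column 1, or columns 1-72 blank, marks a comment; columns 1-5 blank and column 6 neither blank nor "0" marks a continuation), and it ignores tabs, lowercase, and "!" comments.

```python
def is_comment(line):
    """F77 fixed form: 'C' or '*' in column 1, or blank in columns 1-72."""
    return line[:1] in ("C", "*") or line[:72].strip() == ""


def is_continuation(line):
    """Columns 1-5 blank and column 6 neither blank nor '0'."""
    padded = line.ljust(6)
    return padded[:5] == "     " and padded[5] not in (" ", "0")


def statements(lines):
    """Yield (first_line_number, statement_text) for each statement.

    A statement can only be yielded once the next non-comment line has
    been read, because any number of comment lines may sit between an
    initial line and its continuation lines.
    """
    current = None  # [line_number, text] of the statement being built
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if is_comment(line):
            continue  # cannot tell yet whether the statement is finished
        if is_continuation(line) and current is not None:
            current[1] += line[6:72]  # append the statement field
        else:
            if current is not None:
                yield tuple(current)  # previous statement is now complete
            # Simplification: the label field (columns 1-5) is dropped,
            # and a stray continuation line is treated as an initial line.
            current = [number, line[6:72]]
    if current is not None:
        yield tuple(current)
```

Feeding it an initial line, a "C" comment, a blank line, and then a continuation line gives back a single statement tagged with the line number of the initial line. That is also why a diagnostic for that statement tends to be issued only after the intervening comments have been read.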
In other words, you can have an almost infinite amount of lookahead before you finish reading a statement. The biggest headache actually is putting errors out in an appropriate place. If you have seen FORTRAN compilers where the errors appear after the comments after the line in question, you now know why, and you know they did not want to bother with this question. Also be aware that blanks are pulled out of the FORTRAN (the variables A BAD MU and ABADMU are the same). End-of-line comments suffer the same problem.

The second question - DO WHILE - is actually very easy. The compiler we wrote contained the DO WHILE. If you are doing a regular DO statement, DO WHILE is very easy, with very little overhead in the parser or in the semantic analysis (semantic analysis for DO WHILE and DO UNTIL is actually easier than that of DO).

enough from this mooncalf - Steven
----------------------------------------------------------------------------
These opinions aren't necessarily Motorola's or Remora's - but I'd like to
think we share some common views.
----------------------------------------------------------------------------
Steven R Weintraub cs.utexas.edu!oakhill!devsys!steve
Motorola Inc. Austin, Texas
(512) 440-3023 (office) (512) 453-6953 (home)
----------------------------------------------------------------------------
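On the remark above that blanks are pulled out of FORTRAN source: a minimal sketch of that step, in Python, might look like the following. The function name is hypothetical, and the sketch assumes that apostrophe-delimited character constants are the only place where blanks matter. Hollerith constants, where blanks are also significant, are not handled.

```python
def squeeze_blanks(statement):
    """Drop blanks outside apostrophe-delimited character constants."""
    out = []
    in_string = False
    for ch in statement:
        if ch == "'":
            # A doubled apostrophe inside a constant toggles twice,
            # so we correctly end up still inside the constant.
            in_string = not in_string
        if ch == " " and not in_string:
            continue
        out.append(ch)
    return "".join(out)
```

With this, `squeeze_blanks("A BAD MU")` and `squeeze_blanks("ABADMU")` give the same string, so whatever decides what the statement is sees the same name either way.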
steve@oakhill.UUCP (steve) (11/08/88)
As usual I was far too hasty in replying. My handlers wish me to clarify something I just said that may have sounded wrong.

In article <1643@devsys.oakhill.UUCP>, steve@oakhill.UUCP (steve) writes:
> The second question - DO WHILE - is actually very easy. The compiler
> we wrote contained the DO WHILE. If you are doing a regular DO statement,
> DO WHILE is very easy, with very little overhead in the parser or
> in the semantic analysis (semantic analysis for DO WHILE and DO UNTIL is
                  actions            actions
> actually easier than that of DO).
>

Also, I'm sorry for rambling in my first part. The idea I was trying to get across was that FORTRAN lexical analysis is not pretty (the people who write these routines are professionals; children are advised not to try these same tricks at home :-) ), and that adding any extra syntax to the lexical analyzer just makes this harder and uglier. The compiler I just finished writing of course contained comments and blank-line comments, but was also extended to have end-of-line comments, debugging statements (D in column 1) and assembly statements (A in column 1). Needless to say, to get anything that ran decently fast, many ugly tricks were performed.

enough from this mooncalf - Steven
PLS@cup.portal.com (Paul L Schauble) (11/09/88)
The tricks in a Fortran lexical analyzer are indeed interesting. What I'd like to know is: Are these documented anywhere? Surely with all of those compiler books out there, there must be one that goes into some of the nasty issues in Fortran. ++PLS
jejones@mcrware.UUCP (James Jones) (11/10/88)
In article <11112@cup.portal.com>, PLS@cup.portal.com (Paul L Schauble) writes:
> The tricks in a Fortran lexical analyzer are indeed interesting. What I'd
> like to know is: Are these documented anywhere?

All compiler construction books I've seen give various pieces of FORTRAN syntax as prime horrible examples (arrays named EQUIVALENCE, GOTOI, or DIMENSIONA, variables named DO1000I, and many other fun things). You basically have to be prepared to potentially look at almost the entire statement before you commit yourself to a parse (i.e. do *ANYTHING* that can't be undone).

James Jones
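The DO1000I case is a handy illustration of why the whole statement may need scanning. Because blanks are insignificant, `DO 1000 I = 1, 5` starts a loop while `DO 1000 I = 1.5` assigns to a variable named DO1000I, and the two differ only near the end. One common way to tell them apart, sketched here in Python under simplifying assumptions (hypothetical function name, no character or Hollerith constants in the statement), is to look for a comma outside parentheses after the equals sign.

```python
def is_do_statement(stmt):
    """Guess whether a statement is a DO loop rather than an assignment.

    Assumes the statement contains no character or Hollerith constants,
    so every blank, comma, and parenthesis is syntactically meaningful.
    """
    s = stmt.replace(" ", "").upper()
    if not s.startswith("DO"):
        return False
    depth = 0            # current parenthesis nesting level
    seen_equals = False  # have we passed a top-level '='?
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "=" and depth == 0:
            seen_equals = True
        elif ch == "," and depth == 0 and seen_equals:
            return True  # only a DO has a top-level comma after '='
    return False
```

Here `is_do_statement("DO 1000 I = 1, 5")` is true and `is_do_statement("DO 1000 I = 1.5")` is false, and in the second case the answer is not known until the last character has been examined.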