[comp.lang.fortran] Is DO WHILE difficult?

seibel@cgl.ucsf.edu (George Seibel) (11/05/88)

In article <2848@ima.ima.isc.com> johnl@ima.UUCP (John R. Levine) writes:

>Actually, section 3.2.1 of the F77 standard says that "Comment lines may
>appear between the initial line and its first continuation line or between
>two continuation lines."  It also says that a line which is blank in the
>first 72 columns is a comment line, which is a minor pain for us compiler
>writers.

And in another article, ssd@sugar.uu.net (Scott Denham) writes:

>just wake up one morning and decide it would be nice to add say, a new 
>loop construct (DO WHILE ?) to their compiler.  These extensions cost the
>developers a significant amount of money.  [...]

    The common thread between these two articles is the reference
to apparently simple fortran extensions/features as being either
difficult or (equivalently) expensive to implement.   I'm not a
compiler writer, but for the life of me I can't see how some features
like blank line == comment, or  "!" == inline comment, or the
implementation of a DO WHILE are all that tough!   Wouldn't these
all be performed in a modern compiler's front end, i.e. the "easy" part?
    Am I just being naive about the complexity of these problems or
what? If some of you compiler writers out there would fill me in on
this, I'd appreciate it.
    And while you're at it, could anyone tell me if there are any
important compilers out there that actually insist on no more than
six character identifiers?   I'm thinking of allowing myself eight,
but I rather much like portable code.

George Seibel, UCSF

smryan@garth.UUCP (Steven Ryan) (11/07/88)

>    The common thread between these two articles is the reference

>compiler writer, but for the life of me I can't see how some features
>like blank line == comment, or  "!" == inline comment, or the

Lexical analysis for Fortran tends to be mystical. Blank lines=comment is
hard because you have to send in some poor soul who doesn't know better to
fight it out with the witches of no-it's-never-quite-just-an-fsm. Preferably
that is a new hire just out of school, so that if you find his body strung
up on some gallows feeding the crows, you won't have wasted an investment.

>implementation of a DO WHILE are all that tough!   Wouldn't these

Adding new syntax tends to result in a multiplicative rather than additive
increase in compiler size. Partly because of bad compilers, partly because
Fortran is not (even close to being) orthogonal.
-- 
                                                   -- s m ryan
+--------------------------------------------------------+---------------------+
|  `Lisa.' He swallowed hard. `Lisa take my hand.'       |    Enough of this   |
|  She looked up into his eyes and slowly took his hand. |    stuffy-stuff.    |
+--------------------------------------------------------+---------------------+

ssd@sugar.uu.net (Scott Denham) (11/07/88)

In article <11222@cgl.ucsf.EDU>, seibel@cgl.ucsf.edu (George Seibel) writes:
> 
> And in another article, ssd@sugar.uu.net (Scott Denham) writes:
> 
> >just wake up one morning and decide it would be nice to add say, a new 
> >loop construct (DO WHILE ?) to their compiler.  These extensions cost the
> >developers a significant amount of money.  [...]
> 
>  ... I'm not a
> compiler writer, but for the life of me I can't see how some features
> like blank line == comment, or  "!" == inline comment, or the
> implementation of a DO WHILE are all that tough!   Wouldn't these
> George Seibel, UCSF

I for one, George, have no idea how difficult these things are. I chose
DO WHILE only as an example, and only because it happened to be floating
by on some neuron at the exact moment my fingers reached that point in
the sentence.  A better feature to use would be one that doesn't exist
but everyone would understand.  
One thing I have heard argued (and I've seen some pretty good examples)
is that it's not the individual features that cause the compiler writers
all the headaches, but the sheer number of possibilities. A couple of 
years ago somebody at IBM "cleaned up" the logic for handling implied
DO loops in WRITE statements.  I was astonished at how long it took to
get the "cleaned up" code cleaned up, and at the obscure and strange 
things that popped out of the old "dusty decks" when this happened.
 
   Scott Denham 
    Western Atlas International
     

steve@oakhill.UUCP (steve) (11/08/88)

In article <11222@cgl.ucsf.EDU>, seibel@cgl.ucsf.edu (George Seibel) writes:
> In article <2848@ima.ima.isc.com> johnl@ima.UUCP (John R. Levine) writes:
> to apparently simple fortran extensions/features as being either
> difficult or (equivalently) expensive to implement.   I'm not a
> compiler writer, but for the life of me I can't see how some features
> like blank line == comment, or  "!" == inline comment, or the
> implementation of a DO WHILE are all that tough!   Wouldn't these
> all be performed in a modern compiler's front end, i.e. the "easy" part?
>     Am I just being naive about the complexity of these problems or
> what? If some of you compiler writers out there would fill me in on
> this, I'd appreciate it.
>     And while you're at it, could anyone tell me if there are any
> important compilers out there that actually insist on no more than
> six character identifiers?   I'm thinking of allowing myself eight,
> but I rather much like portable code.
> 
> George Seibel, UCSF


Having written from scratch the total front end of a FORTRAN compiler,
I think I can justify myself as experienced in the problem.  

The first problem - blank lines as comments - is a pain.  The real pain
is actually the archaic method of lexical analysis required in a FORTRAN
compiler.  Lexical analysis tends to be two parts: reading in a complete
statement, and deciding what that statement is.  Blank line comments only
affect the first of these (the second is by far the harder, but can be done
quickly by some very nasty tricks).  The real problem in the first part is
deciding when you have read the end of an actual statement.  Since you do
not know whether you have a continuation without reading the first six
characters of the next line, and since comments can occur between
continuations, you must read all of the comments that follow a line in
order to decide if you have finished the statement.  Then you must store
the first six characters of the next line if they are significant.  In
other words, you can have an almost infinite amount of lookahead before
you finish reading a statement.  The biggest headache actually is putting
errors out in an appropriate place.  If you have seen FORTRAN compilers
where the errors appear after the comments after the line in question,
you now know why, and you know they did not want to bother with this
question.  Also be aware that blanks are pulled out of the FORTRAN
(the variables  A BAD MU and ABADMU are the same).  End of line comments
suffer the same problem.
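
A rough illustration of the statement-reading problem described above, as
a minimal Python sketch and not anyone's actual compiler code. It assumes
fixed-form source, ignores tabs and statement labels (columns 1-5 of the
initial line are simply dropped), and the names is_comment,
is_continuation and read_statements are made up for this example. The
point it shows is that a statement can only be emitted after the next
non-comment line has been inspected.

```python
def is_comment(line):
    """Fixed-form F77: blank in columns 1-72, or 'C' / '*' in column 1."""
    return line[:72].strip() == "" or line[0] in "Cc*"


def is_continuation(line):
    """Column 6 holds a character other than blank or '0' (simplified)."""
    return (not is_comment(line)) and len(line) >= 6 and line[5] not in " 0"


def read_statements(lines):
    """Yield (first_line_number, text) for each complete statement."""
    current = None  # [line number, text] of the statement being assembled
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if is_comment(line):
            continue  # comments may sit between continuation lines
        if is_continuation(line) and current is not None:
            current[1] += line[6:72]  # append columns 7-72
            continue
        if current is not None:
            yield tuple(current)  # only now do we know the statement ended
        current = [number, line[6:72]]
    if current is not None:
        yield tuple(current)


source = [
    "      X = A BAD MU +",
    "C a comment between continuation lines",
    "",
    "     1    B",
    "      Y = 2",
]
for number, text in read_statements(source):
    # Blanks are not significant, so squeeze them out (this naive version
    # would also damage character constants, which a real lexer must keep).
    print(number, text.replace(" ", ""))
```

Keeping the first line number with each statement is one way to report an
error at the statement itself rather than after the trailing comments.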

The second question - DO WHILE - is actually very easy.  The compiler
we wrote contained the DO WHILE.  If you are doing a regular DO statement,
DO WHILE is very easy, with very little overhead in the parser or
in the semantic analysis (semantic analysis for DO WHILE and DO UNTIL is
actually easier than that of DO).
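
To illustrate why DO WHILE can be cheap once the statement has been
recognized, here is a hedged Python sketch that rewrites a DO WHILE loop
as plain F77 IF/GOTO statements. It is not the compiler described in this
post; the function name lower_do_while and the label numbers are invented
for the example. A counted DO, by contrast, has to set up the loop
variable and compute an iteration count from its three parameters first.

```python
def lower_do_while(condition, body, top_label, exit_label):
    """Rewrite DO WHILE (condition) ... END DO as labeled IF/GOTO lines."""
    # Labels go in columns 1-5; statement text starts in column 7.
    lines = [f"{top_label:<5} IF (.NOT. ({condition})) GOTO {exit_label}"]
    lines += [f"      {stmt}" for stmt in body]
    lines.append(f"      GOTO {top_label}")
    lines.append(f"{exit_label:<5} CONTINUE")
    return lines


for line in lower_do_while("X .GT. EPS", ["X = X / 2.0"], 100, 200):
    print(line)
```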

                   enough from this mooncalf - Steven
----------------------------------------------------------------------------
These opinions aren't necessarily Motorola's or Remora's - but I'd like to
think we share some common views.
----------------------------------------------------------------------------
Steven R Weintraub                        cs.utexas.edu!oakhill!devsys!steve
Motorola Inc.  Austin, Texas 
(512) 440-3023 (office) (512) 453-6953 (home)
----------------------------------------------------------------------------

steve@oakhill.UUCP (steve) (11/08/88)

<1643@devsys.oakhill.UUCP>

As usual I was far too hasty in replying.  My handlers wish me to clarify
something I just said that may have sounded wrong.

In article <1643@devsys.oakhill.UUCP>, steve@oakhill.UUCP (steve) writes:
> The second question - DO WHILE - is actually very easy.  The compiler
> we wrote contained the DO WHILE.  If you are doing a regular DO statement,
> DO WHILE is very easy, with very little overhead in the parser or
> in the semantic analysis (semantic analysis for DOWHILE and DO UNTIL is
                  actions            actions
> actually easier than that of DO).
> 

Also I'm sorry for rambling in my first part.  The idea I was trying to
get across was that FORTRAN lexical analysis is not pretty (The people
who write these routines are professionals.  Children are advised not
to try these same tricks at home. :-) ), and that adding any extra syntax
to the lexical analyzer just makes this harder and uglier.  The compiler
I just finished writing of course contained comments and blank line
comments, but also was extended to have end of line comments, debugging
statements (D in column 1) and assembly statements (A in column 1).
Needless to say, to get anything that ran decently fast, many ugly
tricks were performed.

                   enough from this mooncalf - Steven

PLS@cup.portal.com (Paul L Schauble) (11/09/88)

The tricks in a Fortran lexical analyzer are indeed interesting. What I'd
like to know is: Are these documented anywhere? Surely with all of those
compiler books out there, there must be one that goes into some of the nasty
issues in Fortran.

  ++PLS

jejones@mcrware.UUCP (James Jones) (11/10/88)

In article <11112@cup.portal.com>, PLS@cup.portal.com (Paul L Schauble) writes:
> The tricks in a Fortran lexical analyzer are indeed interesting. What I'd
> like to know is: Are these documented anywhere?

All compiler construction books I've seen give various pieces of FORTRAN
syntax as prime horrible examples (arrays named EQUIVALENCE, GOTOI, or
DIMENSIONA, variables named DO1000I, and many other fun things).
You basically have to be prepared to potentially look at almost the
entire statement before you commit yourself to a parse (i.e. do
*ANYTHING* that can't be undone). 
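
The classic instance is telling a DO statement from an assignment to a
variable whose name starts with DO. A minimal Python sketch of the usual
scan follows; it assumes the whole statement is already in hand, ignores
character constants and Hollerith data, and the helper names are
hypothetical. The deciding evidence (a comma outside parentheses after
the equals sign) can sit arbitrarily far into the statement.

```python
def has_top_level_comma_after_equals(stmt):
    """True if a comma appears outside parentheses after a top-level '='."""
    depth = 0
    seen_equals = False
    for ch in stmt:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "=" and depth == 0:
            seen_equals = True
        elif ch == "," and depth == 0 and seen_equals:
            return True
    return False


def classify(statement):
    """Very rough classification: 'do-loop', 'assignment', or 'other'."""
    stmt = statement.replace(" ", "").upper()  # blanks are not significant
    if stmt.startswith("DO") and has_top_level_comma_after_equals(stmt):
        return "do-loop"
    if "=" in stmt:
        return "assignment"
    return "other"


print(classify("DO 10 I = 1,10"))     # loop over I
print(classify("DO 10 I = 1.10"))     # assigns 1.10 to the variable DO10I
print(classify("DO10I = MAX(1, 2)"))  # the comma is inside parentheses
```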

	James Jones