johnl@ima.ima.isc.com (John R. Levine) (11/10/88)
In article <11112@cup.portal.com> PLS@cup.portal.com (Paul L Schauble) writes:
>The tricks in a Fortran lexical analyzer are indeed interesting. What I'd
>like to know is: Are these documented anywhere? Surely with all of those
>compiler books out there, there must be one that goes into some of the nasty
>issues in Fortran.

I've never seen them written down in any detail, though they weren't that
difficult to figure out once I had collected a bunch of test cases, e.g.:

	REAL*4HELLO

This statement declares a REAL*4 variable named HELLO; it doesn't contain a
Hollerith string even though 4HELLO looks somewhat like one.

I recently wrote a yacc-based parser that parses an interesting subset of F77.
It handles assignments, function, subroutine, call, type declaration, common,
equivalence, if, goto, do, return, end, and a few other statement types I
forget. It correctly (I hope) identifies literal strings and distinguishes
assignment statements from keyword statements. It handles all three kinds of
comments. It seems to parse what it parses correctly; I've been passing
routines from the SSP library through it.

Extending it to all of F77 would be tedious but straightforward, mostly
involving adding more entries to the yacc grammar and token tables, and adding
more lexer states to handle things like the TO in ASSIGN statements and the
zillion END= type keywords in I/O statements.

All it does with the stuff it parses is write out a token stream suitable for
reading into Lisp for some optimization experiments I was doing. Error
checking is vestigial. It's about 16K of C and yacc source.

If anybody wants this to play with, send me a note and I'll send it to you.
The only restrictions are that my copyright notice must be maintained, and
redistribution for direct commercial advantage is forbidden without my
permission. (How one could sell it I can't imagine, but you never know.)
--
John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869
{ bbn | spdcc | decvax | harvard | yale }!ima!johnl, Levine@YALE.something
Disclaimer: This is not a disclaimer.
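
For readers wondering how a lexer can tell assignment statements from keyword
statements before it knows what the tokens are, here is a minimal sketch of
one commonly used pre-scan. It is not taken from the parser described above,
and the names (classify, skip_parens, stmt_kind) are hypothetical. It assumes
the statement has already had its blanks removed and its character and
Hollerith constants replaced by placeholders, so that stray parentheses,
commas, or equals signs inside strings cannot confuse the scan. The idea: a
statement is an assignment if it is a name, optionally followed by one or two
parenthesized groups, followed by '='; if the text after the '=' has a comma
outside all parentheses, it is a DO statement instead.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum stmt_kind { STMT_KEYWORD, STMT_ASSIGNMENT, STMT_DO };

/* Skip one balanced parenthesized group starting at s[i] == '('.
   Returns the index just past the matching ')', or 0 if unbalanced. */
static size_t skip_parens(const char *s, size_t i)
{
    int depth = 0;

    for (; s[i] != '\0'; i++) {
        if (s[i] == '(')
            depth++;
        else if (s[i] == ')' && --depth == 0)
            return i + 1;
    }
    return 0;
}

/* Classify a blank-free statement with string constants already removed.
   Statement function definitions are reported as assignments here. */
enum stmt_kind classify(const char *s)
{
    size_t i = 0;
    int groups = 0;
    int depth = 0;

    if (!isalpha((unsigned char)s[0]))
        return STMT_KEYWORD;
    while (isalnum((unsigned char)s[i]))
        i++;

    /* Subscripts and a substring range: at most two groups, as in A(I)(2:3). */
    while (groups < 2 && s[i] == '(') {
        i = skip_parens(s, i);
        if (i == 0)
            return STMT_KEYWORD;    /* unbalanced; let the parser complain */
        groups++;
    }
    if (s[i] != '=')
        return STMT_KEYWORD;

    /* A comma outside parentheses after the '=' marks a DO loop header. */
    for (i++; s[i] != '\0'; i++) {
        if (s[i] == '(')
            depth++;
        else if (s[i] == ')')
            depth--;
        else if (s[i] == ',' && depth == 0)
            return (groups == 0 && strncmp(s, "DO", 2) == 0)
                       ? STMT_DO : STMT_KEYWORD;
    }
    return STMT_ASSIGNMENT;
}
```

Under these assumptions, DO10I=1,5 comes out as a DO statement while
DO10I=1.5 is an assignment to the variable DO10I; IF(I)=3 is an assignment to
an array element while IF(X.EQ.1)Y=2 is a keyword statement, because
something other than '=' follows the first parenthesized group. Only after
this decision does the lexer know whether to look for keywords at the start
of the statement.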