rowan@mozart.uucp (Steve Rowan) (10/19/88)
Up to this point I have been satisfied sitting on the sidelines listening to the debate on 8x. I cannot sit back and stay quiet when I find comments in the WG5 reports like the following: >An 8x lexer and parser was demonstrated for the full Fortran >8x language. The demonstration showed that the >implementation of full Fortran 8x can be done effectively >and in a timely manner. I realize that person was just reporting what happened at the WG 5 meeting but the statement is still wrong. Anyone that expects the lexer and parser for any language, even FORTRAN 8x or Ada, to take more than 5% of the development time does not understand the problem. Anyone who has taken an undergraduate compiler course should be able to create a lexer and parser for FORTRAN 8x in six months. I don't think I would hire someone right out of school if I didn't think they could do it in three months. In the early days of computing, when parsers were written by hand and code generators were naive, parsers were half the job. Those days ended in the late 60's however. Let me give some real world examples from our Ada complier. A much closer approximation to a FORTRAN 8x compiler than this example. Granted, an Ada compiler is simpler than a FORTRAN 8x compiler but this should provide some idea of the magnitude of the problem. The front-end of the CONVEX Ada compiler, the portion that performs lexing, parsing, and semantic analysis is 100,000 lines of C code. Roughly the size of our ENTIRE vectorizing/parallelizing FORTRAN compiler. The Ada parser and lexer come to less than 5,000 lines of the front-end total. Now lets add the vectorized and code generator. Let's say another 80,000 lines of C again taking the CONVEX Ada compiler as an example. Before it's said that we don't need a vectorized let me point out that one has to do exactly the same data dependency analysis to determine when the extra temps introduced by FORTRAN 8x's copy semantics can be removed. One also has the fuse the implied 8x array notation loops into larger loops so you don't pay the loop overhead so often, then you have to vectorize the merged loops. Remember we want a good FORTRAN 8x compiler for whatever the target machine may be. Now lets add say 20,000 lines of code for the library managers necessary for dependent compilation. Now we have a compilation system that is twice as big as the compiler Callahan, Dongarra and Levine found to be the best vectorizing FORTRAN compiler in the world (See "Vectorizing Compilers: A Test Suite and Results, Technical Memorandum No. 109, Mathematics and Computer Science Division, Argonne National Laboratory). To develop a full FORTRAN 8x compiler which will optimize and do parallelization (without vectorization) will take many person-years to complete. I think one can see how the numbers add up. One other Ada example and I will stop beating this dead horse. Ada lexers and parsers were available in 1979. The first production quality Ada compiler was not validated until 1985. I choose the DEC compiler as the first real Ada compiler. The academic Ada-ED compiler just does not count in my book. If I have missed something here I would be interested in hearing the counter argument. I have only been writing vectorizing compilers, FORTRAN, C and, Ada for the past ten years. I hope this did not come off as too much of a flame but as my public review comment stated, I feel we are doing the FORTRAN community a disservice by proposing a language with features the average FORTRAN user does not want or need. I also think WG5 is doing the community a disservice by proposing the next ALGOL 68, an elegant language that no one implements a compiler for, but that is their business. I assume I will get a change to voice my opinions about the ISO FORTRAN 88 draft early next year. Steve Rowan rowan@convex.COM
johnl@ima.ima.isc.com (John R. Levine) (10/19/88)
In article <656@convex.UUCP> rowan@convex.COM (Steve Rowan) writes: >Anyone that expects the lexer and parser for any language, even FORTRAN >8x or Ada, to take more than 5% of the development time does not >understand the problem. Anyone who has taken an undergraduate compiler >course should be able to create a lexer and parser for FORTRAN 8x in six >months. ... Six months? I wrote a lexer and parser for F77 last week, for some program analysis experiments I'm doing. Parsing Fortran is, if anything, easier than parsing other languages because once you've found the token boudaries, the syntax is very simple. Tokenizing is kludgy but by the application of some well-known (to Fortran compiler writers, at least) techniques, it's no big deal. F88 syntax is doubtless more complicated, but I doubt it's an order of magnitude more complicated. Based on some experience writing INfort, an early F77 compiler, I'd say that the mandatory data analysis (common, equivalence, adjustable dimensioned arrays) would take another week or two, a simple code generator a month or two, optimized code another year or two, and correct optimized code a year after that. Don't expect to see any working F88 compilers before 1991, even if they approve the standard tomorrow. -- John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869 { bbn | think | decvax | harvard | yale }!ima!johnl, Levine@YALE.something Rome fell, Babylon fell, Scarsdale will have its turn. -G. B. Shaw
link@stew.ssl.berkeley.edu (Richard Link) (10/20/88)
In article <656@convex.UUCP> rowan@convex.COM (Steve Rowan) writes: > >Granted, an Ada compiler is simpler than a FORTRAN 8x compiler Oh my God! Is this really true? The biggest gripe against Ada is it's size; in the same vein that an elephant is a mouse designed to mil specs. Dr. Richard Link Space Sciences Laboratory University of California, Berkeley link@ssl.berkeley.edu
rowan@mozart.uucp (Steve Rowan) (10/20/88)
In article <15739@agate.BERKELEY.EDU> link@stew.ssl.berkeley.edu (Richard Link) writes: >> >>Granted, an Ada compiler is simpler than a FORTRAN 8x compiler > >Oh my God! Is this really true? > >The biggest gripe against Ada is it's size; in the same vein that an >elephant is a mouse designed to mil specs. My assertion that an Ada compiler will be simpler that a FORTRAN 8x compiler is based on the following observations. 1. FORTRAN 8x contains every feature of Ada except tasking. 2. EVERY legal FORTRAN 77 program is a legal FORTRAN 8x program. That means one has to deal with insignificant blanks, interspersing DATA statements with executable code, no reserved words, and so forth. No one has proven to me that all the modern language features can be parsed in program units with statements like DO10I=1.5 IF(I)=5 DO10I=1,5 IF(I)5,5,6 3. Any production quality FORTRAN 8x compiler will have to accept all the standard extensions like REAL*4 and INCLUDE statements that X3J3 has chosen not to standardize. 4. FORTRAN 8x has features like the powerful array notation that do not exist in Ada. I could go on but I am sure X3J3 will address many of these issues when it responds to my public review comment and those of the 600+ other concerned FORTRAN users and vendors. Steve Rowan convex.COM
david@june.cs.washington.edu (David Callahan) (10/21/88)
The following is a reply to Steve Rown (rowan@convex.COM) who used the Convex Ada implementation to measure the difficulties in implementing Fortran 8x. I am neither an expert on Fortran 8X nor on Ada so please bare with my mistakes and questions. #Granted, an Ada compiler is simpler than a FORTRAN 8x compiler but this #should provide some idea of the magnitude of the problem. Would someone please motivate this statement? Ada requires some type-inference, generic packages, and handling of multi-tasking and execptions. What are the features (aside from array-valued expressions)in Fortran 8X that require similar amounts if implementation effort? #The front-end of the CONVEX Ada compiler, the portion that performs #lexing, parsing, and semantic analysis is 100,000 lines of C code. # ... The Ada parser and lexer come to less than 5,000 lines of the #front-end total. It does not follow that a Fortran 8X front end will have similar characteristics. #Now lets add the vectorized and code generator. Let's #say another 80,000 lines of C again taking the CONVEX Ada compiler as an #example. #Before it's said that we don't need a vectorized let me point #out that one has to do exactly the same data dependency analysis to #determine when the extra temps introduced by FORTRAN 8x's copy semantics #can be removed. (I don't want to argue against using vectorization technology for optimizing Fortran 8X, but the author implies that it is necessary for essentially any efficient implementation). Dependence analysis is only part of a vectorizer. In particular, to optimize array operations, you don't have to recognize induction variables and loop invariant expressions. You don't have to pattern-match for things like sum reductions and min/max reductions. Hence, the 80k figure may be very high. The necessity of removing the implied temporary is unclear since its lifetime is short. Of coarse, when adding one to every element of a gig-word array you don't want to copy it, but it seems reasonable that for a compiler to handle the easy cases well and the let the programmer help (eliminate copies by adding loops) when things get out of hand. Explicit array syntax is intended to improve the expressiveness of the language not allow programmers to be oblivous to the efficiency of processing huge amounts of data. #Now lets add say 20,000 lines of code for the library managers necessary #for dependent compilation. What are the aspects of Fortran8x that require more library management than "incude" files? The UNIX world learned how to handle "compilation dependencies" outside of the compiler. What about Fortran 8x invalidates those techniques?. # ... the compiler Callahan, Dongarra and Levine found to be #the best vectorizing FORTRAN compiler in the world (See "Vectorizing #Compilers: A Test Suite and Results, Technical Memorandum No. 109, #Mathematics and Computer Science Division, Argonne National Laboratory). Not to disagree with the statement, but this is not a conclusion the authors make in the cited tech report. #To develop a full FORTRAN 8x compiler which will optimize and do #parallelization (without vectorization) will take many person-years to #complete. I think one can see how the numbers add up. Why is parallelization necessary? You work for a company that markets a parallel computer and I work for one that intends to, but that is certainly not a requirement of most implementations of Fortran 8X. #One other Ada example and I will stop beating this dead horse. Ada #lexers and parsers were available in 1979. The first production quality #Ada compiler was not validated until 1985. I choose the DEC compiler as #the first real Ada compiler. The academic Ada-ED compiler just does not #count in my book. Ada was developed from scatch without existing experience with a subset and includes a very complex runtime environment to support multitasking and execption handling. Aside from recursion and dynamic allocation, does Fortran 8x have any feature that requies similar support? Won't any substantive change to the language (like user defined types, recursion, bit types, and dynamic memory allocation) require several years before compilers are available and reliable? Aside from the array syntax, what features can be deleted that will significantly reduce the development time? #If I have missed something here I would be interested in hearing the #counter argument. I have only been writing vectorizing compilers, #FORTRAN, C and, Ada for the past ten years. # #I hope this did not come off as too much of a flame but as my public #review comment stated, I feel we are doing the FORTRAN community a #disservice by proposing a language with features the average FORTRAN user #does not want or need. A couple of final questions in reply: is it necessary that the "average" Fortran user (if such a beast exists) want or need a feature for that feature to be added to the language? Shouldn't it be sufficient that some substantial portion of the community would find it useful, particularly given that Fortran 77 is a subset so that the only noticable change would be in compile speed? #Steve Rowan #rowan@convex.COM David Callahan Tera Computer Company (david@june.cs.washington.edu)