norvell@utcsri.UUCP (Theodore Stevens Norvell) (04/28/88)
In article <3925@killer.UUCP> elg@killer.UUCP (Eric Green) writes: [By the way, are you the Eric Green who used to live in Halifax?] >I have used a language that uses indentation as its block denotation (Promal). >It gets ugly, folks. That, instead of the formalism objections, should be what >[...] Free-form languages simply offer too many advantages to go back to >fixed-format languages (COBOL!!! GARGH!). David Gast writes > It might also be possible, to use the Grammar Rule Notation, a la > Prolog to describe the indentation levels. I do not know (because > I have not really considered it). The GNR is more powerful than > a CFG. An extra argument to the grammar rules could indicate the > level of indentation. > The discussion about using indentation to indicate program structure is interesting to me as I have designed a language which does exactly this and written a compiler for it. My feeling is that the parser should read programs (approximately) the same way that people do. In particular the most important information used by people to understand the control structure of programs -- namely the indentation structure -- should not be simply thrown out by the compiler. Many of the objections raised by people turn out not to be serious. First, the lack of formal notation is not serious. David Gast is right, a simple Definite Clause Grammar captures it in a fairly readable way. Let me give an example. I am assuming that tokens are records with at least two fields, one to identify the kind of token and one with the column that the first character of the token appeared in. The end-of-file token always starts in column 0. A special nonterminal called lookahead(T) recognizes the empty string iff T is the next token in the input string. statement(Indentation) --> [token(name, Indentation)], [token( := , _)], expression. statement(Indentation) --> [token(while, Indentation)], expression, block(Indentation). statement(Indentation) --> [token(if, Indentation)], expression, block(Indentation), moreif(Indentation). moreif(Indentation) --> [token(else, Indentation)], block(Indentation). moreif(Indentation) --> [token(elsif, Indentation)], expression, block(Indentation), moreif(Indentation). moreif(Indentation) --> lookahead(token(_, Ind)), {Ind <= Indentation}. block(Indentation) --> statement(Ind), {Ind > Indentation}, block(Indentation). block(Indentation) --> lookahead(token(_, Ind)), {Ind <= Indentation}. program --> block(0), [token(end-of-file,0)]. This says (i) that each statement in the body of a while loop must be indented more than the word "while", (ii) that each statement in an alternative of an if statement must be indented more than the word "if", and (iii) that the words "else" and "elsif" must come directly beneath the corresponding "if". (The last restriction is stronger than need be, but solves the else attachment problem cleanly and provides some redundancy required for error-detection/recovery). It is not quite a "free form" language, but it allows any style at all which would be considered "readable" by most people. It allows statements to be split across lines; it doesn't need statement separators or terminators; it allows more than one statement per line. It is no burden to the lexer to produce the column number of each token, as it should do so anyway for accurate error reporting. It is easy to translate such a grammar into a recursive decent parser or an explicit-stack top-down parser. It is probably not hard to translate it into a table driven top-down parser (akin to LL(k)). It is likely that there are bottom-up techniques too, but I haven't thought about it. The point of all this, is that many of the objections to languages with significant indentation raised are due to misconceptions (misconceptions that are understandable, given that most people haven't used such languages or have used languages, like Promal, with lots of other semi-related restrictions.) There are still problems with such a language. One is readability: It requires a lot of indentation to make it easy, e.g., to match else's to if's. Another is writablity: you must be careful not to accidentally over-indent the lines which (you think) follow an if or while statement, or they won't. A third is error recovery: but that is always tricky. A solution to all of these is to make the language less free form by insisting that all statements in the same block start at the same indentation level. Finally, such specification and parsing techniques are useful even in a compiler (or lint) for a "conventional" free form language. They can be used to warn the programmer that his indentation structure does not match his syntactic structure and they can be used to guide error recovery. Theodore Norvell (poor earnest academician) BITNET,CSNET: norvell@csri.toronto.edu CDNNET: norvell@csri.toronto.cdn UUCP: {alegra,cornell,decvax,linus,utzoo}!utcsri!norvell
nevin1@ihlpf.ATT.COM (00704a-Liber) (05/04/88)
In article <5992@utcsri.UUCP> norvell@utcsri.UUCP (Theodore Stevens Norvell) writes: >The discussion about using indentation to indicate program structure is >interesting to me as I have designed a language which does exactly this >and written a compiler for it. My feeling is that the parser should >read programs (approximately) the same way that people do. In particular >the most important information used by people to understand the control >structure of programs -- namely the indentation structure -- should >not be simply thrown out by the compiler. This kind of stuff is better enforced by language-sensitive editors and pretty-printers, not compilers. I don't mind if there is some lint-type program that gives me warnings if I make an indentation mistake, but I don't want the compiler choking on it. It is a hard job to define what 'readability' is, and I really don't want to slow my compiler down by putting some unnecessary restrictions on indentation. And without language-sensitive editors and/or pretty-printers, it is very hard always getting the indentation right. -- _ __ NEVIN J. LIBER ..!ihnp4!ihlpf!nevin1 (312) 510-6194 ' ) ) "The secret compartment of my ring I fill / / _ , __o ____ with an Underdog super-energy pill." / (_</_\/ <__/ / <_ These are solely MY opinions, not AT&T's, blah blah blah
carroll@snail.CS.UIUC.EDU (05/05/88)
/* Written 6:44 pm May 3, 1988 by nevin1@ihlpf.ATT.COM in snail:comp.lang.misc */ /* ---------- "Re: Indentation for parsing (was Re" ---------- */ ( ... ) And without language-sensitive editors and/or pretty-printers, it is very hard always getting the indentation right. _ __ NEVIN J. LIBER ..!ihnp4!ihlpf!nevin1 (312) 510-6194 /* End of text from snail:comp.lang.misc */ Interesting. I have no problem getting my indentation correct all the time, and I use standard vi, which is not language sensitive. I wouldn't like an indentation driven compiler, however, because that would enforce someone else's style on me, and readability is in the eye of the reader. Alan M. Carroll amc@woodshop.cs.uiuc.edu carroll@s.cs.uiuc.edu Grad Student (TA) / U of Ill - Urbana ...{ihnp4,convex}!uiucdcs!woodshop!amc "Too many fools who don't think twice Too many ways to pay the price" - AP & EW
dg@lakart.UUCP (David Goodenough) (05/07/88)
From article <4625@ihlpf.ATT.COM>, by nevin1@ihlpf.ATT.COM (00704a-Liber): > In article <5992@utcsri.UUCP> norvell@utcsri.UUCP (Theodore Stevens Norvell) writes: >>The discussion about using indentation to indicate program structure is >>interesting to me as I have designed a language which does exactly this >>and written a compiler for it. ..... > > This kind of stuff is better enforced by language-sensitive editors and > pretty-printers, not compilers. ..... And then there's always OCCAM > ..... And without > language-sensitive editors and/or pretty-printers, it is very hard always > getting the indentation right. This last comment almost deserves a flame. What is needed to produce correctly indented source code with any editor (be it vi, VEDIT or whatever) is *_DISCIPLINE_*. In 1980 at university, I decided on a coding style: if (test) { statement; statement; for (xyz; abc; foo) statement; } And I stick to it militantly - 8 years worth of it. It is now so ingrained that I do it as a reflex. ALWAYS produces readable code, and I can do a 'egrep "[{}]" \!*' to look for lines that contain {s and }s, and spot mismatced {}s in a twinkle. This is one of my objections to the other style: if (test) { statement; ..... It is too easy to mismatch, as you have to hunt for the { I've got my asbestos suit ready for the inevitable flames :-) :-) :-) -- dg@lakart.UUCP - David Goodenough +---+ | +-+-+ ....... !harvard!adelie!cfisun!lakart!dg +-+-+ | +---+
nevin1@ihlpf.ATT.COM (00704a-Liber) (05/24/88)
In article <94@lakart.UUCP> dg@lakart.UUCP (David Goodenough) writes: >This last comment almost deserves a flame. What is needed to produce >correctly indented source code with any editor (be it vi, VEDIT or whatever) >is *_DISCIPLINE_*. [stuff deleted about Mr. Goodenough's coding style] I agree that, for *new*, *original* source code, discipline is adequate for producing correctly indented source code. However, much of my job entails maintaining old code where I was not the original programmer. It is in this code where a smart editor would be valuable. I would like to be able to look at another person's code with my indentation style, make the changes I need to make, and save it in such a way so that if someone else looks at the source it appears to them in their style. All too often unreadability can be attributed to too many people modifying the same pieces of code using different styles (although if a routine has to be changed many times it probably should be rewritten, but that is a different issue). It should be possible to separate things like indentation style from coding. Another thing: Suppose you have a little routine which you decide to put inside an 'if' clause. Usually this would require indenting each line manually. With a smart editor, this should happen automatically. -- _ __ NEVIN J. LIBER ..!ihnp4!ihlpf!nevin1 (312) 510-6194 ' ) ) "The secret compartment of my ring I fill / / _ , __o ____ with an Underdog super-energy pill." / (_</_\/ <__/ / <_ These are solely MY opinions, not AT&T's, blah blah blah