acha@CS.CMU.EDU (Anurag Acharya) (01/23/91)
It seems that in focussing on abstract syntax, folks designing lazy functional programming languages like Miranda and Haskell have pretty much ignored concrete syntax. I have heard the comment - "who cares about concrete syntax? parsing is a problem of the 60s and 70s." Personally, I think that this is a rather callous approach since it pays little or no attention to the usability of the language. Attitudes like this result in anachronisms like the "off-side rule" and white-space significance hanging around long after the rest of the world has gone on. A great deal of attention has been paid to ensuring that the semantics of these languages is rigorously specified and "clean". I would expect that a fraction of this rigor and cleanliness would carry over to the specification of syntax. .... anurag
jgk@osc.COM (Joe Keane) (01/24/91)
I think the people arguing for significant use of whitespace have good intentions, but they're a little misguided. Basically i think they're solving a problem that doesn't exist. In C the syntax of statements and blocks is so simple, i can state it in less than a line: a statement ends with `;', and `{' and `}' delimit blocks. I've been using C for a long time and i have a number of gripes about the language, but this is not one of them. The syntax is simple and easy to use. Because of the simple syntax, editing C code is easy. You can cut and paste arbitrary blocks, without having to adjust the indentation if you don't feel like it. A neat property is that you can hit M-q (fill-paragraph) in Emacs, and the code may become very hard to read but it still works fine. In English the rule is simple: sentences end in `.' and questions end in `?'. You could propose a scheme where periods are optional at the end of a line, and add some way to indicate that a sentence continues to the next line. But if someone did this, people would laugh at him. Why fix what isn't broken? Also there is the issue of readability. Some people claim that the form without a semicolon `looks better'. This is clearly a matter of taste, and it probably depends on whether you're used to reading pseudo-code or C code. I actually like the other form better. If you use a normal indenting style, it's true that the semicolon is redundant, but a little redundancy can make it easier to read. In FLs we have a number of schemes which make use of whitespace. But what do you gain from this? What it comes down to is that you can avoid typing a semicolon or braces in some cases. But these schemes are more complicated than those they replace. Some languages say that you can use a semicolon to separate statements, but that it's optional in certain cases depending on surrounding whitespace. 
By the time you have rules like this, the elegance has been lost, and you have to wonder if it wouldn't be easier to just always put the semicolon.
S.Clayman@cs.ucl.ac.uk (01/26/91)
I have just read about 20 messages concerning off-side rules, white space having meaning etc... Firstly I would like to know how many of the people doing the criticism have actually USED a language with off-side rules. Secondly, languages aren't necessarily designed so that the compiler writer's task is made easier. Programming language design should consider the users of the language; ease of expressing abstract ideas should be more important than how many minutes Eric Compiler-Writer saved when doing the parser. The most important thing I want to say is I have NEVER had a problem with off-side rules and white space introducing bugs, but I have introduced bugs in C programs by having a single statement and then adding another statement at the same indentation, thinking they are in the same block of code. Having both lines indented to the same place has caused the confusion. I have just written a 2000 line Miranda program and off-side and white space aided me in the expression of my ideas, helped avoid silly errors, and made writing bug free code easier. Indentation is used for local definitions; if the compiler complained about things being off-side I went straight to them, and easily saw what the problems were. Also, I have been teaching students Miranda; they easily grasp the concept of left hand-sides and right hand-sides of expressions with local definitions to the right of the =. These are 1st years, some of whom have never seen a computer before. They can write working programs within a day, and have done complex projects after 1 term (e.g. text formatters, symbolic differentiators, stock control systems). They are now learning C, whose layout and syntax is not so easily learnt, and has a large syntactic overhead for the expressiveness. I can't imagine any of them writing similar projects after 1 term of C. I would like to add that the Haskell approach of a formal translation to a form with no off-side is a good idea, someone somewhere is to be commended for that.
stuart
dww@informatik.uni-kiel.dbp.de (Debbie Weber-Wulff) (01/28/91)
Just my 2pf worth in the offside-rule/concrete-syntax discussion: Since concrete-syntax and parsing are "solved problems" :-) I am attempting to formally prove properties about a scanner and parser for a subset of OCCAM, a language using the offside rule. You would not believe the amount of effort needed to prove that the simple algorithm of counting spaces works properly! The problem is always in "going back", i.e. reducing the indentation level. One must prove that the algorithm doesn't go back past 0, etc. This property makes the algorithm "non-compositional": you can take 2 properly indented pieces of code that are syntactically legal, and after concatenating them get an indented piece of code that is not legal. The continuation algorithm for occam is unpleasant: one may only use a new line after a binary operator (because we know that something else must be coming...) or after certain keywords, and the indentation can be *more* but not less than the previous line. Comments have to be indented at least as much as the *following* line, and what do you do with blank lines and tab characters and on and on. For the folks that feel this is context-free for a fixed k: have you ever written out the LR(80) tables for such a grammar? Theoretically, one can write a nice van Wijngaarden grammar (aka two-level grammar) that has an infinite number of members of the token class "level parenthesis", but unfortunately I have found no efficient way of transforming such a grammar into code except by LL methods. As others have said: why on earth would one introduce so much muck just to save a keystroke? Belonging to the Ann-Landers-School of if-you-turned-it-on-turn-it-off-if-you-opened-it-shut-it, I like the clarity of proper begins and ends. But then I am just a Lisp hacker which explains a lot. Debbie Weber-Wulff FU Berlin weberwu@fubinf.uucp
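The space-counting algorithm and its "going back" problem can be sketched in a few lines of Python. This is only an illustration, not the occam scanner described in the post: the name `layout_tokens`, the `INDENT`/`DEDENT` markers, and the conventions that the first line fixes the base level, that blank lines are skipped, and that tabs are rejected are all assumptions made for the sketch.

```python
def layout_tokens(lines):
    """Turn leading-space counts into INDENT/DEDENT markers (hypothetical names)."""
    tokens = []
    stack = []  # indentation widths of the blocks that are currently open
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines carry no layout information
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            raise ValueError("tabs in indentation are not handled in this sketch")
        width = len(indent)
        if not stack:
            stack.append(width)  # assumption: the first line fixes the base level
        elif width > stack[-1]:
            stack.append(width)
            tokens.append("INDENT")
        else:
            # "Going back": close blocks until we reach a level we have seen.
            while stack and width < stack[-1]:
                stack.pop()
                tokens.append("DEDENT")
            if not stack or width != stack[-1]:
                # Either we went back past the base level, or we landed
                # between two levels that were never opened.
                raise ValueError(f"inconsistent dedent to column {width}")
        tokens.append(("LINE", line.strip()))
    # Close whatever is still open at end of input.
    tokens.extend("DEDENT" for _ in stack[1:])
    return tokens
```

Under these assumptions the fragments `["a", "    b"]` and `["  c", "  d"]` are each accepted on their own, but their concatenation is rejected, because the third line returns to a column that was never opened. That is one way to see the non-compositionality: whether a fragment is legal depends on the stack left behind by whatever precedes it.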
dll@ut-emx.uucp (Don Loflin) (01/29/91)
In article <4167@osc.COM> jgk@osc.COM (Joe Keane) writes: [In 'C' notation]: -------- { { I think the people arguing for significant use of whitespace have good intentions, but they're a little misguided; Basically i think they're solving a problem that doesn't exist; } { In C the syntax of statements and blocks is so simple, i can state it in less than a line: a statement ends with `;', and `{' and `}' delimit blocks; I've been using C for a long time and i have a number of gripes about the language, but this is not one of them; The syntax is simple and easy to use;} { Because of the simple syntax, editing C code is easy; You can cut and paste arbitrary blocks, without having to adjust the indentation if you don't feel like it; A neat property is that you can hit M-q (fill-paragraph) in Emacs, and the code may become very hard to read but it still works fine;} { In English the rule is simple: sentences end in `.' and questions end in `?'; You could propose a scheme where periods are optional at the end of a line, and add some way to indicate that a sentence continues to the next line; But if someone did this, people would laugh at him; Why fix what isn't broken?} { Also there is the issue of readability; Some people claim that the form without a semicolon `looks better'; This is clearly a matter of taste, and it probably depends on whether you're used to reading pseudo-code or C code; I actually like the other form better; If you use a normal indenting style, it's true that the semicolon is redundant, but a little redundancy can make it easier to read;} { In FLs we have a number of schemes which make use of whitespace; But what do you gain from this; What it comes down to is that you can avoid typing a semicolon or braces in some cases; But these schemes are more complicated than those they replace; Some languages say that you can use a semicolon to separate statements, but that it's optional in certain cases depending on surrounding whitespace; By the time you 
have rules like this, the elegance has been lost, and you have to wonder if it wouldn't be easier to just always put the semicolon;}} --------

You really think that's more readable? Well, to each his own...:-) OK, sure, I took out all the indentation, but then, I could do the same in C, right? And the fact is, because you CAN leave it out, many DO. Take a look at any of Lee Adams'(TAB) books, the C versions -- that code is unreadable. He might as well have left it in BASIC.

Yes, English uses periods, and if C used them instead of semicolons, I'd be overjoyed. And look! English uses whitespace (!) to delimit blocks (paragraphs) instead of ugly {}'s. And can you imagine if sub-paragraphs were used in English (not just Legalese) to the extent they are in C or Lisp, and we had to use {}'s or ()'s to delimit them all? Yuck. I'd go learn something easier, like Mandarin(!).

Basically, explicit statement delimitation is probably a good idea, but block delimitation should use whitespace. In off-side rule languages, you do run into the problem of indenting yourself off the page. You could avoid that, but still reap the benefits of the off-side rule by using it only for block delimitation. I also suggest a modification to the rule such that the amount of white space in the first line of a block determines the amount that delimits. I.e., the C code:

    if (test) {
        statement1;
        while (test) {
            statement2;
            statement3;
            statement4;
        }
        if (!test) {
            statement5;
        } else {
            statement6;
        }
    }

becomes:

    if test
        statement1.
        while test
            statement2.
            statement3.
            statement4.
        if !test
            statement5.
        else
            statement6.

---Don Loflin, dll@emx.utexas.edu
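To make the proposal concrete, here is a rough Python sketch of a translator from the whitespace-delimited form back to braces. It is not anything the poster wrote: the name `to_braces` is hypothetical, the sketch assumes spaces only, and it only inserts braces and swaps a trailing `.` for `;` (it does not add the parentheses C wants around conditions). The width of each block is taken from its first line, as the proposal suggests, so different blocks may use different amounts of indentation.

```python
def to_braces(source):
    """Rewrite indentation-delimited blocks as brace-delimited ones (sketch)."""
    out = []
    stack = [0]  # indentation widths of the currently open blocks
    for raw in source.splitlines():
        if not raw.strip():
            continue  # assumption: blank lines are ignored
        width = len(raw) - len(raw.lstrip(" "))
        text = raw.strip()
        if width > stack[-1]:
            if not out:
                raise ValueError("first line must not be indented")
            out[-1] += " {"      # the previous line turns out to be a block header
            stack.append(width)  # this first line sets the block's width
        while width < stack[-1]:
            stack.pop()
            out.append(" " * stack[-1] + "}")
        if width != stack[-1]:
            raise ValueError(f"line does not match any open block: {text!r}")
        if text.endswith("."):
            text = text[:-1] + ";"  # '.' is the statement terminator in the proposal
        out.append(" " * width + text)
    while len(stack) > 1:  # close blocks still open at end of input
        stack.pop()
        out.append(" " * stack[-1] + "}")
    return "\n".join(out)
```

Fed the `if test ... else ... statement6.` example, this sketch produces four balanced brace pairs, with the `else` on its own line after the closing brace of the `if !test` block.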
kinnersley@kuhub.cc.ukans.edu (Bill Kinnersley) (01/29/91)
For all of those who've been saying that programming languages such as Miranda and occam, where leading whitespace is used in place of begin..end are difficult to implement, not context free, beyond the capability of lex and yacc, a crime against nature, etc... You might want to ftp to watserv1.uwaterloo.edu, and pick up the occam compiler there which uses lex and yacc, and see for yourself how really *trivial* it is to implement this feature! -- --Bill Kinnersley
thorinn@diku.dk (Lars Henrik Mathiesen) (01/30/91)
dll@ut-emx.uucp (Don Loflin) writes:
>Basically, explicit statement delimitation is probably a good idea, but
>block delimitation should use whitespace. In off-side rule languages,
>you do run into the problem of indenting yourself off the page. You
>could avoid that, but still reap the benefits of the off-side rule by
>using it only for block delimitation. I also suggest a modification
>to the rule such that the amount of white space in the first line of a
>block determines the amount that delimits.

By definition (Landin's) a syntactic entity governed by the off-side rule extends up to a line less indented than its _first_token_. It gets a little tiresome to see repeated suggestions that it should be modified to do what it has always done. (There is a widespread _style_ in Miranda that starts off-side blocks without going to a new line. It looks nice in typeset examples....)

Further, both Miranda and Haskell use the off-side rule for a very limited number of syntactic constructs. Haskell comes quite close to your block-only idea, but allows implicit ``statement'' delimiters by returning to the indentation of the first line in the ``block''. Miranda wants to use the off-side rule to separate the uses of "=" as definition and equality, so it's a little weird. (Guess where the three off-side blocks are in foo x y = 0, if e = 1, otherwise where e = x = y .)

Lastly, occam is not a functional language, and it doesn't use Landin's off-side rule. Criticism should go to comp.sys.transputer, if anywhere.

--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark [uunet!]mcsun!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers. thorinn@diku.dk
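The rule as stated above (an entity runs until a line indented less than its first token) can be sketched in Python as follows. The function name `offside_extent`, the sample text in `EXAMPLE`, and the treatment of blank lines are assumptions for illustration; the sample is written in a Miranda-like style but is not taken from any post in this thread.

```python
def offside_extent(lines, row, col):
    """Index of the last line of the entity whose first token is at lines[row][col].

    The entity continues until a later line is indented less than `col`,
    the column of the entity's first token (not the indentation of its line).
    """
    last = row
    for i in range(row + 1, len(lines)):
        line = lines[i]
        if not line.strip():
            continue  # assumption: blank lines neither end nor extend the entity
        indent = len(line) - len(line.lstrip(" "))
        if indent < col:
            break  # this line is off-side, so the entity ended before it
        last = i
    return last


# Hypothetical sample: the body of f starts at column 6 (the token g),
# and the local definitions start at column 12 (the token a after 'where').
EXAMPLE = [
    "f x = g a b",
    "      where a = x + 1",
    "            b = x * 2",
    "h = 3",
]
```

Here `offside_extent(EXAMPLE, 0, 6)` and `offside_extent(EXAMPLE, 1, 12)` both stop at line index 2: the block is measured from the column of its first token, so it can begin in the middle of a line without first going to a new line.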