DMB@PSUVMA.BITNET (10/08/86)
Punctuation or separators (; , .) have long been a part of programming language design. Anyone who has programmed for even the shortest amount of time will realize that these little demons are responsible for a large number of possible errors, so the question is: why have them at all?

The obvious answer is that they are not for the programmer, but for the compiler writer and his compiler. This is true for at least two reasons. First, the separator (usually a list separator such as , or ;) signals to the parser that another item follows which should be included in the list (i.e. repeat while the next symbol is a separator), instead of having to repeat until you reach an end-of-list symbol. Second, error recovery is easy, for instance with Pascal's ; because on an error the parser can go into panic mode until it sees a ; at which point it can assume that things are pretty stable and continue parsing again.

The question is, are these or any other reasons valid enough to keep these gremlins in programming language specifications? Maybe so; I assume not.

What is it that we really want from separators?
  1) To know when a parse unit begins and ends
  2) ???????
By a parse unit I mean such things as statements, blocks, declarations, etc. How do we know when a unit ends and the next begins? Until now we have used the separator.

I consider Wirth's Modula-2 an attempt to take a step away from all this separator nonsense. However, he chose not to go all the way, as the semicolon is still prevalently used. I propose that we can remove the semicolon without losing its effectiveness by insisting that statements (and in general parse units) be equally bracketed. By this I mean that every construct has some sort of end-construct at its end. In Modula-2, for instance, the for statement is

    for id := expr to expr <seqstmt> END

and similarly ifs, cases, repeats and whiles all have end "brackets".
Thus when any two of these compounds are butted up against each other, the beginning and end of each statement can be found even if one of the middle symbols is missing (a missing symbol meaning the ending symbol of the first compound statement or the beginning symbol of the second compound statement). This is exactly what the semicolon was used for.

Ah, but there is a problem: the assignment statement! Consider the statements

    a := b + c e := f + g

Without a separator, you don't know if the e is part of the first statement (missing operator) or part of the second statement, UNTIL you read the second :=. Thus I will require an end-of-assignment-statement token to be introduced (notice a weakness in my proposal here). But the assignment statement is the only case where this is needed.

Is this argument clear? Correct?

Example program with proposed syntax:

    Program Dummy
    Const (v p x) = 5
    Type thetype = array[0 5] of integer
    Var (x1 x2 x3 x4 x5):integer
        (y1 y2 y3 y4 y5):real
    Procedure proc1(var (x3 x2):integer)
    begin
      x1 := x3 + x2 END
      for x1 := 1 to x5 do
        x4 := 1 div x1 END
        while x4 < 10 do
          x4 := 1 + x4 end
        end
      end
    begin
      x3 = 0
      repeat
        proc1(x3,x5)
      until x3 > x5
    end
    end

Don't try to analyse this program; it does nothing at best.

Admittedly, this looks an awful lot like swapping ; for end, and that is probably so. But aren't ends a lot more intuitive than ; ? Maybe not, maybe so. Which is better? That's up to you. What do you think?

As a side note, does any language allow the mathematically normal syntax of

    IF (4 < x < 8) THEN blabla

I don't think so, but why not?

dave brosius
dmb at psuvma.bitnet
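The panic-mode recovery mentioned in the post above can be sketched roughly as follows. This is only an illustration in Python, not any real Pascal compiler's parser: the toy grammar (`name := name (+ name)*` statements separated by `;`) and the function names `parse_statements`, `parse_assignment` and `expect_name` are assumptions made up for the example.

```python
def expect_name(tokens, i):
    """Return the index after a name token, or raise if there is none."""
    if i >= len(tokens) or not tokens[i].isalnum():
        raise SyntaxError(f"expected a name at token {i}")
    return i + 1


def parse_assignment(tokens, i):
    """Parse `name := name (+ name)*` starting at index i; return the end index."""
    i = expect_name(tokens, i)
    if i >= len(tokens) or tokens[i] != ":=":
        raise SyntaxError(f"expected ':=' at token {i}")
    i = expect_name(tokens, i + 1)
    while i < len(tokens) and tokens[i] == "+":
        i = expect_name(tokens, i + 1)
    return i


def parse_statements(tokens):
    """Parse ';'-separated assignments, returning (statements, errors)."""
    statements, errors = [], []
    i = 0
    while i < len(tokens):
        start = i
        try:
            i = parse_assignment(tokens, i)
            statements.append(tokens[start:i])
            if i < len(tokens):
                if tokens[i] != ";":
                    raise SyntaxError(f"expected ';' at token {i}")
                i += 1
        except SyntaxError as err:
            errors.append(str(err))
            # Panic mode: discard tokens up to and including the next ';',
            # then resume parsing as if nothing had happened.
            while i < len(tokens) and tokens[i] != ";":
                i += 1
            i += 1
    return statements, errors


statements, errors = parse_statements("a := b + c ; d := := e ; f := g".split())
print(statements)
print(errors)
```

In this sketch the broken middle statement costs one error message, and the statement after the next `;` is still parsed. Feeding it `a := b + c e := f + g` instead reports a missing `;` and loses everything up to the end of input, which is the ambiguity the post describes for adjacent assignments.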
quiroz@rochester.ARPA (Cesar Quiroz) (10/10/86)
In a very interesting article, dmb@psuvm.bitnet writes:
>> Punctuation or separators (; , .) have long been a part of programming
>> language design. Anyone who has programmed for even the shortest amount of
>> time will realize that these little demons are responsible for a large amount of
>> possible errors, thus the question is why have them at all?
>>
>> ... A proposal follows, removing separators in favor
>> of fully-bracketed syntax ...
>>

The idea is certainly interesting. A few random ramblings before I forget I wanted to follow up:

1- Bravo! You are about to rediscover S-expressions (and suggest we abandon syntax as a problem?)

2- I don't think separators (or, in general, punctuation) are an unmitigated evil. As with case sensitivity, type equivalence and other areas, I think one can find people who intensely prefer one of the extremes over the other, as well as people who will be indifferent. As an example: Some languages "free" you from semicolons by expecting all your statements to end properly in a single line. The language processor supplies the delimiters needed. This occasionally causes the following interesting situation:

    x := y || z x := y || z x := y || z x := y || z

The statements above are not equivalent. For the second, the system should recognize that the statement is incomplete and, after looking in the next line, will produce the same results as the first statement. The third and fourth actually might produce syntax errors, or worse, unintended behavior, because this way of fixing the "semicolon problem" is sensitive to indentation... I consider it aesthetically repugnant, but I am pretty sure there are quite reasonable people who will find it the "right" tradeoff.

3- BASICs used to require LET as an assignment initiator (the end was bracketed by a newline). That gave the whole language a proper nesting syntax (punctuation had uses, though).
I seem to recall a similar structure for Cobol procedure division statements: <verb> <things>. But I bet there were enough irregularities to mask off this sane intention.

And again, see my note 1. What you *really* want is Lisp...

>> As a side note does any language allow the mathematically normal syntax of
>> IF (4 < x < 8) THEN blabla
>>

ICON (Griswold&Griswold) allows it. You might (or might not) like to take a look at its syntax (it contains the ugly example I gave above), but it is a generally sane language. I would count it along with 'awk' for writing text processing stuff under UNIX, but that alone doesn't do enough justice to the concepts offered in the language (the syntax, then again, ...)

I don't quite remember if Cobol allows it (perhaps *some* Cobol?)

Common Lisp extends some of the comparisons to take more than one argument, in which case the predicate is true if the arguments form a monotonic sequence. Sort of:

    if (0<x<y<1) then...

This question has recurred already. Perhaps somebody who followed the previous round could post a Definitive List of Languages Whose Comparisons Bind the Usual Way?

--
Cesar Augusto Quiroz Gonzalez
Department of Computer Science     {allegra|seismo}!rochester!quiroz
University of Rochester            or
Rochester, NY 14627                quiroz@ROCHESTER
hansen@mips.UUCP (10/10/86)
> As a side note does any language allow the mathematically normal syntax of
> IF (4 < x < 8) THEN blabla
>
> I don't think so, but why not?
>
> dave brosius
> dmb at psuvma.bitnet

Yes. I have implemented a language where this is permitted. In fact, it will accept expressions of this kind of arbitrary length, such as (a < b < c <= d), which is equivalent to (a < b) & (b < c) & (c <= d).

The only reason against implementing it is that it is harder to optimize the expression. For example, if it is known that (b < c) is true, the expression above reduces to (a < b) & (c <= d), which can only be represented clearly by explicitly inserting the AND operator. If the expression is limited to three terms, of course, this doesn't happen.

The construct is mostly useful just for its expressive power. It's shorter to write (4 < (x+(y*z)) < 6) than to write (4 < (x+(y*z))) & ((x+(y*z)) < 6), though it is straightforward, using common subexpression optimization, to make either form generate equivalent code.

--
Craig Hansen               | "Evahthun' tastes
MIPS Computer Systems      | bettah when it
...decwrl!mips!hansen      | sits on a RISC"
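As an illustration of the equivalence described above, Python happens to treat chained comparisons this way. The sketch below is not the language from the post; the helper names `chained`, `expanded` and `middle` are made up for the example. One difference worth noting is that Python's `and` short-circuits, which the `&` in the post may or may not do.

```python
def chained(a, b, c, d):
    """The 'mathematical' form: a < b < c <= d."""
    return a < b < c <= d


def expanded(a, b, c, d):
    """The same test written out with explicit conjunctions."""
    return (a < b) and (b < c) and (c <= d)


# The two forms agree on every combination of small values.
values = range(4)
assert all(
    chained(a, b, c, d) == expanded(a, b, c, d)
    for a in values for b in values for c in values for d in values
)

# In the chained form the middle operand is evaluated only once, so no
# common subexpression elimination is needed to avoid recomputing it.
calls = []


def middle(x, y, z):
    calls.append((x, y, z))  # record each evaluation
    return x + y * z


in_range = 4 < middle(1, 2, 2) < 6
print(in_range, len(calls))
```

Here `middle(1, 2, 2)` is 5, so the range test succeeds after a single call to `middle`.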
debray@megaron.UUCP (10/11/86)
> I propose we can remove the semicolon, without losing its effectiveness
> by insisting that statements (and in general parse units) be equally
> bracketed. By this I mean that every construct have some sort of
> end-construct at the end of the construct.

Try Lisp. Personally, I find that the high density of parentheses gets in the way of readability.

--
Saumya Debray
University of Arizona, Tucson
debray@arizona.edu
{allegra, cmcl2, ihnp4, ucbvax}!arizona!debray
ragerj@nucsrl.UUCP (John Rager) (10/11/86)
>Punctuation or separators (; , .) have long been a part of programming
>language design. Anyone who has programmed for even the shortest amount of
>time will realize that these little demons are responsible for a large amount of
>possible errors, thus the question is why have them at all?

We had a long discussion about this topic quite recently. I move we not repeat it again.

John Rager
bzs@bu-cs.BU.EDU (Barry Shein) (10/12/86)
This comes up fairly often, doesn't it? In teaching University courses I've observed that people find the 'separator' approach very confusing (I refer to our Intro Pascal course as "Semicolons 101") while those languages which use them as terminators (C, PL/I) people seem to find fairly intuitive (I've taught programming here at BU in Asm, PL/I, Pascal and C, as well as Survey of Programming Languages where I have taught a lot of others, fast!) I remember when people started discovering "free-format" languages (well, more or less, let's say I watched communities migrate from Fortran, Cobol and Assembler in the mid-late 70's), I think they preferred the free format approach even if they had to now understand semi-colons (usually, of course Cobol had both format restrictions and the '.' terminator.) Some statements need to be broken into several lines, either you will have to re-introduce the 'continuation' syntax a la Fortran (*some* improvement) or impose other convolutions on your language to make sure you can parse multi-line lines unambiguously (eg. not allowing assignment within an expression.) I personally think you are solving a non-problem tho I encourage you to experiment. It's languages which use punctuations as separators that are the real problem, not the punctuations. -Barry Shein, Boston University
ron@brl-sem.ARPA (Ron Natalie <ron>) (10/12/86)
Punctuation is a very natural part of any language computer or natural the reason nearly all computer languages use such syntactic conventions either explicit by ending the statements in semicolons periods or by enclosing them in parentheses or implied by using end of card or line is because this is the way most printed natural languages work I have never seen any attempt to simplify English language by leaving out the punctuation it makes for easier understanding of the printed word for humans and punctuation in computer programs makes understanding easier for both the programmer and the program -Ron
singer@spar.UUCP (10/13/86)
The language BCPL uses semi-colon as the statement separator, but the compiler is documented as being very relaxed about them being missing; basically they are only required when there would otherwise be an obvious ambiguity. Partly it does this (I think) by treating end-of-line as a possible 'implied semi-colon', ignoring it if not appropriate, and taking it if appropriate. Multiple statements on one line still had to be separated, I think. It also has a notation to join two statements in a single block

    a := 4 <> b := 5
    $( a :=4 ; b := 5 $)

a legible syntax, nice loop constructs, and also allows (5 < a < 10).

Obviously it has problems wrt typing, addressing, and independent compilation, all fixed in C; but C broke legibility, loop constructs, and much that was beautiful in the language in a terrible attempt to go for minimum typing/maximum illegibility (second only to APL, in my opinion).
kurt@fluke.UUCP (10/13/86)
Reasons for continuing to have delimiters in languages:

1. It makes error recovery easier/possible in the compiler. Don't whine and snivel and say "if the compiler-writers were any good they could do adequate error recovery without noise tokens like delimiters." That simply isn't true. What have you gained if you have a language that is a tiny bit quicker to type in but the compiler can only mark the first syntax error, then gives up? Making error recovery easier for the compiler improves its ability to find all your syntax errors in a single pass.

2. Delimiters reduce the ambiguities in a grammar. This permits the compiler to use more expressive syntactic forms. You can use more powerful forms if they can be separated and made distinct by delimiters.

3. Delimiters reduce the tendency for the compiler to accept incorrect sentences as correctly formed syntax. I recently worked on a compiler that allowed function invocations using a named argument notation. To invoke the function

    func (arg1, arg2, ... argN)

you used the invocation

    func arg1 <exp1>, arg2 <exp2>, ... argN <expN>

However, since expressions can also be function invocations, you could have invocations like

    func1 arg11 func2 arg21 <exp21>

To make a long story short, comments looked like

    ! any characters following the "!" up to the end of the line.

And if you accidentally forgot the "!", the resulting sentence frequently was accepted as a bizarre function invocation. If the argument list had been delimited by "("...")", or the argument name and argument expression had been separated by ":=", this would have been avoided.

Noise words in languages add valuable redundancy that aids the human reader, compiler, and compiler writer. I always think of the words spoken by C.A.R. Hoare in his 1983(?) Turing Lecture address, where he said approximately "Wouldn't it be wonderful if your Fairy Godmother would wave her magic wand over your program and pronounce it correct, and all you had to do was type it in three times."
garry@batcomputer.TN.CORNELL.EDU (Garry Wiegand) (10/13/86)
In a recent article ron@brl-sem.ARPA (Ron Natalie <ron>) wrote:
>Punctuation is a very natural part of any language computer or
>natural the reason nearly all computer languages use such syntactic
>conventions either explicit by ending the statements in semicolons
>periods or by enclosing them in parentheses or implied by using end
>of card or line is because this is the way most printed natural
>languages work I have never seen any attempt to simplify English
>language by leaving out the punctuation it makes for easier understanding
>of the printed word for humans and punctuation in computer programs makes
>understanding easier for both the programmer and the program

Punctuation is *not* necessarily "very natural". The addition of punctuation to written English is historically recent, is it not? (My mental archives are whispering "16th century" at me.) Anybody know the facts?

garry wiegand (garry%cadif-oak@cu-arpa.cs.cornell.edu)
ludemann@ubc-cs.UUCP (10/14/86)
You anti-punctuators might consider looking at the BCPL book (Strachey et al if I remember correctly). BCPL allows leaving out semicolons before end-of-line. If the last item on a line is an operator, then the statement is assumed to continue on the next line. Thus, a := b + c d := e a := b + c would be correct but a := b + c would be an error. So, the only use of semicolons would be something like: temp := a; a := b; b := temp; /* swap a and b */ (before anyone starts flaming about lvalues and rvalues, I know that this is NOT how it is written in BCPL) SASL (St. Andrews Static Language - Turner et al) has a method of avoiding punctuation by paying attention to indenting. I could look up the details if anyone is interested. Incidentally, a nice feature of BCPL is the ability to dynamically allocate a vector on the stack. (This feature is provided in 4.2bsd C by a function that is described as non-portable.) Also, the question was raised if any language allowed if a < b < c then ... and the answer is (of course) COBOL.
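The continuation rule described above (a line whose last item is an operator carries on to the next line) can be sketched in a few lines of Python. This illustrates the rule as the post states it and is not BCPL's actual lexer; the operator set and the name `join_statements` are assumptions for the example.

```python
# Tokens treated as operators for the purpose of this sketch (an assumption).
OPERATORS = {"+", "-", "*", "/", ":="}


def join_statements(lines):
    """Group source lines into statements.

    A line whose last token is an operator is taken to continue on the
    next line; any other line end acts as an implied statement separator.
    """
    statements, pending = [], []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # blank lines carry no information here
        pending.extend(tokens)
        if tokens[-1] not in OPERATORS:
            statements.append(pending)
            pending = []
    if pending:
        raise SyntaxError("input ends with a dangling operator")
    return statements


# Operator at the end of the first line: the statement continues.
print(join_statements(["a := b +", "     c", "d := e"]))

# Operator at the start of the second line: the first line is already
# complete, so the second line becomes a separate (ill-formed) statement.
print(join_statements(["a := b", "     + c"]))
```

Under this rule the second input splits into `a := b` followed by a statement beginning with `+`, which a later parsing stage would have to reject.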
ken@rochester.ARPA (Comfy chair) (10/14/86)
One thing that is easily forgotten is that when humans look at a program listing we see indentation and spacing and all that. Unfortunately the compiler only sees a single stream. If you think that "any decent compiler should be able to error recovery in the absence of delimiters", try reading your program on ticker tape. And no backtracking, either. Ken
steiny@scc.UUCP (Don Steiny) (10/15/86)
In article <1206@batcomputer.TN.CORNELL.EDU>, garry@batcomputer.TN.CORNELL.EDU (Garry Wiegand) writes:
>
> Punctuation is *not* necessarily "very natural". The addition of
> punctuation to written English is historically recent, is it not?

No.

> (My mental archives are whispering "16th century" at me.) Anybody
> know the facts?

I certainly know that Old English (before 1066) was punctuated, as was Middle English (Chaucer - 14th century). Punctuation of some sort is natural because it seeks to represent the natural intonation of spoken language. All written natural language is a symbolic representation of spoken natural language.

Before there was even writing, recorded events were punctuated, in a sense, by being poems. George Miller's work seems to show that humans can remember a limited number of "chunks" of information. Poems group the chunks by having rhyming units. Prose breaks the text into chunks by indicating sentence and phrase boundaries with commas, periods, and so on.

Maybe we should invent a programming language that delineated statements with iambic pentameter. Or sonnets.

    "when in disgrace with fortune and men's eyes,
    I, all alone, beweep my outcast state. . ."

--
scc!steiny
Don Steiny @ Don Steiny Software
109 Torrey Pine Terrace
Santa Cruz, Calif. 95060
(408) 425-0382
DMB@PSUVMA.BITNET (10/15/86)
My point to all those who say delimiters aid in compiler writing is that fully bracketed statement syntax does just the same thing that separators do. Plus it removes some ambiguities (the dangling else, as in Modula-2's if elsif elsif elsif else form).

dave
tue@olamb.UUCP@ndmce.uucp (Tue Bertelsen) (10/18/86)
In article <7796DMB@PSUVMA>, DMB@PSUVMA.BITNET writes:
>
> Punctuation or separators (; , .) have long been a part of programming
> language design. Anyone who has programmed for even the shortest amount of
> time will realize that these little demons are responsible for a large amount of
> possible errors, thus the question is why have them at all?
>
> The obvious answer is not for the programmer, but for the compiler writer
> and his compiler. This is true for at least two reasons.

Once upon a time there was a programming language called PLZ/SYS. It was developed by Zilog for their Z80 processor (and later became available under UNIX on the Zilog System 8000 computers).

Besides being an extremely efficient language for microprocessor programming, it contained virtues that we still fail to see in so-called 'high-level' languages:

- true modular programming
- true structured programming
- a concise syntax eliminating the need for punctuation
- a concise semantics which was logical and readable
- machine independence
- strong typing
- no gotos

Programs in PLZ/SYS contained only declarations:

- declarations of data
- declarations of actions to be performed on data (i.e. procedures)

The actual writing of programs required no punctuation, except that each token had to be separated from other tokens by delimiters. Delimiters could be anything (space, comma, semicolon, linefeeds, comments). This meant that programs could be written very readably (no inconsistent use of END), indented and spread across several lines. This made it actually possible to write programs faster and more error-free in the first run.

Based on this experience, I consider a language design that depends on punctuation a shame and an unnecessary constraint imposed on the programmer just for the purpose of easing the compiler writer's work.

Still, there are old programmers who cannot live without punctuation.
In PLZ/SYS they were free to use it if they liked - just for the purpose of improving readability. So hopefully, the next generation of languages will be free of such useless things.

To conclude, here are the 3 rules of programming:

  We DO NOT write programs in high level languages in order to instruct the computer (nor the compiler!) (if we did, we would be using hexadecimal instruction codes).

  We write programs in order to enable OTHER people to read them.

  Other people read programs in order to UNDERSTAND what the computer does when it executes the program.

In other words: write, so it can be read.

For further information on PLZ:

  "Report on the programming language PLZ/SYS"
  Tod Snook, Charlie Bass et al.
  Springer-Verlag 1979
  ISBN 3-540-90374-7

Sincerely yours,
Tue Bertelsen
AmbraSoft A/S
brad@looking.UUCP (Brad Templeton) (10/19/86)
The real question to be answered here is, "How are programs going to be created and compiled in the future?"

Now, with traditional technology, punctuation (like semicolons) is a tremendous aid to error recovery.

In the future, more programming may be done with language based editors and incremental compilation systems. There are two basic ways of doing this (although they can be combined).

The first is the template style. In this case, the punctuation isn't even typed in normally. Things like semicolons and braces are all provided by the system. The user edits only the real things in the syntax tree.

This is fine (superb for learners, in fact), but not all experienced users enjoy it. Thus the other method, where the user works with the editor like a regular text editor, but, using incremental parsing techniques, it understands what is going on anyway.

With this method, punctuation is very useful because it simplifies (and speeds up) the parsing. It also makes it more reliable.

So if you decide there shall be no punctuation in the next programming language, you make things more difficult for those who wish for LBEs.

Since punctuation is (mostly) for the parser, you have to examine future compiler methodologies to discuss it.

--
Brad Templeton, Looking Glass Software Ltd. - Waterloo, Ontario 519/884-7473
faustus@ucbcad.BERKELEY.EDU (Wayne A. Christopher) (10/19/86)
I think that a lot of the "errors" people complain that punctuation causes are syntax errors -- in my view, fixing syntax errors is the easiest part of debugging, and if there is any way that a more insidious error can be made into a syntax error, that's great. Without delimiters, unless your language forces you to type a lot of extra stuff or program in an unnatural way, it's easy to write something that's syntactically valid but not what you meant... Sure, readability is fine, but you have to make sure that the computer is reading it the same way... Wayne
dmb@psuvm.bitnet.UUCP (10/20/86)
I disagree! I think the subtleties of a programming language SHOULD be designed with the compiler writer and associated compiler in mind. A language description should be such that the compiler can report the best description of errors, report the most errors in a single compile session, and aid in optimization. My problem is that semicolons are not the way to do this.

dave

Apology: Sorry I touched such a hot topic to begin with; those growing tired can hurl abuse at me, but then again what else is there to talk about....
afd@k.cc.purdue.edu (Bob Hathaway) (10/21/86)
Indentation is an excellent way to delimit control structures. It takes advantage of the natural structure programmers impose on their programs, and does away with a lot of cumbersome punctuation. Ignoring this information is unnecessary and error-prone.

Bob Hathaway
mark@ems.UUCP (Mark H. Colburn) (10/22/86)
In article <1555@k.cc.purdue.edu> afd@k.cc.purdue.edu.UUCP (Bob Hathaway) writes:
>
>
> Indentation is an excellent way to delimit control structures.
> It takes advantage of the natural structure programmers impose
> on their programs, and does away with a lot of cumbersome punctuation.
> Ignoring this information is unnecessary and error-prone.
>
>
> Bob Hathaway

I would tend to agree, except that I have seen so many different ways to indent the same piece of code. By imposing delimiting by indentation, programmers must all use the same indentation style (since it would have to be part of the grammar of the language). I would think that this might outrage more programmers than it placates.

--
Mark H. Colburn        UUCP: ihnp4!rosevax!ems!mark
EMS/McGraw-Hill        ATT: (612) 829-8200
9855 West 78th Street
Eden Prairie, MN 55344
ken@rochester.ARPA (Comfy chair) (10/22/86)
Let's follow a more productive line of discussion (or argument :-)):

In a previous article I said that punctuation was needed by parsers for various reasons. What I failed to mention, and what some other people pointed out, is that other things can be made the cues instead of punctuation, like indentation. All you have to do in conventional parser technology is to make these explicit tokens instead of whitespace to be ignored. In interactive systems, it may even be easier to do it with indentation. (I know of one language that uses indentation, actually.)

Now I think this is an idea with promise, but let's not jump into this blindly. I think a discussion about the ramifications is worthwhile. I can think of several things:

1. What do you do about tabs? Are they equivalent to space to the next stop or what? What happens if you move to another environment with different tab stops? (One solution, use tabs only.)

2. What do you do when the user runs out of intermediate columns to add another level of nesting? Sort of like running out of numbers for lines in BASIC. (One solution, automatic reformatting.)

Well, I think that gives you the idea. There are more good new ideas to be explored instead of flaming each other about the right way it used to be done, etc. Language design isn't dead yet. Let's have more ideas.

Ken
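The suggestion of turning indentation into explicit tokens can be sketched as below, in Python. This is only one possible scheme, not the method of any language named in the thread: the token names INDENT, DEDENT and LINE, the function `indentation_tokens` and the `tab_width` parameter are all assumptions for the example. Tabs are expanded to the next tab stop, which makes the first question above concrete.

```python
def indentation_tokens(lines, tab_width=8):
    """Turn leading whitespace into explicit INDENT/DEDENT tokens.

    Each non-blank line yields ("LINE", text); a deeper line is preceded
    by ("INDENT",) and a shallower one by one ("DEDENT",) per closed level.
    """
    stack = [0]  # indentation widths of the currently open levels
    tokens = []
    for number, raw in enumerate(lines, start=1):
        if not raw.strip():
            continue  # blank lines do not affect nesting
        expanded = raw.expandtabs(tab_width)  # tab = spaces to next stop
        width = len(expanded) - len(expanded.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            tokens.append(("INDENT",))
        else:
            while width < stack[-1]:
                stack.pop()
                tokens.append(("DEDENT",))
            if width != stack[-1]:
                raise IndentationError(f"line {number}: inconsistent dedent")
        tokens.append(("LINE", expanded.strip()))
    # Close every level still open at end of input.
    tokens.extend(("DEDENT",) for _ in stack[1:])
    return tokens


source = [
    "while x < 10",
    "    x := x + 1",
    "    if x = 5",
    "\ty := 0",  # indented with a tab rather than spaces
    "done := 1",
]
for token in indentation_tokens(source, tab_width=8):
    print(token)
```

With `tab_width=8` the tab-indented line nests inside the `if`; with `tab_width=4` it lands at the same level as the `if` and no INDENT is produced, so the same file means two different programs depending on the tab stops.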
lwall@sdcrdcf.UUCP (Larry Wall) (10/23/86)
I think different statements should be different colors. Or perhaps stand out from the screen differently in three dimensions. Instead of top-down programming we can have front-to-back, or some such. I think you ought to be able to hide one statement behind another. Maybe you could do conditionals that way. Hey, did I just invent 3 dimensional programming languages? Larry Wall {allegra,burdvax,cbosgd,hplabs,ihnp4,sdcsvax}!sdcrdcf!lwall
mouse@mcgill-vision.UUCP (der Mouse) (10/27/86)
In article <21590@rochester.ARPA>, ken@rochester.ARPA (Comfy chair) writes:
> One thing that is easily forgotten is that when humans look at a
> program listing we see indentation and spacing and all that.
> Unfortunately the compiler only sees a single stream.

*Current* compilers see only a stream of tokens. Compilers could be and probably have been written which used/use the indentation to intuit intended nesting, assisting in such things as error recovery.

> If you think that "any decent compiler should be able to error
> recovery [sic]
> in the absence of delimiters", try reading your program on ticker
> tape. And no backtracking, either.

Why no backtracking? A compiler usually has access to enough virtual memory to hold the entire source file available for random access. Even when this is impossible, the compiler can normally seek around in the file, which gives the same capability with a performance penalty. I am ignoring smaller machines (eg, PCs) here; but I think you would find that a backtracking compiler doesn't look at very much when it backtracks - just things like "what was the indentation at the beginning of this loop?" - and hence doesn't need to remember anything like the entire source file.

					der Mouse

USA: {ihnp4,decvax,akgua,utzoo,etc}!utcsri!mcgill-vision!mouse
     think!mosart!mcgill-vision!mouse
Europe: mcvax!decvax!utcsri!mcgill-vision!mouse
ARPAnet: think!mosart!mcgill-vision!mouse@harvard.harvard.edu

Aren't you glad you don't shave with Occam's Razor?
craig@comp.lancs.ac.uk (Craig Wylie) (10/31/86)
In article <21836@rochester.ARPA> ken@rochester.UUCP (Comfy chair) writes:
>
>.... In interactive systems, it may even be easier
>to do it with indentation. (I know of one language that uses
>indentation, actually.)

Miranda uses indentation -- if you know of any other interactive languages using indentation let me know. Occam uses indentation but is compiled. The indentation scheme in Occam is forced on the user as two spaces per level, while Miranda uses the offside rule (Peter Landin's idea, I believe).

One of the big problems with indentation as a structure notation is that of multi-line statements. The suggestion of using a backslash at the end of a line to be continued just feels like a backward step. I know it is used in many UNIX utilities (e.g. make and sh) but that doesn't mean it is elegant :-). I cannot help but feel that continuation symbols in a programming language should be left in Fortran. We should be looking for a way of automatically determining structure. It should be possible for a parser to determine if the next line is a continuation. A combination of indentation and whether the next input token would be legal as a continuation should be enough.

Certainly syntax-directed editors would make things much easier.

Craig.

--
UUCP: ...!seismo!mcvax!ukc!dcl-cs!craig| Post: University of Lancaster,
DARPA: craig%lancs.comp@ucl-cs         |       Department of Computing,
JANET: craig@uk.ac.lancs.comp          |       Bailrigg, Lancaster, UK.
Phone: +44 524 65201 Ext. 4146         |       LA1 4YR
Project: Cosmos Distributed Operating Systems Research Group