ok@quintus.UUCP (Richard A. O'Keefe) (03/06/88)
Chris Moss's recent posting on the syntax part of the BSI Prolog SubStandard deserves a detailed response, which I intend to provide in the coming week. For the record, here's a brief response: (1) about half of the programs I have ever written would not be accepted by a BSI-compatible reader. (2) they've made several changes to operators. Amongst other things, a postfix operator cannot be any other kind of operator as well, but another change they made means that this restriction is pointless. (3) The attempt to describe Prolog control structures as *syntax* is fundamentally misdirected. (4) The basic structure of the BSI approach to syntax has been to cut the Gordian Goose. That is, instead of regarding the (actually rather low) diversity of Prolog syntax as a an opportunity to be solved by making the language more powerful (e.g. having a table-driven tokeniser), it has been treated as a problem to be solved by inventing a new, more restricted, language.
ok@quintus.UUCP (Richard A. O'Keefe) (03/06/88)
This still isn't the detailed comments, but my claim in article <736@cresswell> that > (3) The attempt to describe Prolog control structures as *syntax* > is fundamentally misdirected. warrants an example. I am referring to document BSI/IST/5/15 ISO/IEC JTC1 SC22 WG17 "Draft for comment: Feb 1988". (By the way, I *do* wish the BSI crowd would send me these things so that they arrived at least a week in advance of the meetings, instead of a day or two afterwards. I have never yet received notice of a meeting in time to send written comments for the meeting to consider. I haven't much confidence that they'd be read if I did send them, but it'd be nice to be given the chance.) Here is the example. In Prolog as we know and love it: p :- q. and (p :- q). are both clauses, and they are identical as clauses. This is true even in BIM Prolog's rather original native syntax. (There, Bart Demoen, I mentioned BIM Prolog!) Well, in BSI Prolog, they are both clauses, but they are *different* clauses. By rule 2.2 of BSI/IST/5/15 (henceforth to be called "the grimoire"), p :- q. is a clause which means whatever sys(procedure,[func(p,[]),func(q,[])]) means. {What *does* it mean? Ah, that's another story. I don't know either. The grimoire doesn't say.} Rule 2.2, by the way, looks like this: 2.2 clause = term, ":-", body ; Constraint X X g Abstract sys(procedure,[Head,Body]) Head Body This means something like clause(X, sys(procedure,[Head,Body])) --> term(X, Head), [:-], body(g, Body). The grimoire does say a little bit about the "Abstract" line, but it leaves the "Constraint" line completely unexplained, and I'm sorry but I can't tell you what it means, because I don't know. Back to the point. p :- q. is a clause which stands for something which looks pretty much like a clause. Now for (p :- q). By rules 5.5 and 2.1, this is a clause, which means whatever sys(procedure,[sys(procedure,[func(p,[]),func(q,[])]),func(true,[])]) means. That is to say, (p :- q). is identical to the clause (p :- q) :- true. Well, that's a bit surprising. What about :-(p, q). which in Edinburgh Prolog is identical to the other two examples. Let's look at rule 5.4. 5.4 term = atom, (next bracket | sep bracket), termlist, ")" ; Constraint h Priority 0 Abstract func(Name,Args) Name Args Condition not prefix_op(Name) [I replaced 'close bracket' by '")"' to fit the rule on one line. That's what it means anyway.] There's a language change in here, rather a deplorable one, I think. Evidently, someone with power on the committee believes that the happy marriage between function symbols and their left parentheses should be split up. The first time I read this rule, I interpreted it as meaning that a function symbol could not be a prefix operator, which would make :-(p,q) illegal. It was a wee while before it dawned on me that while spaces inside Prolog are being attacked, spaces inside the standard are sacred! (The grimoire doesn't say so, but that's the only way of making sense of some other rules as well.) The rule actually says that function symbols can be divorced from their parentheses *unless* they are prefix operators, in which case parting asunder is *out*. The Edinburgh Prolog rule was very simple: there is *never* layout between a function symbol and a "(" there is *always* layout between a prefix operator and a "(" The new rule seems very confusing to me. Pointless too. Any rate, when I had corrected my misunderstanding, it became clear that rules 5.4 and 2.1 together mean that :-(p, q). means whatever sys(procedure,[ func(:-,[func(p,[]),func(q,[])]), func(true,[]) ]) means. So, three formerly identical versions of a clause p :- q. (p :- q). :-(p, q). all mean something different in BSI Prolog. [It is possible that a later stage of interpretation may make :-(p, q) and (p :- q) identical, but they would still be identical to (p:-q):-true, not to p:-q. ] This difference, which I am sure was unintentional, flows naturally from the attempt to treat Prolog control structures as syntax in the same sense that they are syntax in Procrustes (sorry, in Pascal). The grimoire makes it plain that A , B and A & B <--- BIG MISTAKE are both mapped to sys(and,[A,B]), but whether either of these is the same as ','(A,B) or '&'(A,B) is not said. If we take the "Abstract" lines seriously, it would appear that ','(A,B) and '&'(A,B) are distinct terms, neither of which is the same as (A,B) or (A&B). {Note that &(a,b), legal in Edinburgh Prolog, is NOT legal in BSI Prolog.} Oddly enough, if one takes the grimoire literally, the user CAN declare ',' and '&' as operators, and can use them in that form. However, ',' and '&' cannot possibly have the same precedence as "," or "&" in BSI Prolog, and it seems clear that (A ',' B) and (A '&' B) must be different terms. The grimoire makes it plain that A ; B and A | B <--- DEC-10 Prolog mistake retained in BSI Prolog are both mapped to sys(or,[A,B]), but whether either of these is the same as ';'(A,B) or '|'(A,B) is not said. If we take the "Abstract" lines seriously, it would appear that ';'(A,B) and '|'(A,B) are distinct terms, neither of which is the same as (A;B) or (A|B). {Note that ;(a,b), legal in Edinburgh Prolog, is NOT legal in BSI Prolog.} Oddly enough, if one takes the grimoire literally, the user CAN declare ';' and '|' as operators, and can use them in that form. However, ';' and '|' cannot possibly have the same precedence as ";" or "|" in BSI Prolog, and it seems clear that (A ';' B) and (A '|' B) must be different terms. Note that in ( a ';' b, c ) the grouping, if it is legal at all, MUST be ( (a ';' b) , c ) in BSI Prolog. What about the "then" operator "->"? Well, unlike ",", "&", ";", and "|", "->" is an atom. (In Edinburgh Prolog, "&" and ";" are both perfectly ordinary atoms, but we are talking about BSI Prolog, where they are punctuation marks.) But rule 3.4 is 3.4 condition = goals, "->", ( goals | condition ) ; Constraint X X Abstract If,Then If Then Then which matches up with 3.2 body = condition, ";", body ; Constrant X X X Abstract sys(ifthenelse,[If,Then,Else]) If,Then Else [I've replaced 'or symbol' by '";"', "|" is also allowed.] It seems quite clear from this that [a] ( p -> q ; r ) means sys(ifthenelse,[func(p,[]),func(q,[]),func(r,[])]) but that [b] ( p '->' q ; r ) [c] ( ->(p, q) ; r ) if legal, must mean sys(or,[func(->,[func(p,[]),func(q,[])]),func(r,[])]) Now, rule 3.4 may be overly precise. The intention may have been 3.4' condition = goals, atom, ( goals | condition ) ; Constraint X X Abstract If,Then If Name Then Then Condition Name=(->) which would make [a] and [b] identical, but leave [c] different. A similar intention may exist in rules 2.2 (:-), 2.3 (?-), 2.4 (:-), and 2.5 (-->) where the quoted items are also legal atoms. But that's *not* what the grimoire actually says! There's something strange about the treatment of "{" and "}" in BSI Prolog. According to the grimoire (rule 5.6), { X } means pred('{}',[X]) while '{}'(X) means func('{}',[X]) and { } means '{}' while '{}' means func('{}',[]) This may be an oversight, as may the fact that while ( p ; q ), ( p :- q ), and { p ; q } are all legal in BSI Prolog, { p :- q } is NOT. (It is legal in Edinburgh Prolog.) I'm quite sure that the fact that rules L7, L29, L30, and L15 imply that <)> is a valid token is a mistake, but I just thought I'd mention it. Another probable glitch is the fact that the only way to get a backslash character in a quoted atom is to write its octal code. (According to rule L12, \\ is NOT a legal escape sequence.) A complaint here: if character escaping is supposed to be C-like, is it too much to ask that it should *BE* C-like? The ANSI C escape sequences \a (audible alarm) \v (vertical tab) \\ \? are completely missing. The following clauses, valid in Edinburgh Prolog, are illegal in BSI Prolog. /** By L22, this is not a legal comment **/ compare(R, X, Y) :- ( X @> Y -> R = > ; X @< Y -> R = < ; R = = ). :- op(600, xfy, .). % yes, THIS is illegal. append(H.T, L, H.R) :- append(T, L, R). not Goal :- % "not" is not a built-in operator ( ground(Goal) -> \+ Goal % neither is "\+". ; signal_error(instantiation_fault(Goal,0)) ). [The fact that the grammar proper has no negation proper is no accident. The grammar in the grimoire defines only operators with at the level of comma and weaker. Negation binds more tightly than comma, so it would have to be in the table of built-in operators. The absence of any negation operator from that table may be a mistake, or it may not. It may be the intention that one has to write \+(Goal). ] If we take the lexical rules (specifically L0,L1,L2) and rule 0 of the grimoire literally, any program containing a comment is illegal. The lexical rules very clearly classify comments as tokens, and as clearly return ALL the tokens, and rule 0 takes ALL the tokens and feeds them to the grimoire rules. There is nothing in the grimoire which says that comments are dropped! (If you are interested in seeing a grammar where this sort of thing is done *right*, I suggest that you look at The Turing Programming Language, Design and Definition. R.C.Holt, P.A.Matthews, J.A.Rosselet, & J.R.Cordy Prentice-Hall 1988 ISBN 0-13-933136-0 US$ 27 I don't like the programming language at all; but I admire they way they went about defining it.) With all due respect, may I suggest something to any members of the BSI committee who may be reading this newsgroup? In an attempt to make DEC-10 Prolog syntax even more of a de-facto standard than it was, while I was at Edinburgh I wrote a lexer and parser for DEC-10 Prolog, and gave them away free. I still do; at SLP87 the lexer was one of my examples. Quintus Prolog, SICStus Prolog, and SB Prolog all use read/1 predicates based on the parser. (There are lexical differences between SB Prolog and the others, but not syntactic differences.) I have heard of other Prologs which use this parser, but have no confirmed names. The BSI committee should do the same. There are two reasons for this. The first is that anybody can get a usable Prolog system *free* which is much more compatible with DEC-10 Prolog than anything the BSI committee have ever suggested, and this Prolog system is superior to BSI Prolog in every other way too. BSI Prolog has to provide at least this much leverage: someone who wants to switch from ALS Prolog to SB Prolog or Quintus Prolog Prolog has very little to do, but switching over to BSI Prolog will require a lot of work. If there is a free implementation of BSI syntax, as there is of Edinburgh syntax, that will considerably facilitate the conversion. But the second reason is more important. The BSI committee should get the bugs out of their grimoire as early as possible. Actually writing a program and running both the old DEC-10 Prolog library and a large collection of random data through would help a lot. It has taken me a lot longer to type this message than it took me to find the problems I have listed above, but a computer has better proof-reading skills yet! The BSI committee started work in 1984. I know that it takes a day to get a DEC-10 Prolog lexer written and tested. Does it really take a comittee over three years to get this far? Would it have taken that long if they hadn't decided to design their own language? At the Boston Logic Programming conference, Frank McCabe assured us that the standard would be ready the following year. What a good thing it wasn't, eh?
bimbart@kulcs.uucp (Bart Demoen) (03/08/88)
R. O'Keefe writes: > Chris Moss's recent posting on the syntax part of the BSI Prolog SubStandard > deserves a detailed response, which I intend to provide in the coming week. > For the record, here's a brief response: > (1) about half of the programs I have ever written would not be > accepted by a BSI-compatible reader. I hope the "detailed response" will not consist of excerpta from half of the programs of R. O'Keefe, but I am afraid it will ... But less egocentric comments on Chris Moss's recent posting, I am interested in.
ok@quintus.UUCP (Richard A. O'Keefe) (03/08/88)
As in article <737@cresswell.quintus.UUCP>, I'll refer to the document BSI/IST/5/15 > ISO/IEC JTC1 SC22 WG17 "Draft for comment: Feb 1988" as "the grimoire". (In case you didn't already know, the word "grimoire" is derived from "grammar", as is "glamour".) Before getting to the examples, I'd like to quote a sentence from Chris Moss's message. He says of the current BSI syntax that It is much closer to "Edinburgh" than some previous proposals, but does try to regularize some of the obvious defects. Absolutely true, though perhaps I don't interpret the second clause the way he does! It would be very interesting to know why the earlier proposals were discarded. I know why _I_ would have discarded them, but why did the BSI committee discard them? Now let's look at some examples. -------------------- The end-of-term token -------------------- One of the more visible changes is that "." is a perfectly good token in Edinburgh Prolog, and the end-of-term token is ".<layout>", but in BSI Prolog the end-of-term token is ".". What does this buy us? (1) It breaks most of Lee Naish's programs (because he uses the fact that you can declare '.' as an infix operator, and use it as infix cons), and some of mine (because I tend to type "write(.)", "functor(X, ., N)", "X = .(A,B,C)", and other things that rely on "." being a valid token). The BSI committee may possibly see this as an advantage. I don't. Ok, sometimes you have to trade costs against benefits; but if you want to break my code, you had better offer me some pretty good benefits! (2) Here is an example which is not legal in Edinburgh Prolog, but is legal in BSI Prolog: p(a).p(b).p(c). This is a benefit? In fact, this benefit is illusory as well as dubious. Suppose you want to put a PL/I-style comment after a clause. In BSI Prolog, you have to write p. /*that-space-was-needed!*/ because p./* is, according to the Lexical Syntax (in particular, rule L7), two tokens, "p", and "./*", just as in Edinburgh Prolog. (There is a similar problem in C++: i = j//*oops*/k; is equivalent to "i = j/k;" in C, but because C++ has // end-of-line comments, it is equivalent to "i = j" in C++. I just wanted to point out that this kind of problem is not specific to Prolog.) In Edinburgh Prolog, you know that you have to have a layout character at the end of every clause, so such problems are automatically avoided. There is a similar problem with :- op(10,fx,a).:- op(10,fx,b). ^^^ where .:- is a single token in BSI Prolog, just as it is in Edinburgh Prolog. (3) Well, maybe the change is for the sake of implementors, to make it easier to write tokenisers? Sorry, but it does the exact opposite. Edinburgh Prolog lexical syntax can require two characters of LOOKAHEAD. For example, when we see the sequence "2." we have to look at the next character, and if it is not a digit, we have two characters in hand that do not form part of the current token. However, Edinburgh Prolog doesn't require any characters of PUSHBACK. That is, when an Edinburgh Prolog parser reads a term, it reads all and only the characters which comprise the term. The layout character which is part of the end-of-term token is by definition part of the clause, and is and should be consumed by read/1. So if I do ?- get0(C). * the answer I get is C=42, because the space between the dot and the asterisk was part of the end-of-term token, and was properly consumed. Now when a BSI Prolog parser reads a term, it has to look at the character following the dot, because the query might have been ?- get0(C)..put(C). Having found that the character after the dot is NOT part of what the grimoire calls "a graphic symbol", a BSI Prolog parser had better put that character back in the input stream, otherwise it will have read something which was NOT part of the term! By the way, did you notice that I just gave an example of a query which cannot be transliterated into BSI Prolog? Suppose I try ?- get0(C). * in a BSI Prolog system. (Actually, get0/1 wasn't a BSI built-in the last time I looked, but let's pretend that it is possible to synthesise it.) The answer will be C=32, because the space after the dot is NOT part of the clause in BSI Prolog, so should not be read by read/1, so SHOULD be read by get0/1. Well, that's not what I want. How about ?- get0(C).* Sorry, "*" is a "graphic character", so ".*" is a single token, so that's not going to work either. (Yes, it is obvious how to make it work, but it's not a simple matter of transliteration.) Anyway, the point is that it is possible to write a very fast tokeniser for Edinburgh Prolog, without requiring the host I/O system to support pushback, and without having to simulate pushback, but that this is NOT possible for BSI Prolog. The formatted input facility which was being proposed last year required unbounded pushback (I am not kidding), so the BSI may not regard pushback as a problem. -------------------- No syntax for character lists -------------------- BSI Prolog introduces strings. Interfacing to Lisp or Pop or Basic or some other language which has strings is a perfectly sensible thing to do, so strings have a place in the standard. I don't even mind the fact that BSI syntax is not compatible with Arity Prolog. With someone from ESI on the committee and no-one from Arity it was inevitable that the standard would resemble ESI Prolog rather than Arity Prolog. But let's look at a sensible example of Edinburgh Prolog code. usa_phone(Area,Exchange,Number) --> "(", digits(3, Area), ") ", digits(3, Exchange), "-", digits(4, Number). This won't work in BSI Prolog, because "(" and so on are strings, not lists. Fair enough: I'm prepared to change "(" to $($ or #(# or whatever. But there isn't anything for me to change them to! If Quintus Prolog didn't have double quote, I could write [0'(], [0'),0' ], and [0'-] -- which comes from DEC-10 Prolog, as I've mentioned before -- but BSI Prolog won't let me do that either. No, I have to write the character codes as integers. As I wrote it, usa_phone works just fine in Quintus Prolog on an IBM 370 (using EBCDIC). With base zero, it works just fine in EBCDIC. But BSI Prolog forces you to write the ASCII codes or whatever. Yes, I know old versions of Quintus Prolog didn't support base zero, but it should have, to be compatible with DEC-10 Prolog. I think the way ALS handles character codes is unduly clumsy, but if that were to be the standard I could put up with it. Strings are not an adequate substitute for lists of character codes, and having a simple syntax for strings is no excuse for not having ANY syntax for lists of character codes. It makes me wonder if anyone on the BSI committee uses grammar rules. It might be argued, though, that there are only so many characters, so as new features (however dubious) are added, old ones must go. This is not true, as it happens. There is a very satisfactory solution. I leave it as an exercise for the BSI committee to work out what that solution is. (Hint: it makes the language more powerful, not less, and substantially simplifies forward conversion.) One minor gripe about strings and quoted atoms is that DEC-10 Prolog followed the Fortran/SNOBOL/... convention of doubling the quoting character, e.g. 'don''t' "quote ""me"" here". BSI Prolog partially follows the C convention. Adding C-style \escapes is one thing; breaking old code is another. Why not allow the old convention as well? Quintus Prolog does this, and it isn't hard, not at all. -------------------- A much-needed gap -------------------- Something many people have complained of is the fact that there is no standard way of reading a term without an end-of-term token. It would be quite straightforward to provide this for Edinburgh syntax, at the price of two characters of pushback. You may be familiar with the "syntax" of Prolog in the DEC-10 Prolog manual. It looks like term-read-in --> subterm(1200) end-of-term subterm(N) --> term(M), {M =< N}. ... term(0) --> functor'(' arguments ')' | list | ... | number ... It would be straightforward to provide a built-in predicate read0(-Term0) which would enter the grammar at the term(0) point, and would share the read/1 characteristic of reading all and only the characters of the term of interest. Why are two characters of pushback needed? Because the input might look like 2.@ where we cannot tell until we've read the @ that the input isn't 2.0. By juggling with definitions, we could get this down to one character of pushback: we might, for example, rule that <integer>. in _this_ context, was equivalent to <integer> alone. Could we do this with BSI Prolog? No. Because BSI Prolog is intended to allow arbitrary amounts of layout and comment between a function symbol and its left parenthesis (I say "intended to", because the grimoire doesn't actually allow comments anywhere), it requires unbounded lookahead (hence, in this context, unbounded pushback) to distinguish between f and f (1) This is not a case of breaking anything that currently works. It IS, however, a case of "filling a much-needed gap". -------------------- The strange case of -3 -------------------- This isn't in the grimoire. It's "Syntax Issue Op5". The question is this: how are negative numbers to be handled? The answer in Edinburgh Prolog is a wee bit tricky, but it works very nicely in practice. A number like -3 is read as two tokens. In a context where a prefix operator would be allowed, -3 is converted by the parser to a single number, and in that case - binds more tightly than any other operator and cannot be disabled. In a context where an infix operator is expected, -3 remains as two tokens. In Edinburgh Prolog, it follows that X is-I mod-2-3 is parsed as is(X,-(mod(-(I),-2),3)) What does "Syntax Issue Op5" say? Adopted 2 all syntactic cases of negative numbers must be converted, but built-in predicates treat - as functor. I'm sorry, but this is just as puzzling to me as it is to you. What I _think_ it means is that -3 is to be read as a single number, and that integer(-3) and atomic(-3) are to succeed, but that functor(-3, -, 1) is ALSO to succeed. WALLOP! Nearly all of my programs just bit the dust! There used to be this nice little property you could trust: for all C, atomic(C) <-> nonvar(C) & functor(C, C, 0) I don't see any sense in destroying this property. Anyone who wants -(3) can WRITE -(3). I hope I have misunderstood. What does the grimoire say? Well, it's rather interesting. L13 number = ["-"], digits, [".", digits], [exponent] ; L18 digits = digit, {digit} ; L14 exponent = ("E" | "e"), ["+" | "-"], digits ; There's a rather nasty problem here, in that it requires unbounded lookahead to recognise one of these things: 1e0000000000000000000000000000000000000000000000000 is a number, but 1e0000000000000000000000000000000000000000000000000_ is 1 with the atom 'e0000000000000000000000000000000000000000000000000_' after it. The Turing definition adopts the principle of maximal scan to rule such things out as lexical errors, but the grimoire does NOT adopt the principle of maximal scan and tries to handle such things directly in the grammar. However, the real snag with this is that it breaks code such as three_less(N, M) :- M is N-3 . because that -3 is now a single token. Maybe it's not what the BSI committee _mean_, but it's what the grimoire _says_. -------------------- Enough for today -------------------- There is a minor glitch: the grimoire requires the ISO 8859 character with code 223 to be treated as an upper case letter. The DIS 8859 draft I was sent as one of the BSI documents says that the character with that code is the German "sz" character (looks like a Beta), which is a lower-case letter. This may be due to me having an old draft: if someone knows where I can get the latest draft or the actual standard, could you tell me? I had a lot more that I meant to say in this message, arguing the point that trying to define the syntax of Prolog as if it were Pascal is the wrong _kind_ of definition, but this message is already too long. Am I being unjust to the grimoire? After all, it _is_ labelled "draft for comment". Bearing in mind that the BSI committee was set up in 1984, I don't think I'm being unjust.
cdsm@doc.ic.ac.uk (Chris Moss) (03/08/88)
> In reply to: ok@quintus.UUCP (Richard A. O'Keefe) > Chris Moss's recent posting on the syntax part of the BSI Prolog SubStandard > deserves a detailed response, which I intend to provide in the coming week. > For the record, here's a brief response: Ok, I'll take the bait!. Before I have to reply to a longer one :-) > (1) about half of the programs I have ever written would not be > accepted by a BSI-compatible reader. I suspect the reason for this comment is the existence of various types of odd directives in his programs which are not catered for in the standard. > (2) they've made several changes to operators. Amongst other things, > a postfix operator cannot be any other kind of operator as well, > but another change they made means that this restriction is > pointless. The reason for the changes is to standardise the interpretation of operators: in particular to ensure that one can't write ambiguous expressions. It may well be that the current proposal is too restrictive. If so, I'd like to see a PROOF that any other proposal works. (At last count I'd had 12 replies to my 'operator tests' posted to the net and no two of them are identical!) > (3) The attempt to describe Prolog control structures as *syntax* > is fundamentally misdirected. I assume that by 'control structures' you mean operators such as 'and' and 'or' -- I hate the word 'control' for these, they're supposed to be logical though of course they do have a control component! There are 2 basic reasons for 'freezing' the basic logical structures: 1. Manufacturers demand some consistency; they don't want to find that someone has inverted the priorities of 'and' and 'or' so a program means something entirely different to the way it reads. 2. To enable Prolog readers to be written with decent error checking and reporting facilities. I've recently seen a research report comparing Prolog with several other programming languages and it sure comes off worse! The basic reason is that EVERYTHING is operators: all a parser can say is "Syntax error ... here". > (4) The basic structure of the BSI approach to syntax has been > to cut the Gordian Goose. That is, instead of regarding the > (actually rather low) diversity of Prolog syntax as a an > opportunity to be solved by making the language more powerful > (e.g. having a table-driven tokeniser), it has been treated as > a problem to be solved by inventing a new, more restricted, > language. You may think there is low diversity but I would say the opposite. Whereas the basic semantic ideas of Prolog are well-defined there have been a number of entirely different syntaxes used to express that (I can think of at least 6 - the original Marseille, Edinburgh, microProlog, Waterloo/VM, LM Prolog, Prolog II) of which all but the first are commercial products. What other language has this diversity? The fact that one of these is undoubtedly dominant shouldn't blind you to the reservations that the others have about the 'rough edges' of the Edinburgh system. Enough for now! Chris Moss.
ok@quintus.UUCP (Richard A. O'Keefe) (03/10/88)
In article <1175@kulcs.kulcs.uucp>, bimbart@kulcs.uucp (Bart Demoen) writes: > I hope the "detailed response" will not consist of excerpta from half of the > programs of R. O'Keefe, but I am afraid it will ... > But less egocentric comments on Chris Moss's recent posting, I am interested in. D'y'know, playmates, I get the feeling that he doesn't like me. Sniffle. Here is a list of Prologs which are more similar to each other than any of them is to the BSI syntax: DEC-10 Prolog, C Prolog, Quintus Prolog, SICStus Prolog, Stony Brook Prolog, ALS Prolog, Arity Prolog. I haven't any documentation on PopLog, but when I last used it in 1984 it used 8-bit priorities as did NU7 PDP-11 Prolog (but there is some way of making it use DEC-10 Prolog compatible numbers) and had fancy syntax for call/N (I think it was Pred$(X,Y,Z) where Quintus Prolog would use call(Pred, X, Y, Z)). In both respects it departs from the BSI syntax. Apart from those respects, it was then more similar to DEC-10 Prolog than to the BSI syntax. I don't mind converting my own code. But my salary ultimately comes out of the pockets of people who buy Quintus Prolog, and a lot of them buy our Prolog because it hews so close to the Edinburgh line. Does defending the interests of these people really count as "egocentricity"? Really? Does saying that the BSI Prolog committee has no divine mandate to break Lee Naish's code constitute "eogcentricity" on my part? Does commending Arity Prolog's solution to the string/character-list problem? AAIS Prolog for the Macintosh is very different from Edinburgh Prologs (for example, the body of a clause is converted to a list) but they have provided enough hooks to customise their tokeniser so that you can get very close to Edinburgh syntax. I claim that AAIS have taken a responsible approach to such syntax changes, and that the BSI comittee ought to study the AAIS Prolog manual very carefully. Is it "egocentric" of me to say that? Perhaps Bart Demoen thinks it egocentric of me to express any opinions at all. A word of warning: Chris Moss's posting was two pages out of 25. Many of my comments were directed at aspects of the grimoire not covered in that posting. I think the topic is of sufficient interest to warrant posting the entire thing, but that's for the BSI committee to do/authorise. A big problem with the grimoire is that it is presented as a grammar for _program_ _files_, not as a grammar for terms. For example, rule 0 of the grimoire is 0 file :- tokens, program ; Input Chars Chars Tokens Output Abstract Tokens Abstract and the accompanying text says "the production 'tokens' converts a list of characters to a list of tokens. The production 'program' converts that list to a tree which represents the abstract syntax of the program." That is, what is defined is a mapping from entire files to trees (specifically, a file is mapped to a list of clauses). Did you notice that this leaves read/1 undefined? We don't need a grammar for Prolog. We need a definition of read/1, and a definition of what a term means when considered as a clause. That will be of some use to someone who writes a parser for some different input language, but the present grimoire will not. Here is something you can do in some Prolog systems. (You could not do it in compiled Dec-10 Prolog, and you can only do it with extreme difficulty in Quintus Prolog.) Suppose you want an analogue of C's #if. facility. Then you can, in some Prolog systems, define if(Condition) :- ( Condition -> true ; read(_) ). and then :- if( 1 = 1 ). <term>. will leave <term> in the input stream for compile/1 or consult/1 to notice, but :- if( 1 = 2 ). <term>. will consume <term> so that compile/1 or consult/1 will never know it had been there. The grimoire forbids this, which won't make Quintus particularly sad. The trouble is that it is not clear that the committee ever thought of it. It's not one of the Syntax Issues that accompanies the grimoire. The grimoire doesn't explicitly forbid it, but implicitly, because it defines a meaning for the file which always includes both the call to if/1 *and* the following term. Here's another example of the same kind, only worse. In some of my Pop-2 programs, I used to write until .cucharin = ## do enddo; <commentary> # Now, in some Prolog systems, you can do the same sort of thing: :- skip(0'#). <commentary> # where the stuff between the command and the sharp sign can be any old sequence of characters, and need not follow Prolog syntax. The grimoire clearly forbids this: an entire file _must_ be a sequence of clauses. Bearing in mind that this can't be done at all in compiled DEC-10 Prolog, and that it requires low cunning and much effort to do in Quintus Prolog, I'm certainly not going to argue that the standard should _require_ all implementations to permit source files to skip parts of themselves. But there are good reasons why someone might want to do this, so I think the standard should be so constructed as to _permit_ it. (The difficulty in Quintus Prolog is finding the right stream, not doing the skipping.) It is surely clear that someone reading a file of data might want to use read/1 for some things and get0/1 for others _in the same file_, but the grimoire makes no provision for that. As it is currently phrased, there is nothing in the grimoire to stop a source file closing itself: :- seeing(X), seen. That doesn't matter. The grimoire still says quite clearly that the whole file must be processed. Prolog is the same kind of language as Lisp or Pop. Common Lisp is not defined by giving a grammar for "programs" (there being no such animal). It is defined by defining 'read'. That's how to define Prolog.
ok@quintus.UUCP (Richard A. O'Keefe) (03/11/88)
With respect to operators, the current draft of the grimoire makes three changes from DEC-10 Prolog. (1) Programmers are not allowed to define operators at or above the level of comma. This means that the type checker I posted to this group would be illegal, as it declares operators 'type' and 'pred' which resemble the 'mode' operator of DEC-10 Prolog. It would also seem to have the effect of rendering any Prolog system which accepts DEC-10 Prolog 'mode' declarations non-conforming; the grimoire lists _all_ the operators it considers to be standard. (The omission of the "existential quantifier" "^" and the "integer division" operator "//" from the table on page 5 are, I trust, unintentional.) The possibility of such operators does nothing to decrease the ease with which Prolog can be parsed. I can't find the issue sheet where the reason for the change is given, but I _surmise_ that it is meant to make the language more comprehensible to people who try to read Prolog as if it were Pascal. It's a relatively minor change to the language, so we might let it pass without comment. But as Steve Hardy has pointed out: read/1 is not exclusively used to read Prolog programs! As an example, I[*] have a toy expert system shell (well, it's modelled on something which is sold as such) where you declare variables as e.g. company_type(year)/askable, company_type: cotype. cash_flow(year), working_capital(year): number. In order to make this work, I have to move ':' from its usual place in Quintus Prolog to just above ','. It is not clear to me that the BSI committee are acting in my interests when they act to break this program. Is it reasonable to (ab)use read/1 like this? Too right it is! That's what it's for! [*] Saying something about one of my programs won't please Bart Demoen. If he'll give me one of his programs, I'll be delighted to criticise it here. I can't promise that _he'll_ be delighted, though! (2) If an atom is declared as an operator, it is not allowed to be an operand of another operator. This is a breath-takingly elegant solution to the operator ambiguity problem. If Prolog were still in the design phase, I'd be applauding this as a really good idea. If this were the only difference between BSI Prolog and Edinburgh Prolog, I'd whinge a bit, but in a couple of years I'd be claiming that I'd thought of it first. The Edinburgh Prolog feature of requiring a prefix operator to be separated from a left parenthesis lest it be mistaken for a function symbol is retained. (3) Further changes have been made in the interests of making the language parseable in C. Specifically: o "A right-binding operator binds more strongly than a left-binding operator with the same declared priority number". I don't know what this means. Perhaps it means that if :- op(500, xfy, p), op(500, yfx, q). then a p b q c will be parsed as p(a, q(b,c)), which is just what Quintus Prolog does now. o "One cannot have two operator definitions for the same symbol current at the same time, with the exception of the combination of the pair: infix and prefix. This is to allow such uses as unary minus, which are firmly embedded in mathematical notation". The main point of this message is that this last restriction is unnecessary. There was an article in SigPlan Notices, some time between 1978 and 1982, which solved just this problem: you have operators which can be prefix, infix, postfix, and you have operands. How do you pick the right interpretation of the operators? Using A to stand for constants, bracketed terms, and other subterm(0)s, p to stand for a prefix operator, i for an infix operator, and s for a postfix operator (read "suffix"), a term must look like p* A {s* i p* A} s* That is, everything to the left of the left-most A must be a prefix operator, everything to the right-most A must be a postfix operator, and every string of operators between two As must be s* i p*. Note that this is not so easy in Edinburgh Prolog, because we haven't got those nice fixed As. In practice, we almost always have, which is why backtracking is not a significant cost in actual Prolog-in-Prolog parsers, but we _might_ have an an arbitrarily long input term with no fixed atoms. In BSI Prolog, because of change (2), this can't happen. So what ambiguity remains? Suppose we have two As with n operators between them. There is an n-fold ambiguity: any one of them could be the 'i', but once that is chosen, the others are fixed. What say we pick the left-most possible choice in such a group? This isn't necessarily the left-most possibly infix operator. If we have a postfix/infix operator followed by an infix-only one, it is the latter which is the left-most possible choice. If we want to be even more thorough, should there be multiple possible choices, we can use the fact that the infix operator must dominate the postfix operators preceding it and the prefix operators following it, and if there should be two or more choices remaining, we could report that as a syntax error. Why go to this sort of effort? (Since Prolog terms are _much_ shorter than Pascal programs, it is easy to over-estimate the effort required.) Well, there is a style of Prolog coding which uses operators extensively. I refer, of course, to the "SIMPLE" syntax of LPA Prolog. In that notation, one normally treats all unary predicates as postfix operators, e.g. (john runs) and all binary predicates as infix operators, e.g. (john runs bsi) I don't happen to like this, but I don't see all that much point in forbidding it either, given that it has hitherto been permitted in Edinburgh Prolog. Note that in a term such as (john runs -> john isa runner) the only possible reading is A s i A i A because -> has no prefix reading. CONCLUSION: it is possible to permit infix/postfix operators without losing deterministic parsing, thanks to the no-operators-as-operands restriction.
ok@quintus.UUCP (Richard A. O'Keefe) (03/14/88)
In article <227@gould.doc.ic.ac.uk>, cdsm@doc.ic.ac.uk (Chris Moss) writes: > Ok, I'll take the bait!. Before I have to reply to a longer one :-) > > (1) about half of the programs I have ever written would not be > > accepted by a BSI-compatible reader. > I suspect the reason for this comment is the existence of various types of > odd directives in his programs which are not catered for in the standard. > As it happens, this is not the reason. It simply didn't occur to me when I was writing that particular message to think of directives at all. I had in mind ordinary clauses.
ok@quintus.UUCP (Richard A. O'Keefe) (03/14/88)
In article <227@gould.doc.ic.ac.uk>, cdsm@doc.ic.ac.uk (Chris Moss) writes: > > (3) The attempt to describe Prolog control structures as *syntax* > > is fundamentally misdirected. > > I assume that by 'control structures' you mean operators such as 'and' and > 'or' -- I hate the word 'control' for these, they're supposed to be > logical though of course they do have a control component! > 1. Manufacturers demand some consistency; they don't want to find that > someone has inverted the priorities of 'and' and 'or' so a program means > something entirely different to the way it reads. Freezing the priorities of those operators is all that is required for this consistency. > 2. To enable Prolog readers to be written with decent error checking and > reporting facilities. I've recently seen a research report comparing > Prolog with several other programming languages and it sure comes off > worse! The basic reason is that EVERYTHING is operators: all a parser can > say is "Syntax error ... here". A parser can actually give much more precise messages than that. Was Lisp one of the languages Prolog was compared with? There really isn't much a Lisp parser can tell you either. Note that a syntax error in line 1 of a Pascal program can completely derail the parser for the rest of the file. The units that a Prolog parser works with are seldom more than a few lines long. > You may think there is low diversity but I would say the opposite. Whereas > the basic semantic ideas of Prolog are well-defined there have been a > number of entirely different syntaxes used to express that (I can think of > at least 6 - the original Marseille, Edinburgh, microProlog, Waterloo/VM, > LM Prolog, Prolog II) of which all but the first are commercial products. The mandate of the BSI committee was EXPLICITLY to derive a standard based on EDINBURGH Prolog. I'm not saying anything against the other languages, just pointing out that the actually piece of paper which said in '84 what the BSI committee was going to do said Edinburgh Prolog. So what the other languages do is irrelevant.
cdsm@doc.ic.ac.uk (Chris Moss) (03/22/88)
Forwarded for Roger Scowen -- KRG0@gm.rl.ac.uk RESPONSE TO COMMENTS FROM RICHARD O'KEEFE ON PROLOG STANDARDIZATION GENERAL RESPONSE Richard O'Keefe started by saying that he would respond to the mailing from Chris Moss. In fact many comments refer to a document (Prolog syntax, Draft 4.1) that most news readers (and members of the ISO and BSI panels) will not have seen. This seems somewhat unfair on readers who will be unable to judge whether draft, criticism, or rebuttal is justified. First some general comments. The objective is to define an International Standard for the programming language Prolog. This means that standard conforming programs will run correctly on standard conforming processors, neither more nor less. It will not limit implementers from introducing new features and facilities into their Prolog compilers. Neither will it mean programmers cannot use such extensions; only that if they do, their programs will not conform to the standard. But a standard will permit people and companies to write applications and libraries that will run on any conforming implementation and thus give them a framework in which to work. In particular, such users and their customers will not be restricted to a single implementation. A standard will also give teachers, authors and students a common core of useful Prolog. Once a feature has been included in a standard, it is almost impossible to remove. The committee remembers that Fortran has been burdened with arithmetic if statements and computed goto statements. In Prolog we hope to avoid such legacies if possible. So some features of Edinburgh Prolog will not be in the standard because although they fulfilled a need at one time, they are not a sensible longterm solution. Now some replies to specific criticism. DIVERSITY OF EXISTING PROLOG SYSTEMS > (4) The basic structure of the BSI approach to syntax has been > to cut the Gordian Goose. That is, instead of regarding the > (actually rather low) diversity of Prolog syntax as an > opportunity to be solved by making the language more powerful > (e.g. having a table-driven tokeniser), it has been treated as > a problem to be solved by inventing a new, more restricted, > language. Well, yes and no. Chris Moss has produced tests that give different results on every system tested so far. Perhaps there is rather more diversity than Richard O'Keefe realizes. One objective has been to define a syntax where many existing systems would not generally disagree on the meaning of standard-conforming programs. PROLOG CONTROL STRUCTURES AS SYNTAX > (3) The attempt to describe Prolog control structures as *syntax* > is fundamentally misdirected. This is a matter of opinion. One reason for regarding Prolog control structures as *syntax* is so that a person or program reading a Prolog program can always recognize its overall structure. NOTICE OF MEETINGS > (By the way, I > *do* wish the BSI crowd would send me these things so that they arrived > at least a week in advance of the meetings, instead of a day or two > afterwards. I have never yet received notice of a meeting in time to > send written comments for the meeting to consider. I haven't much > confidence that they'd be read if I did send them, but it'd be nice to > be given the chance.) Meetings are planned and advertised several months in advance, for example, the following meetings are already planned: BSI, London on Thursdays 2nd June, 1st Sept, 1st December 1988. Any extra meetings to discuss the syntax will be on the previous day (i.e. 1st June, etc); any meetings to discuss built-in predicates will be a week later, i.e. 9th June, etc. Everyone who wishes to attend is welcome. I admit that pressure of work means that some papers are sent only a week before the meeting. This is ample for British members of a British panel, but not for Californian members. But other papers will have been sent four or five weeks earlier. All comments, whether they are received before or after a meeting, are read and considered. ',' and '&' AS OPERATORS > Oddly enough, if one takes the grimoire literally, the user CAN > declare ',' and '&' as operators, and can use them in that form. > However, ',' and '&' cannot possibly have the same precedence as > "," or "&" in BSI Prolog, and it seems clear that (A ',' B) and > (A '&' B) must be different terms. It is not intended that it will be possible to declare ',' and '&' as operators. A MISTAKE IN COMMENTS /** By L22, this is not a legal comment **/ Thank you. This will be a valid comment in standard Prolog despite the error in this draft. QUOTE OPERATORS USED AS OPERANDS > compare(R, X, Y) :- > ( X @> Y -> R = > > ; X @< Y -> R = < > ; R = = > ). Richard O'Keefe realizes that the above example is intended to be syntactically incorrect in the standard. When operators are used as operands, there many problems of possible ambiguity. A cure is still under discussion, but some problems are avoided by the rule that "An operator used as an operand must be bracketted". AN INFIX CONS OPERATOR > The following clause, valid in Edinburgh Prolog, is illegal in BSI Prolog. > append(H.T, L, H.R) :- > append(T, L, R). We are still considering the problems posed by the multiple uses of '.', i.e. as a decimal point, as an infix cons operator, and as a clause terminator. At the same time we desire to make layout characters unimportant in determining the meaning of a program. Several possibilities have been suggested and are under consideration. NEGATION > not Goal :- % "not" is not a built-in operator > ( ground(Goal) -> \+ Goal % neither is "\+". > ; signal_error(instantiation_fault(Goal,0)) > ). > > [The fact that the grammar proper has no negation proper is no accident. > The grammar in the grimoire defines only operators with at the level of > comma and weaker. Negation binds more tightly than comma, so it would > have to be in the table of built-in operators. The absence of any > negation operator from that table may be a mistake, or it may not. It > may be the intention that one has to write > ] It is intended that Standard Prolog will not contain 'not' or '\+'. Standard Prolog will not require systems to implement true logical negation and it would be misleading to include an operator or predicate that implies that they have done so. Instead the way is left open for processors to implement a version of 'not' as an extension and still remain standard conforming. Standard Prolog will contain a built-in predicate that implements 'negation by failure', i.e. fail_if(G) :- call(G), !, fail. fail_if(_). A PARSER AS STANDARD > In an attempt to make DEC-10 Prolog syntax even more of a de-facto > standard than it was, while I was at Edinburgh I wrote a lexer and > parser for DEC-10 Prolog, and gave them away free. I still do; at > SLP87 the lexer was one of my examples. Quintus Prolog, SICStus > Prolog, and SB Prolog all use read/1 predicates based on the parser. > (There are lexical differences between SB Prolog and the others, > but not syntactic differences.) I have heard of other Prologs which > use this parser, but have no confirmed names. > > The BSI committee should do the same. A program that resolves ambiguity implicitly is not acceptable as defining a standard; there must be further definition. One reason is that a program specifies too much. Some features need to remain 'implementation dependent' because we must not specify them, for example: the accuracy and largest values of floating point numbers, or the integer value corresponding to a character. Another reason is that it is harder to understand and find errors. DISCLAIMER AND CONCLUSION Never rely on working papers and draft standards. They are subject to changes and review. All documents and working papers, however confidently expressed, are also subject to review. There will be no standard until the member bodies of ISO have approved it. The next working drafts will incorporate changes arising from further consideration and the comments received (including those from Richard O'Keefe). Many countries, but not at present USA, have national Prolog panels coordinating their views on the emerging standard. I encourage all Prolog implementers and users to participate in this effort in order that the eventual standard is one that preserves the best of the past and also provides development paths for the future. Roger Scowen, 11 March 1988
ok@quintus.UUCP (Richard A. O'Keefe) (03/23/88)
In article <234@gould.doc.ic.ac.uk>, cdsm@doc.ic.ac.uk (Chris Moss) writes: > Forwarded for Roger Scowen -- KRG0@gm.rl.ac.uk > > RESPONSE TO COMMENTS FROM RICHARD O'KEEFE ON PROLOG STANDARDIZATION > > GENERAL RESPONSE > > Richard O'Keefe started by saying that he would respond to the > mailing from Chris Moss. In fact many comments refer > to a document (Prolog syntax, Draft 4.1) that > most news readers (and members of the ISO and BSI panels) will > not have seen. > This seems somewhat unfair on readers who will be unable to judge > whether draft, criticism, or rebuttal is justified. My postings were in fact a response to Chris Moss's mailing. They were not confined to the content of that mailing, true. It seemed to me that Chris Moss's mailing implied that the BSI syntax was in a satisfactory state, and that it wasn't as difference from the de facto standard as people feared. I set out to show that neither of those statements is true, and I believe that I succeeded. Many comments did refer to a document that most news readers won't have seen. But then, most news readers won't have seen ***ANY*** of the BSI documents. Am I then to say nothing? As for fairness to readers, (a) I was quoting from the very latest document I had. Surely it would be more unfair to quote from something I believed to have been superseded? (b) The "February 88" and "Feb 88" documents arrived in my mailbox here in the same week. I had no way of telling who had or had not received the document I was quoting. All I knew was that this was the latest document available, sent to me by the author. (c) In order to permit readers to judge for themselves whether my criticisms were justified, I quoted extensively from the document. I did not ask anyone to take it on faith that this or that was the case: where the grimoire appeared to say something particularly silly I exhibited the rules responsible. This is unfair? > First some general comments. The objective is to define an > International Standard for the programming language Prolog. > This means that standard conforming programs will run correctly > on standard conforming processors, neither more nor less. > It will not limit implementers from introducing new features and > facilities into their Prolog compilers. > > Neither will it mean programmers cannot use such extensions; only > that if they do, their programs will not conform to the standard. > This is a little misleading. The general rule in other languages is that implementors can add extensions, provided that such extensions are either illegal or undefined in the standard. Thus a Pascal compiler can provide alphabetic labels as an extension. But an implementor should not provide an extension which alters the meaning of a program which the standard would have ruled legal. Let's apply this to the case of :- read(_). directives in a file which is being consulted or compiled. Specifically, let's consider a file which looks like :- read(_). p(a). and has nothing else in it. Does this define p, or does it not? The BSI grammar, in all versions, provides the syntax of entire files: according to the grimoire this MUST mean exactly one directive followed by exactly one clause. Since this is a defined and legal file, it would be most improper for an implementor to give it any other meaning. Therefore, reading out of a file being compiled or consulted is NOT a permitted extension. (This wouldn't bother Quintus, but it is legal in some other Prologs.) Let's apply this to another case: functor/3. It has always been the case in DEC-10 Prolog that functor(1, 1, 0). In at least one draft of the BSI built-in predicates document, this has been required to raise an error. (BSI Prolog includes an error handling facility; needless to say it doesn't look like IF/Prolog's or M-Prolog's or ...) So a BSI conforming program is entitled to rely on this error being raised, and an implementor may NOT provide DEC-10 compatibility. The ANSI C committee have found it necessary to explicitly indicate which identifiers may be used by implementors. (The list includes all identifiers starting with "_" or "str" and there are others I can't remember right at the moment.) Why is this? Because the programmer needs a guarantee that the identifiers he has chosen for his code won't be in conflict with an implementation. For example, (not)/1 is not defined in the BSI stuff, so Scowen says that an implementation is free to define it. But if the implementation is free to do so, then the programmer ISN'T. Since setof/3 is not in the BSI Prolog language, a program which defines setof(List, Set) :- setof(List, [], Set). setof([], Set, Set). setof([Head|Tail], Set0, Set) :- ( member(Head, Set0) -> setof(Tail, Set0, Set) ; /* not member(Head, Set0) */ setof(Tail, [Head|Set0], Set) ). is a standard-conforming program. But a Prolog system which is exactly BSI except for providing setof/3 as an extension is a conforming processor. Will such a conforming program run correctly on such a conforming processor? You must be joking. So, taken in their ordinary sense, the claim that "standard conforming programs will run correctly on standard conforming processors", while true of some standards, is NOT true of the BSI work, unless "standard conforming processors" is construed very strictly as meaning "providing NO additional built-in predicates". You will recall that Fortran 77 provides the EXTERNAL and INTRINSIC statements precisely to cope with this problem, and that ANSI C provides the reserved-to-implementors list and #undef precisely to cope with this problem. BSI Prolog does have some reserved words, but is ludicrously far from providing a solution to this problem. > So some features of Edinburgh Prolog will not be in the standard > because although they fulfilled a need at one time, they are > not a sensible longterm solution. Let's be realistic. There are languages on the horizon which are much better approximations to logic programming than Prolog. (NU Prolog has been around for a while.) There are lots of software engineering needs which old Prolog completely failed to address, such as modules. (Last I heard, the consensus of the BSI Modules subcommittee was that they would probably never agree.) I think we ought to regard Prolog as a stopgap; and that the goal of the standard should be to protect EXISTING investments in Prolog. Frankly, advocates of BSI Prolog, with its use of user-supplied atoms as stream names, are in no position to talk about sensible solutions. ************************************************************************ ** It would be most interesting to have an explicit list of the features ** of Edinburgh Prolog which fulfilled a need at one time and are now ** disliked by the committee, and a description of their replacements. ************************************************************************ > > (4) The basic structure of the BSI approach to syntax has been > > to cut the Gordian Goose. That is, instead of regarding the > > (actually rather low) diversity of Prolog syntax as an > > opportunity to be solved by making the language more powerful > > (e.g. having a table-driven tokeniser), it has been treated as > > a problem to be solved by inventing a new, more restricted, > > language. > > Well, yes and no. Chris Moss has produced tests that give > different results on every system tested so far. Perhaps there > is rather more diversity than Richard O'Keefe realizes. > One objective has been to define a syntax where many existing > systems would not generally disagree on the meaning of > standard-conforming programs. The amount of diversity one perceives depends on which "Prolog" systems one decides to include in one's sample. My sample includes only systems whose implementors _tried_ to be Edinburgh (or at least Clocksin & Mellish) compatible. For example, AAIS Prolog is openly and frankly not an Edinburgh-compatible system. We may (and should) look to it for ideas, but we should not include it in a sample of "Edinburgh compatible" Prologs. BIM Prolog has its own unique syntax; while we should perhaps include the '-c' syntax of BIM Prolog in the sample, we should not include BIM Prolog's native syntax. If we go by numbers, then Turbo Prolog should determine the syntax of standard Prolog. If not by numbers, by what? Simple justice suggests that the Prologs to look at are the Prologs whose authors TRIED to be compatible with one another. Prudence suggests the same sample. But even if the diversity among the Prologs whose authors didn't suffer from NIH-itis is much greater than I believe, that doesn't answer my point. What I said was that the diversity should be regarded "as an opportunity to be solved by making the language more powerful (e.g. having a table-driven tokeniser)". [As an aside, this is no more than Lisp and PopLog already have.] It turns out that it is quite easy to write a tokeniser which can handle all of ALS Prolog Arity Prolog BIM Prolog native syntax C Prolog DEC-10 Prolog PopLog (nested comments) Quintus Prolog Stony Brook Prolog and can almost handle ADA [ADA is no longer a trademark], simply by fiddling with a table. AAIS took exactly this approach (though their tokeniser is not as flexible as mine). I found it necessary to support several kinds of quotes in my tokeniser: ATMQT - the quoted thing is an atom (') STRQT - the quoted thing is a string ($) LISQT - the quoted thing is a list (") CHRQT - the quoted thing is a character (`) Suppose the standard were to adopt this approach, then they could rule, if they wished, that the standard assignment was "->STRQT, with nothing being assigned LISQT. That needn't prevent me reading my existing code: I'd be able to change the table while reading my old files. [The best approach seems to be to associate a read table with a stream; naturally this is the approach PopLog takes.] What I have in mind here is that a file would start with a directive such as :- use_syntax(dec10). or :- use_syntax(standard). or :- use_syntax(als). Especially if the tokeniser were made available to user code (as it is in the DEC-10 Prolog library, or built-in in NU Prolog), the result would be a much more powerful language at very little cost to the implementor. And conversion from old dialects to the BSI dialect would be enormously simplified. Do we need to come up with a "best possible" tokeniser for the standard? Of course not. Again, what are we to do about syntactic variations, such as the treatment of operators? My answer, in 1984, was that the standard should not specify read/1 and write/1, but should specify standard_read/1 standard_write/1 and should allow users to redefine read/1 and write/1, but require that the initial definitions be the standard one. consult and compile should use read/1, not standard_read/1, so that someone who wanted to read M-Prolog files into standard Prolog could do so by suitably defining read/1. Now, if you are a self-appointed standards committee member determined to impose your vision of what is a "sensible longterm solution" on every Prolog user whether they like it or not, this sort of approach won't seem all that attractive. But if, like me, you think that the people who matter in all this are the people who have paid money to USE Prolog, and if, like me, you think that the fact that M-Prolog is appalling is no reason to make life any harder for people with a lot of data in M-Prolog format than we have to, you'll think that letting people do read(Term) :- magyar_read(Term). is obviously the way to go. (It doesn't much matter how you install your own code in the hook, the important thing is that there should be a read-hook where you can install your own reader to be used by compile and consult.) > PROLOG CONTROL STRUCTURES AS SYNTAX > > (3) The attempt to describe Prolog control structures as *syntax* > > is fundamentally misdirected. > This is a matter of opinion. One reason for regarding Prolog control > structures as *syntax* is so that a person or program reading > a Prolog program can always recognize its overall structure. It is not a matter of opinion. Either I am right about this, or I am wrong. There is a very important reason for my belief: Prolog is simply not the sort of language for which this kind of thing can WORK. Consider the difference between foo(X, P, Q, L) :- bag(X, (P & Q), L). ^^^^^^^ and de_morgan((P & Q), (R | S)) :- de_morgan(P, R), de_morgan(Q, S). ^^^^^^^ The first is code, and the treatment of it in the grimoire is appropriate. (That is, it will be mapped to whatever "(and ?P ?Q)" would have been mapped to in the BSI Lisp-like syntax.) But the second is data, and the treatment of it in the grimoire is NOT appropriate. It will be mapped to whatever "(and ?P ?Q)" would have been mapped to in the BSI Lisp-like syntax, but it SHOULD be mapped to whatever "[& ?P ?Q]" would be mapped to. If we consider a slightly different example: baz(X, P, L) :- bag(X, P, L). ^ and de_morgan(not(P), R) :- de_morgan(P, R). ^ we find the opposite problem: the second is data and will be mapped to whatever "?P" will be mapped to in the BSI Lisp-like syntax, but the first is code, and should be mapped to whatever "(and ?P)" would be mapped to, BUT IT WON'T BE. The trouble is that the grimoire tries to guess whether something is code or data by looking at its form, but that's the wrong place to look: the place to look is the predicate being called. And the trouble is that we can't build that information into the grammar, because the programmer can define new predicates with code-like arguments. Let me stress this: the whole basis of the build-it-all-into-the-syntax approach is the assumption that code is code and data are data and never the twain shall meet. This is true of Pascal. It is true of Fortran. It is almost true of C. But it is utterly false of Lisp and Prolog. A grammar of this type does not make SENSE for Prolog any more than it makes sense for Lisp. I hereby wager US$100, payable once to Chris Moss, that if the next draft of the grimoire attempts to maintain this rigid distinction between code and data, I will be able to find inconsistencies like the ones above in it. I don't think it's Chris Moss's fault: if anyone can find a way of working around this basic mistake (not HIS mistake, by the way, this is the kind of grammar the BSI committee have always wanted), I'm sure that Chris Moss could. I make my wager *despite* my belief in Chris Moss's competence, because I believe that it is _impossible_ for this approach to work. (If I do not receive said draft by the end of this year, the wager will expire.) > ',' and '&' AS OPERATORS > > Oddly enough, if one takes the grimoire literally, the user CAN > > declare ',' and '&' as operators, and can use them in that form. > > However, ',' and '&' cannot possibly have the same precedence as > > "," or "&" in BSI Prolog, and it seems clear that (A ',' B) and > > (A '&' B) must be different terms. > > It is not intended that it will be possible to declare ',' and '&' > as operators. > There is nothing in the grimoire to say so, and it is a very odd restriction. Intentions are beside the point: all that matters is what the documents actually say. It *is* the intention that it should be possible to write ','(A,B) as a term, and it remains the case that ','(A,B) and '&'(A,B) must be different terms, and if we take the grimoire literally, neither of them can be the same as (A,B) or (A&B). [Yes, I know about (P|Q) and (P;Q) in Dec-10 Prolog. I have always thought and said that this was a mistake, and I think it is one of the very few areas where a difference between the standard and existing practice might be justifiable. ] > QUOTE OPERATORS USED AS OPERANDS > > compare(R, X, Y) :- > > ( X @> Y -> R = > > > ; X @< Y -> R = < > > ; R = = > > ). > > Richard O'Keefe realizes that the above example is intended to be > syntactically incorrect in the standard. When operators are > used as operands, there many problems of possible ambiguity. > A cure is still under discussion, but some problems are > avoided by the rule that "An operator used as an operand must be > bracketted". > Well, it would be more accurate to say that I COMPLAIN that it is intended to be syntactically correct in the standard. There isn't any problem of possible ambiguity here whatsoever. ) :- ( :- must be infix X @> Y @> must be infix Y -> R -> must be infix R = > = must be infix or suffix, has no suffix reading = > ; > must be atom or prefix, has no prefix reading > ; X ; must be infix and so on Now if = and > _both_ had a suffix reading, (R = >) would be ambiguous. Since neither of them has, there is no ambiguity here at all. The elimination of ambiguity is not a very good argument for breaking existing UNAMBIGUOUS code! > NEGATION > > not Goal :- % "not" is not a built-in operator > > ( ground(Goal) -> \+ Goal % neither is "\+". > > ; signal_error(instantiation_fault(Goal,0)) > > ). > It is intended that Standard Prolog will not contain 'not' or '\+'. > Standard Prolog will not require systems to implement true > logical negation and it would be misleading to include an > operator or predicate that implies that they have done so. > Instead the way is left open for processors to implement a version > of 'not' as an extension and still remain standard conforming. > Standard Prolog will contain a built-in predicate > that implements 'negation by failure', i.e. > fail_if(G) :- call(G), !, fail. > fail_if(_). My main point here was a semantic one. Most other control structures are defined in the grammar. It seems odd that ( G -> fail ; true ) should be in the grammar, but that fail_if(G) which is identical in effect, should not. Because one of these forms is in the grammar and the other isn't, they have different properties. For example, ( 1 -> fail ; true ) is syntactically illegal, but fail_if(1) is syntactically legal. There are other differences as well. If BSI Prolog contains fail_if/1, then it WILL contain '\+', but with a different name. Why not use an existing name for an existing operation? Looks to me like nonhicinventusitis. \+ is a crossed-out |-, meaning, obviously enough, "not provable". > A program that resolves ambiguity implicitly is not acceptable as > defining a standard; there must be further definition. > One reason is that a program specifies too much. Some features need to > remain 'implementation dependent' because we must not specify > them, for example: the accuracy and largest values of floating point > numbers, or the integer value corresponding to a character. > > Another reason is that it is harder to understand and find errors. It is harder to understand and find errors in a program you can run than in a never-used-anywhere-else formalism? Judging by the results, this is the opposite of the truth. What is the difference between the public-domain DEC-10 Prolog parser and the BSI grimoire? Both are programs, in a formalism based on logic. Neither is more explicit or less explicit than the other, and both are of similar size. So what is the difference? The difference is that the public-domain DEC-10 Prolog parser CAN be run, HAS been run, and has had most of the mistakes knocked out of it by actual experience. The BSI grimoire is in a new formalism, the definition of which is provided in ***NO*** BSI document (so that I had to keep guessing what things meant), and each of the three drafts I have seen was riddled with errors from end to end. I haven't told you about all the problems I found; there are nearly as many problems as rules! The BSI Prolog group HAVE specified the integer value corresponding to a character: they require the ISO 8859 character set. GREAT! The DEC-10 public-domain ***parser*** does NOT specify the integer value corresponding to a character (that's the tokeniser's job). {The old tokeniser did have ASCII codes built in, but the current version of the tokeniser uses 0'x syntax for the appropriate constants to avoid that problem.} If the BSI committee are so concerned to avoid character code problems, how come they haven't got anything like 0'x or `x` (in a standard which doesn't have to cope with existing code that uses ` as an atom, `x` is a good notation for character code constants)? The public-domain tokeniser doesn't specify anything more about floating point numbers than what they look like, it relies on being provided with a number_chars/2 predicate (which we want ANYWAY) do to the actual conversion. Note that the BSI grimoire says NOTHING about what happens if you write a constant which exceeds the capacity of your implementation. Is the program p(1.2e3456). a BSI-conforming program or not? Well, syntactically it is, but the lexical rules say nothing about what it MEANS. For all that the grimoire or any other BSI document I can recall says to the contrary, a Prolog implementation which reads this as p(0.0). is conforming. This kind of thing is a real portability problem; it exists with respect to integers too. Is 1000000000000000000 a legal Prolog term? According to the grimoire, yes. What does it mean? The grimoire doesn't say. > DISCLAIMER AND CONCLUSION > Never rely on working papers and draft standards. They are subject to > changes and review. All documents and working papers, however > confidently expressed, are also subject to review. There will be no > standard until the member bodies of ISO have approved it. But what ELSE is there to comment on? > Many countries, but not at present USA, have national Prolog panels > coordinating their views on the emerging standard. I encourage all > Prolog implementers and users to participate in this effort in order that > the eventual standard is one that preserves the best of the past > and also provides development paths for the future. > > Roger Scowen, 11 March 1988 Sorry, but it's too late. Prolog implementors and users should have been invited to contribute before the committee went on a four-year binge of inventing their own language. I explicitly suggested some years ago that the people at WISDOM should be invited to participate, and was told that that was out of the question. I have put a lot of effort into writing responses to the BSI stuff, and for all the feedback I've had I might as well have been shouting into a vacuum. The BSI committee having been resolute in their contempt for existing Prolog users (I have repeatedly urged that they should explicitly adopt a principle of not breaking existing code without strong necessity, as the ANSI C committee did, and the last I heard was that they had explicitly rejected any such idea), I cannot regard "preserves the best of the past" as anything but a sick joke. Look, if you want to preserve the best of the past, why have you renamed findall/3 to bag/3? Why have you adopted ESI Prolog-2's streams rather than Arity/Prolog's streams, despite having been told about the problems? Could it be something to do with the fact that the author of that part of the standard worked for ESI, not for Arity? Why have you dropped nl/0 from the standard? Why is there no notation for character constants such as PopLog provides? Why is the error handling facility all new, rather than resembling either IF/Prolog or M-Prolog? I have tried, I really have tried, to arouse interest in the BSI work here in the US. Do you know what has got in the way? As soon as I show people any of the BSI documents (take the 'standardisation issues' documents as an example) they say "what a pack of turkeys" and assure me that there is nothing to worry about. I remain desperately worried that there will be a BSI/ISO Prolog standard, and that it will be as bad as the current drafts, and that it will do a great deal of damage. What *really* worries me is that the people on the BSI committee don't seem to realise how bad it is.
cdsm@ivax.doc.ic.ac.uk (Chris Moss) (03/25/88)
In article <797@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes: >In article <234@gould.doc.ic.ac.uk>, cdsm@doc.ic.ac.uk (Chris Moss) writes: >> Forwarded for Roger Scowen -- KRG0@gm.rl.ac.uk >> >> RESPONSE TO COMMENTS FROM RICHARD O'KEEFE ON PROLOG STANDARDIZATION I have forwarded your response to Roger, (which is who it was from, for the unobservant) and will only answer the portion addressed directly to me. >My postings were in fact a response to Chris Moss's mailing. They were >not confined to the content of that mailing, true. It seemed to me that >Chris Moss's mailing implied that the BSI syntax was in a satisfactory >state, and that it wasn't as difference from the de facto standard as >people feared. I set out to show that neither of those statements is >true, and I believe that I succeeded. > I think you certainly proved BSI syntax was not in a satisfactory state tho I don't think I ever claimed it was. Who knows what people feared? >Many comments did refer to a document that most news readers won't have >seen. I've sent copies of the syntax to those readers who asked me for it and am quite prepared to do so for any others. Please send a snail-mail address, not e-mail. >Consider the difference between > > foo(X, P, Q, L) :- bag(X, (P & Q), L). > ^^^^^^^ >and > de_morgan((P & Q), (R | S)) :- de_morgan(P, R), de_morgan(Q, S). > ^^^^^^^ >The first is code, and the treatment of it in the grimoire is appropriate. ... >I hereby wager US$100, payable once to Chris Moss, that if the next draft >of the grimoire attempts to maintain this rigid distinction between code >and data, I will be able to find inconsistencies like the ones above in >it. Oh dear, I forsee difficulties. I regard both of those as DATA. The present syntax does. If we're going to have to agree on where the dividing line is, it won't be very fruitful for me. Or is there a neutral judge? Is that your point? Chris. . . . . . . . . . .
ok@quintus.UUCP (Richard A. O'Keefe) (03/28/88)
In article <245@gould.doc.ic.ac.uk>, cdsm@ivax.doc.ic.ac.uk (Chris Moss) writes: > >Consider the difference between > > > > foo(X, P, Q, L) :- bag(X, (P & Q), L). > > ^^^^^^^ > >and > > de_morgan((P & Q), (R | S)) :- de_morgan(P, R), de_morgan(Q, S). > > ^^^^^^^ > >The first is code, and the treatment of it in the grimoire is appropriate. > ... > >I hereby wager US$100, payable once to Chris Moss, that if the next draft > >of the grimoire attempts to maintain this rigid distinction between code > >and data, I will be able to find inconsistencies like the ones above in > >it. > Oh dear, I foresee difficulties. I regard both of those as DATA. The > present syntax does. If we're going to have to agree on where the > dividing line is, it won't be very fruitful for me. Or is there a > neutral judge? > Is that your point? What matters is not what I regard these examples as, nor yet what Chris Moss regards them as, but what the actual text of the standard says. I have in my possession three drafts of BSI Prolog syntax. In each of them, anything of the form X & Y is code, not data. I could cite the relevant rules, but if you've kept the messages so far you'll have one where I cite a rule mentioning 'and_symbol', and there is only one rule which uses 'and_symbol', and the abstract representation it constructs is a CODE (sys(and,[~P~,~Q~])) representation, not a data one (which might be func('&',[P,Q]) or func(',',[P,Q]), or it might not). -- honesty attack: one of the drafts doesn't actually allow -- (P & Q) as an argument at all, but that seems to be a mistake. -- I'm talking about the interpretation it would receive if it -- were allowed. I am rather surprised that Chris Moss regards the (P & Q) in bag(X, (P & Q), L) as *data*. It is functionally no different from the (P & Q) in ((P & Q) & fail | true), which I presume is to be regarded as code. If a BSI Prolog compiler were to compile the argument of fail_if/1, or the 2nd argument of bag/3, the programmer would have no way of telling. Since the whole point of distinguishing between code and data in the syntax is alleged to be that this permits better error messages, the fact that an arbitrarily large chunk of code (the second argument to bag/3, or the first argument to fail_if/1 or once/1) may be misread as data (if Chris Moss is right) and so may provoke quite the wrong sort of error message means that the attempt has failed. I don't agree that we "have to agree on where the dividing line is". My point is that there ISN'T any such dividing line, and that any attempt to impose a such a dividing line on Prolog will resemble Solomon's judgement, without, I fear, its happy consequence. I realised after sending my original message proposing the wager that I should suggest judges. My first idea was to suggest Udi Shapiro Ken Bowen Seif Haridi if they would be willing, but then I realised that each of them is associated with a Prolog system which has not been allowed to influence the BSI work and is as incompatible with BSI Prolog as DEC-10 Prolog or more so, so they might not be accepted as impartial. (I'm sure they would be, but it would be best not to put them in such a position.) So how about John Lloyd, if he's willing? Ok, he was at Melbourne, and Melbourne have an Edinburgh-compatible Prolog called NU Prolog, so the BSI standard will hurt them, but he isn't at Melbourne any more, and would welcome something which was closer to logic. Definition: if the BSI grimoire assigns a term an abstract interpretation whose principal functor is sys/2, that is a CODE reading. If it assigns an abstract representation which is a variable, a constant, or a compound term whose principal functor is func/2, that is a DATA reading. This applies to the drafts I have discussed in this newsgroup. A different criterion may be needed for the draft to which the wager applies. Explanation: the BSI grimoire assigns to data terms an abstract representation which embeds the space of terms into the space of terms in such a way that two terms unify if and only if their abstract representations unify. It assigns to code terms an abstract representation which, taken literally, is quite different from any data term. The intent is, I believe, to avoid overspecifying the representation of the control structures (e.g. whether (a,b) is stored as ','(a,b), '&'(a,b), or even perhaps [AND (a) (b)]. Thus the syntactic difference between code and data MAY show up as a difference in what you actually get, or it may not. To win my wager, I have to show that the next draft of the grimoire assigns an inappropriate kind of representation in one example, or that it is ambiguous with respect to which kind of representation you get at some point. The ambiguity or inappropriateness is to be determined by reference to the abstract representation only. There is a BSI document which defines consult/1 by listing code for it, which clearly and explicitly calls read/1. Not being on the committee, I haven't read all the documents as carefully as I might, so they might have abandoned this link between code syntax and read/1. By the way, it's a one-way wager: if I lose, I pay Chris Moss $100, if I win, he pays me nothing.
ok@quintus.UUCP (Richard A. O'Keefe) (03/29/88)
I have been sending messages which are rather hard on BSI Prolog syntax. Somewhat belatedly, I'd like to try to put these comments in perspective. 1. I believe that an attempt to force a rigid distinction between code and data is misguided in *any* Prolog dialect, because that distinction does not correspond to anything real in the language. I stand by my one-way US$100 wager. One of the BSI documents in my possession makes it clear, by showing Prolog code purporting to be a definition of consult/1, that consult/1 is to be based on read/1. 2. If BSI Prolog were just another dialect, or if BSI Prolog provided means whereby a programmer could install another definition of read/1 (such as the public-domain DEC-10 Prolog parser, or something that uses VM/PROLOG syntax, or something that uses BIM native syntax, or whatever), then a fair assessment of the intended syntax (as distinct from the grammar) would be "a pretty good compromise". In fact, I'd say that it is easily the best of the current fragments. I have complained about differences between BSI syntax and DEC-10 Prolog syntax. A number of things in DEC-10 Prolog syntax have been unpopular or misunderstood, and many of the changes are well motivated. But they are changes which will be imposed on Prolog users without their consent. It doesn't matter how good the BSI syntax is (and it has merits), if Prolog users *have* to use it, it will COST them. 3. BSI Prolog provides a Prolog-like syntax and a Lisp-like syntax (though I do not yet know how you say which one you want). Neither of these syntaxes, despite their merits, is sufficiently compatible with DEC-10 Prolog to read many of my files. That is, anyone who has existing DEC-10 Prolog, ALS Prolog, Arity/Prolog, &c code and wants to use a BSI conforming processor will apparently be forced to make many changes to their files if they want to use the Prolog-like syntax. This is why the strictest attainable backwards compatibility is so important. If BSI processors *offered* a standard syntax rather than *imposing* it, and provided a standard way to select syntaxes additional to the Prolog-like and Lisp-like ones, then vendors could offer compatibility modules to help their customers convert to the standard. As an example, consider Common Lisp. The native syntax of Common Lisp is differs in a great many details from the syntax of Interlisp, yet on the Xerox Lisp Machines the two dialects use the same reader, with the differences being concentrated in read-tables. I like Common Lisp syntax, someone else here likes Interlisp, and we can both use the syntax of our choice. (Even for the same files...)
kobryn@esosun.UUCP (Cris Kobryn) (03/31/88)
In article <834@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
Path: esosun!seismo!uunet!husc6!mailrus!ames!pasteur!ucbvax!decvax!decwrl!sun!quintus!ok
From: ok@quintus.UUCP (Richard A. O'Keefe)
Newsgroups: comp.lang.prolog
Date: 29 Mar 88 11:54:38 GMT
Organization: Quintus Computer Systems, Mountain View, CA
Lines: 47
I have been sending messages which are rather hard on BSI Prolog syntax.
Somewhat belatedly, I'd like to try to put these comments in perspective.
. . .
As an example, consider Common Lisp. The native syntax of Common Lisp is
differs in a great many details from the syntax of Interlisp, yet on the
Xerox Lisp Machines the two dialects use the same reader, with the differences
being concentrated in read-tables. I like Common Lisp syntax, someone else
here likes Interlisp, and we can both use the syntax of our choice. (Even
for the same files...)
Common Lisp argues against your vantage as well as for it. In order to maintain
strong backward compatibility (mostly MacLisp derivatives) Common Lisp
compromised the syntax and semantics of the language significantly. Consequently,
many Lispers ruefully think "what might have been" if the Common Lisp
committee has paid less attention to backward compatibility and more attention
to syntactic elegance. If more of the lessons learned from Scheme were incorporated
(cf. EuLisp) the Lisp community would be much better off.
There are obvious tradeoffs between backward compatibility and syntactic
elegance. However, I don't think that the conversion problems you fear
being caused by the BSI standard are especially scary. Automatic conversion
programs are no panacea, but _can_ (in moderation) be useful here.
With the perspective of several years it seems clear that Common Lisp
overemphasized compatibility at the cost of elegance. BSI seems to be
doing a much better job of this.
-- Cris Kobryn