[comp.lang.prolog] Prefix operators and blanks

pgl@cup.portal.com (Peter G Ludemann) (12/04/89)

N40 says that "Any layout text before the term is ignored.  A
read term ends with the end character followed by a layout text
i.e. a layout char or a comment."   It goes on to say that
"[there] shall be no layout text between the functor and the
opening bracket of a compound term if the functor is a prefix
operator".  Please correct me if I'm wrong, but doesn't `p (x)'
produce a syntax error on Edinburgh Prolog if `p' isn't a
prefix operator?   N40 would seem to allow this.

N33 states (under "Prolog syntax problems" (2)) that "spaces should
be as insignificant as possible" and refers to N9 which says
that a space is not used to distinguish between a functor and a
prefix operator (I don't have N9 handy right now, but I did
look this up once).

Richard O'Keefe has argued that blanks are significant in
English and they should be significant in Prolog.  This
argument uses a false analogy.  Punctuation characters are used
differently in English than in programming languages -- even
COBOL doesn't use English rules for hyphens and parentheses.
Instead, we should work with a "principle of least surprise",
using other programming languages as a reference.  Edinburgh
Prolog is unique among programming languages in making a
distinction between `p(a,b)' and `p (a,b)'.  [Please don't tell
me about some macro processors which distinguish between these
two cases -- most macro processors have a syntax which can be
at best described as painful.]

If one decides to define 'not' as a prefix operator and also
chooses to use it as a functor (e.g. `not(a,b)'), one is
asking for trouble and should instead disambiguate by extra
quote marks or parentheses: 'not'(a,b) or not((a,b)).

I therefore suggest that the Prolog standard specifically
disallow using a prefix operator as a functor.  The use of
quotes and extra parentheses allows clear disambiguation.  This
will break a few programs, but in an acceptable way (i.e., they
will produce easily fixed syntax errors, rather than be
accepted silently in an unexpected way).
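
The four spellings of `not' discussed above can be tried out with a
short sketch. It assumes SWI-Prolog 7 or later (term_to_atom/2 is not
part of any standard), and the helper names shape/2 and show/0 are
made up for illustration. On a reader that follows the Edinburgh
convention, `not(a,b)' and 'not'(a,b) should come out with two
arguments, and `not (a,b)' and not((a,b)) with one; other readers may
disagree, which is the problem at hand.

```prolog
% Sketch only: assumes SWI-Prolog 7+; term_to_atom/2 is not ISO.
:- op(900, fy, not).          % declare `not` as a prefix operator

% shape(+Text, -Name/Arity): parse Text and report its principal functor.
shape(Text, Name/Arity) :-
    term_to_atom(Term, Text),
    functor(Term, Name, Arity).

% show/0 prints how each spelling is parsed on the system at hand.
show :-
    forall(member(Text, ['not(a,b)', 'not (a,b)',
                         '\'not\'(a,b)', 'not((a,b))']),
           ( shape(Text, Shape),
             format("~w  ~w~n", [Text, Shape]) )).
```

Calling `show' lists each spelling next to the functor the reader
actually built for it.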


- Peter Ludemann      pgl@cup.portal.com
                      ...!sun!portal!cup!pgl

   --- my opinions are my sole responsibility, of course ---

lee@munnari.oz.au (Lee Naish) (12/06/89)

In article <24703@cup.portal.com> pgl@cup.portal.com (Peter G Ludemann) writes:
>I therefore suggest that the Prolog standard specifically
>disallow using a prefix operator as a functor.

I think it is very important that there is a canonical way to write a
term so it can always be read back in, no matter what the operator
precedences are.  Peter's suggestion seems to break this.

	lee

matsc@sics.se (Mats Carlsson) (12/06/89)

In article <24703@cup.portal.com> pgl@cup.portal.com (Peter G Ludemann) writes:

   If one decides to define 'not' as a prefix operator and also
   chooses to use it as a functor (e.g. `not(a,b)'), one is
   asking for trouble and should instead disambiguate by extra
   quote marks or parentheses: 'not'(a,b) or not((a,b)).

   I therefore suggest that the Prolog standard specifically
   disallow using a prefix operator as a functor.  

But what about prefix operators that are atoms that need quoting?
Disallowing them altogether seems a rather severe restriction.  If
they are allowed, quotes cannot be used for the disambiguating rule in
the first paragraph.  Are you suggesting that an extra pair of
parentheses would be required in that case?

Peter's suggestion would have an impact on the canonical syntax of
Prolog.  I don't have the WG17 draft handy, but I can't imagine that
it fails to define a canonical syntax which is independent of any
operator declarations that happen to be declared, so that a printed
term can always be read back in.  Many Prologs provide
write_canonical/[1-2] as built-in predicates exactly for this purpose.

With Peter's suggestion ALL functors would have to be printed with
quotes in the canonical syntax, e.g.
	'f'(1,2).
to cater for the case that it might be read in with 'f' declared as a
prefix operator.
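
A minimal sketch of the round trip in question, assuming SWI-Prolog
(with_output_to/2 and term_to_atom/2 are not standard; the names
canonical_text/2 and round_trip/1 are invented here). It writes a
ground term quoted and without operator notation, then declares `not'
as a prefix operator before reading the text back, so the operator
table differs between writing and reading.

```prolog
% Sketch only: assumes SWI-Prolog; not a definition of "canonical syntax".

% canonical_text(+Term, -Text): write Term quoted, ignoring operators.
canonical_text(Term, Text) :-
    with_output_to(atom(Text),
                   write_term(Term, [quoted(true), ignore_ops(true)])).

% round_trip(+Term): a ground Term survives being written and read back,
% even though `not` is made a prefix operator in between.
round_trip(Term) :-
    canonical_text(Term, Text),
    op(900, fy, not),              % side effect: changes the operator table
    term_to_atom(Back, Text),
    Back == Term.
```

For example, `round_trip(a+b)' and `round_trip(not(a,b))' can be used
to check a particular system.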

The idea of letting quotes influence how terms are parsed seems to
blur the distinction between lexical and syntactical analysis.  It
seems cleaner to treat quotes strictly at the lexical level.  The only
other language where "atoms" can be quoted that I know of is LISP, but
LISP does not have user defined operators, so it's hard to find an
analogy to justify Peter's suggestion.

Another way of avoiding the ambiguity with `not(a,b)', `not (a,b)',
...  (assuming spaces before the `(' are no longer significant) would
be to simply disallow the use of `,' as an infix operator.

--
Mats Carlsson
SICS, PO Box 1263, S-164 28  KISTA, Sweden    Internet: matsc@sics.se
Tel: +46 8 7521543      Ttx: 812 61 54 SICS S      Fax: +46 8 7517230

pgl@cup.portal.com (Peter G Ludemann) (12/07/89)

The only canonical way of writing out a compound term so that
it can be read in without regard to operator definitions is to
quote everything and use no operator notation.  For example:

	a+b		===>	'+'('a','b')
	:- (a,b)	===>	':-'(','('a','b'))

Why all the quotes?  Well, someone could have defined `a' to be
an operator and therefore '+'(a,b) would get a syntax error.

What I am suggesting is that if `+' is an operator, then you will
be forbidden to write in your program:

	+(a,b)		or
	+ (a,b)

You must write instead:

	'+'(a,b)	or
	'+'((a,b))	etc.

(BTW, there is no ambiguity if comma is not an operator.  That is
the one big advantage of the Waterloo syntax - `&' is used instead
and comma is only used as an argument separator.  But I am NOT
suggesting deviating that far from Edinburgh syntax.)

- peter ludemann	pgl@cup.portal.com   sun!portal!cup!pgl

	--- standard disclaimer ---

matsc@sics.se (Mats Carlsson) (12/08/89)

In article <MATSC.89Dec6095053@vishnu.sics.se> matsc@sics.se (Mats Carlsson) writes:

   Another way of avoiding the ambiguity with `not(a,b)', `not (a,b)',
   ...  (assuming spaces before the `(' are no longer significant) would
   be to simply disallow the use of `,' as an infix operator.

I would like to retract this comment.  As John Dowding at UNISYS has
pointed out, it would either require a difference between the syntax
of clauses and the syntax of terms, or make clauses look extremely
awkward.
--
Mats Carlsson
SICS, PO Box 1263, S-164 28  KISTA, Sweden    Internet: matsc@sics.se
Tel: +46 8 7521543      Ttx: 812 61 54 SICS S      Fax: +46 8 7517230

micha@ecrcvax.UUCP (Micha Meier) (12/08/89)

In article <24703@cup.portal.com> pgl@cup.portal.com (Peter G Ludemann) writes:
>If one decides to define 'not' as a prefix operator and also
>chooses to use it as a functor (e.g. `not(a,b)'), one is
>asking for trouble and should instead disambiguate by extra
>quote marks or parentheses: 'not'(a,b) or not((a,b)).
>
>I therefore suggest that the Prolog standard specifically
>disallow using a prefix operator as a functor.  The use of
>quotes and extra parentheses allows clear disambiguation.

The use of quotes to hide the operator precedence is one of the weakest
points introduced by BSI. The quotes are used both to make an atom
and to disable the operators, although parentheses are clearly
the right way to do this. The use of these two must not be arbitrary:
quotes should be used only to make an atom out of a sequence of characters
that would normally not be recognized as an atom, and parentheses are used
to denote a term, hence also the token 'ATOM'.

If one decides to forbid prefix operators as functors, then
clearly not(a, b), (not)(a, b) and 'not'(a, b) are wrong, and so
only not((a, b)) should be possible.

However, forbidding prefix operators as functors does not seem to
me the best choice; I would say that forbidding, say, not(p(a))
is worse than making not(a, b) and not (a, b) different.
It is true that the former is merely tedious, because you have
to edit all the places which are wrong but the parser
will report all of them, whereas with the latter
the user may end up calling the wrong procedure and then has to use
the debugger to find it (unless the procedure is undefined).
Still, it is good programming style to use parentheses
when using terms with main functor ',' no matter if the
predicate is an operator or not, and so I think all the good guys
should not be forced to use parentheses just because of
some bad guys that use bad style and ask for trouble anyway :-)

This is what we implemented in Sepia: spaces are not significant
except between a prefix operator and the opening parenthesis, and so far
we have had no complaints about this.

--Micha Meier

aarons@syma.sussex.ac.uk (Aaron Sloman) (12/09/89)

pgl@cup.portal.com (Peter G Ludemann) writes:

> Date: 7 Dec 89 07:00:13 GMT
>
> The only canonical way of writing out a compound term so that
> it can be read in without regard to operator definitions is to
> quote everything and use no operator notation.  For example:
>
> 	a+b		===>	'+'('a','b')
>   :- (a,b)    ===>    ':-'(','('a','b'))
>
> Why all the quotes?  Well, someone could have defined `a' to be
> an operator and therefore '+'(a,b) would get a syntax error.

etc.

and
References: <24703@cup.portal.com> <MATSC.89Dec6095053@vishnu.sics.se>
matsc@sics.se (Mats Carlsson) writes:

> Date: 6 Dec 89 08:50:53 GMT
> Organization: Swedish Institute of Computer Science, Kista
    [.... portion omitted.....]
> ................ The only
> other language where "atoms" can be quoted that I know of is LISP, but
> LISP does not have user defined operators, so it's hard to find an
> analogy to justify Peter's suggestion.

Pop-11, like Lisp, allows atoms to be quoted, and does have user
defined operators (and built in ones).

In the original Pop-2 as far as I recall, and certainly in all the
derivatives that I have used (POP-10, WPOP, and POP-11) an "infix"
operator can also be used in the standard prefix position for a
function. It can also be used in postfix position. E.g. all the
following are equivalent in Pop-11 (partly because of the way
Pop uses a stack for passing arguments and results).

    x+y
    +(x,y)
    (x,y)+
    x,+y
    x,y,+
    x,y,+()
    nonop+(x,y)
    valof("+")(x,y)     ;;; this one is less efficient

(Note: I have relied on the fact that "+" and alphabetic characters
are in different lexical classes. If "x" and "y" were replaced by
identifiers composed of "sign" characters like ":", "=", "#",
"+",etc, then spaces would be required as delimiters to separate
them from "+".)

Essentially, being an operator in Pop simply means that
    (a) the identifier cannot be given any value other than a
        procedure, so that run-time checks are not needed
    (b) the invocation of the procedure does not REQUIRE the use of
        following parentheses as for ordinary (non-operator)
        procedure calls, e.g. max(x,y), min(x,y)
    (c) the operator has a precedence (so that x*y+z=k does what you
        expect, where "+", "*", and "=" are operators with different
        precedences)
    (d) if you want to refer to the procedure without executing it
        you have to precede the identifier with "nonop", or, less
        efficiently, use valof, as above.

Of course, in Pop operators are not used to create structures as in
prolog (though they can have as values procedures that when
executed create structures, e.g. "x :: y" creates a list whose head
is x and whose tail is y because "::" is an infix name for CONS).

Aaron Sloman,
School of Cognitive and Computing Sciences,
Univ of Sussex, Brighton, BN1 9QH, England
    EMAIL   aarons@cogs.sussex.ac.uk
or:
    INTERNET: aarons%uk.ac.sussex.cogs@nsfnet-relay.ac.uk
              aarons%uk.ac.sussex.cogs%nsfnet-relay.ac.uk@relay.cs.net

cdsm@sappho.doc.ic.ac.uk (Chris Moss) (12/13/89)

In article <24703@cup.portal.com> pgl@cup.portal.com (Peter G Ludemann) writes:
>If one decides to define 'not' as a prefix operator and also
>chooses to use it as a functor (e.g. `not(a,b)'), one is
>asking for trouble and should instead disambiguate by extra
>quote marks or parentheses: 'not'(a,b) or not((a,b)).
>
>I therefore suggest that the Prolog standard specifically
>disallow using a prefix operator as a functor.  The use of
>quotes and extra parentheses allows clear disambiguation.  This
>will break a few programs, but in an acceptable way (i.e., they
>will produce easily fixed syntax errors, rather than be
>accepted silently in an unexpected way).

The history of this within BSI is as follows:

In PS/154 (a BSI document) I put forward two possible syntaxes. One (close to)
the normal Edinburgh syntax, the other close to Peter's suggestion, where
quotes are used to distinguish prefix operators. 

Insofar as it received sensible discussion it was vetoed on the grounds that
have been expressed by Lee Naish and Mats Carlsson. In particular having a
normal form for terms is nice, and you don't want always to quote things.
But the biggest argument is that many people (but NOT all) have got used to
doing it the Edinburgh way. _I_ think it's rather ugly, but I don't see that
cleaning things up for its own sake is part of the standardisation process if
what's there already works.

So in N9 (PS/239) I scrapped that and tried to do it the Edinburgh way and also
supply a set of rules that would disambiguate any ambiguous expressions arising
out of using different operator declarations at the same level. The attempt
didn't work and that was the last contribution I made before resigning from the
BSI/ISO committee in despair. Several people, including Tony Dodd and Hamish
Taylor have made proposals to cope with this. The set of rules in N40 is as far
as I can see a variant on N9 that doesn't do much better.

The normal suggestion from Edinburgh is "take a parser and standardize that",
the normal candidate being Richard O'Keefe's read predicate in the Edinburgh
library. However when it comes to ambiguous expressions, and which one is
taken, then different "Edinburgh" Prologs do vary. I'll dig up my examples if
anyone wants them. And as a methodology for standardization it is about as
undeclarative as you can get. 

Enough history!

Chris Moss.

pgl@cup.portal.com (Peter G Ludemann) (12/15/89)

It's precisely the kinds of problems which Chris Moss mentions
which prompted my posting.

If the Edinburgh public-domain parser is unambiguous, then
we ought to have a precise grammar (BNF, W-grammar or whatever)
which describes the language it generates.  The current N40
draft uses some formal notation and a fair bit of English 
(I discovered that I was looking in the wrong place for the
rules on prefix operators and blanks) but there is still lots
of ambiguity (I've already given some examples in an earlier
article).

From the various postings to the net, it is clear that there are
many different philosophies (and implementations) of the bits
which are not precisely defined in Clocksin&Mellish or other
reference books.

So, I suggest one of the following be adopted:

1. Enshrine the public-domain Edinburgh parser (where can
   I get a copy, by the way?).  This would certainly be
   novel (I wonder if ISO would accept it?).

2. Take the current grammar and make sure that the English
   comments are well-organized and precise (this would also
   be novel; every language standard that I have read has
   somewhere had a formal description of the grammar).

3. Make an unambiguous grammar.  This is not easy, but I
   think that it is necessary.  The job can be made much
   easier by disallowing most of the current ambiguities
   (they probably aren't used much; and eliminating them
   gives the parser a better chance at generating meaningful
   errors).  

   Some of the things which would be disallowed are:
	operators used as ordinary atoms (they must be
		either quoted or inside parentheses).
		Thus, `f(+,-)' would be illegal but
		`f('+',(-))' would be legal.
	right-to-left and left-to-right operators of the
		same priority juxtaposed.
	similarly with the various cases of prefix/infix/suffix
		which I gave in an earlier note.
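
   As a rough illustration of the first restriction, the sketch
   below compares the bare and the protected spelling on an existing
   reader. It assumes SWI-Prolog (term_to_atom/2 is not standard) and
   the name same_reading/2 is invented for the purpose. On a system
   that accepts both spellings they should denote the same term, so
   requiring the protected form would change what is legal, not what
   a legal program means.

```prolog
% Sketch only: assumes SWI-Prolog; term_to_atom/2 is not ISO.

% same_reading(+TextA, +TextB): both texts parse, and to identical terms.
same_reading(TextA, TextB) :-
    term_to_atom(A, TextA),
    term_to_atom(B, TextB),
    A == B.

% compare_spellings/0 reports the result for the example in the text.
compare_spellings :-
    (   same_reading('f(+,-)', 'f(\'+\',(-))')
    ->  format("both spellings read as the same term~n")
    ;   format("the spellings differ, or one failed to parse~n")
    ).
```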

[Incidentally, there is nothing wrong with the lexical analyzer
distinguishing between quoted-atom and unquoted-atom; this makes
the grammar a bit bigger (every case of "atom" must be replaced
by "quoted-atom | unquoted-atom", except in a few places such as
prefix operators).  As it is, there is some tie-in between the
parser and the lexical analyzer, for example with `-1' (Quintus
has unexpected behaviour with `-1' and `- 1', by the way, not
documented anywhere that I could see).]
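
A small probe for the `-1' case, again only a sketch assuming
SWI-Prolog's term_to_atom/2 (the name classify/2 is hypothetical).
It reports whether a piece of text is read as an integer or as a
compound term; what `- 1' yields is exactly the point on which
systems may differ, so no result is claimed for it here.

```prolog
% Sketch only: assumes SWI-Prolog; term_to_atom/2 is not ISO.

% classify(+Text, -Class): Class is integer, compound or other,
% according to the term the reader builds from Text.
classify(Text, Class) :-
    term_to_atom(Term, Text),
    (   integer(Term)  -> Class = integer
    ;   compound(Term) -> Class = compound
    ;   Class = other
    ).

% probe/0 prints the classification of a few spellings.
probe :-
    forall(member(Text, ['-1', '- 1', '-(1)', 'a - 1', 'a -1']),
           (   catch(classify(Text, Class), _, Class = error)
           ->  format("~w  ~w~n", [Text, Class])
           ;   true
           )).
```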

No matter what is chosen, some programs will break.  As to
parsers breaking -- well, a few days' work should take care of
that (I have written 3 parsers and speak from experience).  We
MUST have an unambiguous grammar and we MUST make sure that when
a program is taken from one implementation to another, the second
implementation does not silently change the program's meaning.

- peter ludemann	--- my opinions are my own responsibility ---