[net.ai] Syntax Semantics

O'KeefeHPS@sri-unix.UUCP (07/03/84)

From:  O'Keefe HPS (on ERCC DEC-10)

     [Forwarded from the Prolog Digest by Laws@SRI-AI.
     This is part of an ongoing discussion of a proposed
     Prolog transport syntax.]

I agree that S-expressions are elegant, general, and so on.
{So for that matter are Goedel numbers.}  But there is good
reason to believe that they are not well suited to human
reading and writing.  That is that ***LISP*** doesn't use
S-expressions for everything.  Consider

 [A<-(FOR NODE IN NODELIST COLLECT CAR EVAL (NODE:NODEFORM]

which is perfectly good InterLisp.  I might be persuaded
that this is readable, though not after the uglyprinter has
had its way with it, but it quite certainly is NOT an S-expression.
Now there is an isolationist section of the Lisp community which
ignores Interlisp (the Common Lisp designers did not, well done).
But as soon as you use ANY sort of read macro, you have left
S-expressions behind.  `(cons ,(cadr form) ,(caddr form)) is not
an S-expression.  And of course the MacLisp family has (loop ...).

As for the question about whether there are natural binary operations
which are not associative, mathematics is FULL of them.  Any function
whose result type differs from its right argument type.  Quite a lot
are relations, such as "=", "<", and so on.  Note that the extension
of the relational operations to multiple arguments in some Lisps
loses you a lot of nice properties: (>= A B C) is not the same as
(NOT (< A B C)).  Depending on implementation subtleties it might not
even be the same as (<= C B A).  And the extension STILL doesn't let
you write 0 <= X < Y, you have to write -1 < X < Y which is often more
obscure than an explicit conjunction would have been.  The principal
example of a non-associative binary operator is exponentiation.
As an instance of exponentiation, let's take the free monoid over
the ASCII characters, where "abc"^4 = "abcabcabcabc".  Any
generalisation of this to take an arbitrary number of arguments will
not treat all the arguments equally.  One natural extension to
multiple arguments would be to take an alternating list of strings
and integers (^ S1 N1 ... Sk Nk) = (concat (^ S1 N1) ... (^ Sk Nk))
with omitted integers taken to be 1.  But this doesn't place all the
arguments on an equal footing (what should (^ "abc" 2 3) mean?) and
doesn't really generalise to other operators.  There is even a
problem with operators which have coincident argument and result
types.  For example, why should (CONS x y z) means (CONS x (CONS y z))
rather than (CONS (CONS x y) z)?

I'm afraid the Sigma notation is not an example of mathematicians
retreating in response to an indaequacy in infix notation.  When
writing *informally*, mathematicians are as pleased as anyone else
to write a1 + ... + an.  Sigma notation specifies three things:
a commutative associative binary operation (if the domain can be
empty the operation must have an identity), a domain, and a
FUNCTION.  E.g. in
                /\
               /  \     f(i) = g(i)^2
              i=1..n
the operation is "and", the domain is {1,2,...,n}, and the
function is lambda i. f(i) = g(i)^2.  I fully agree that this is a
good thing to have, but it is NOT a thing you can get by allowing
the argument list of everything in sight to have an arbitrary number
of arguments.  The number of arguments is still fixed at each CALL.
When the number of operands is unknown, the Lisper still has to write

        (apply #'and (mapcar #'(lambda (i)
           (eql (f i) (expt (g i) 2))) (num-list 1 n)))

and then finds that he can't, because "and" is a special form and
you can't apply special forms.  Quite a few Lisps macro-expand
(plus a b c d) to (plus2 (plus2 (plus2 a b) c) d), and you can have
fun applying #'plus !  What you find yourself having to do all too
often is

        (eval (cons #'and (mapcar #'(lambda (i)
            (kwote (eql (f i) (expt (g i) 2)) )) (num-list 1 n)) ))

Now there IS a lisp-like language which has faced up to this problem
of some functions sensibly taking an arbitrary number of arguments.
It's called 3-Lisp.  Yes, I mean the thing described in Brian Smith's
thesis.  It has been implemented and runs quite nicely on Xerox 1108s.
3-lisp distinguishes between PAIRs, written (car . cdr) and SEQUENCES,
written [a1 ... an].  I shan't tell you about the distinction between
sequences and rails.  There *is* a notational convention that if you
write (car a1 ... an) {note: no dot after the car} it means
(car . [a1 ... an]).  So (+ a b c) means (+ [a b c]).  The thing is
that if you employ the language's reflective capabilities to poke
around in the guts of your program while it's running, you'll find
that the rail or sequence really is there.  The cdr of a pair can be
any form (as can the car), so if I want to write
        (+ . (map f L))
I can.

Given that Prolog is based on relations rather than functions,
you don't find anywhere NEAR as much nesting as you do in a
language based on functions, so the operator priority issue doesn't
really arise, except when the data you are working on are expressions
in some language.  MACSYMA, for example, exploits the CGOL parser
to handle its input.  Prolog happens to use the analogous thing all
the time.

Prolog lets you use operators when you want to.  You don't have
to use any operators at all if you don't want to:
        :-(max_min(X,Y,X,Y), >=(X,Y)).
        :-(max_min(X,Y,Y,X),  <(Y,X)).
is quite acceptable to the Prolog parser, though not to me.
Similarly, Lisp lets you use operators if you want to (in InterLisp
it happens automagically with CLISP, use the CLISPTYPE, UNARYOP, and
LISPFN properties; in PSL you can use RLISP; in MacLisp there is
CGOL), but normally only uses a handful of strange syntactic forms.

Prolog and Lisp are similar internally as well.  Modulo all the
exotic data structures like records and hash tables present in all
modern Lisps, which structures have no S-expression representation
{Common Lisp does let you use #(struct-name field-1 "val1" ...)
for defstructs, but there is nothing for hash tables or most forms
of array}, most Lisp data can be viewed as
        atomic
or      (CONS something something-else)
and functions to hack such structures are indeed easily written.
If you want to write programs that analyse programs, you rapidly
find yourself in very deep water sinking fast.  It is a MAJOR
contribution of Common Lisp that they have spelled out exactly
what special forms a program-hacking tools must understand (there
are about 20 of them) and that they have specified ways of telling
whether a form is a macro call and of getting at the macro expansion
without too much pain.  The hair in the MASTERSCOPE interface for
coping with InterLisp's "no specifications please, we're power-
assisted programmers!" view is amazing.

Prolog code that wants to hack arbitrary data structures can
similarly view the world as divided into two sorts of objects:
        variables
and     other terms
Other terms have a function symbol and 0 or more arguments.
For example, 77 is a non-variable whose function symbol is 77
and which has 0 arguments.  This is admittedly a wee bit more
complex than basic Lisp, as cons cells are known to have exactly
two components.  But I have faithfully described real Prolog;
there is nothing else that a Prolog program can get its hands
on and look inside.  A Lisp |print| routine really does have
to worry about what to do with records (defstructs) and arrays
and compiled-code pointers and ...
The list of Prolog "special forms" is embarrassingly large:
        ?- /1   :- /1   :- /2   , /2
        ; /2    \+ /1   setof/3 bagof/3 call/1
        once/1  forall/2        % some Prologs lack these two
But that's still a LOT less than Common Lisp has.

To summarise:
        yes, S-expressions are general, clean, and hackable.
        BUT, so are terms.
        BUT, there is a LOT more to Lisp syntax and internal
             structure both than S-expressions.

Ken Kahn is absolutely right that it is important to have an
internal structure which you can easily build tools for.  He
is right that S-expressions are such an internal structure.
(Just don't assert any clauses mentioning arrays...)  It is
also true that "Core Prolog" data structures are themselves
convenient to work with.  (Some of the toy Prologs that you
can buy for peanuts get basic things like functor/3 wrong so
that it isn't true.  But there are several toy implementations
of Lisp which represent a list as a string, so that
        DEFINE('CAR(X)')
        CAR.PAT = "(" BAL . CAR "."     :ON
CAR     X CAR.PAT :S(RETURN)F(FRETURN)
Does that mean Lisp data structures are bad?)

Look, we're NEVER going to agree about syntax.  I've indulged myself
somewhat in this message, but I have been defending a kind of
syntax (infix operators) and to some extent even a choice of syntax,
rather than any specific form.  [...]

[O'Keefe went on to discuss his previously-presented proposal for
a Prolog transport syntax.  -- KIL]