[comp.lang.prolog] Determining order of argument unification

lang@zeta.PRC.Unisys.COM (Francois-Michel Lang) (11/04/88)

There are essentially two (reasonable) ways for a Prolog system to attempt
to unify two compound terms which have the same functor and arity, e.g.,

    p(T11, ..., T1n)  and  p(T21, ..., T2n)

(1) Unifying the arguments in left-to-right order, and
(2) Unifying the arguments in right-to-left order.

Is it possible to determine (by writing a Prolog program)
which of these two orders is used for given a Prolog system,
and, if so, how?

Affirmative answers determined by looking in the manual,
poking through source code, calling the implementors, etc.
are *not* considered legitimate!

I know one valid (I think) affirmative answer,
but I'm sure there are lots of others...

Credit is due to Kim Marriott and Harald Sondergaard
(Department of Computer Science, University of Melbourne)
for inspiring the question.
----------------------------------------------------------------------------
Francois-Michel Lang
Paoli Research Center, Unisys Corporation lang@prc.unisys.com (215) 648-7256
Dept of Comp & Info Science, U of PA      lang@cis.upenn.edu  (215) 898-9511

ok@quintus.uucp (Richard A. O'Keefe) (11/04/88)

In article <8192@burdvax.PRC.Unisys.COM> lang@zeta.PRC.Unisys.COM (Francois-Michel Lang) writes:
>Is it possible to determine (by writing a Prolog program)
>which of these two orders is used for given a Prolog system,
>and, if so, how?

>Credit is due to Kim Marriott and Harald Sondergaard
>(Department of Computer Science, University of Melbourne)
>for inspiring the question.

Anyone who has read Marriott & Sondergaard's paper on the Occurs Check
problem will think they know that the answer is yes, and will have some
idea of what to try.

However, the answer is *no*.  The reason is that there are lots of other
things a Prolog system could do.  For example, the compiler could pick
either order at random, or based on some property such as the spelling
of the function symbol.  Or the compiler might quite plausibly do some
of the arguments out of order if they look nice (I think I wrote a note
about this at Edinburgh).  So f(X,Y+Z,W) might have its arguments
processed in a different order from f(X,W,Y+Z).  More to the point, the
Prolog system might, like SICStus Prolog, have a unification algorithm
which can hack cyclic terms (though other parts of SICStus Prolog didn't,
the last time I tried).

lang@zeta.PRC.Unisys.COM (Francois-Michel Lang) (11/05/88)

In article <629@quintus.UUCP>, ok@quintus.uucp (Richard A. O'Keefe) writes:
> In article <8192@burdvax.PRC.Unisys.COM> lang@zeta.PRC.Unisys.COM (Francois-Michel Lang) writes:
> >Is it possible to determine (by writing a Prolog program)
> >which of these two orders is used for given a Prolog system,
> >and, if so, how?
...
> However, the answer is *no*.  The reason is that there are lots of other
> things a Prolog system could do.
...

I think my original question remains valid,
if we assume that the Prolog system one is running
(1) does indeed use left -> right or right -> left order,
(2) uses it consistently, and
(3) can't handle cyclic terms.
This last point is very easy to check.
----------------------------------------------------------------------------
Francois-Michel Lang
Paoli Research Center, Unisys Corporation lang@prc.unisys.com (215) 648-7256
Dept of Comp & Info Science, U of PA      lang@cis.upenn.edu  (215) 898-9511

lee@munnari.oz (Lee Naish) (11/07/88)

With right to left unification the following fails.  With left to
right unification without the occurs check creates two infinite terms
then tries to unify them, which will (probably) loop.

?- p(C-C, A-A, A, a).

p(D-f(D), B-f(B), D, b).


It seems that most systems use left to right ordering but POPLOG uses
right to left (thanks Chris Moss, who ran lots of examples on lots of
systems).  My guess is that POPLOG pushes unifications onto a stack at
some point.  Richard is right in saying that you can never tell for
sure of course.  I also disagree with the original assertion that the
only two reasonable orders are L to R and R to L.  See the paper by
Janssens Demoen and Marien in the latest ICLP/SLP on optimizing
register allocation.

There are also some other subtle differences in the way cyclic terms
are handled by the various implementations (not to mention the
unsubtle differences of systems which handle infinite terms properly).
The following example will loop in some but not all L to R
implementations:

?- p(A-A, A, a).

p(f(B)-B, B, b).

The difference between this and the example above is that only one
cyclic term is created and it is unified with itself.  Many unification
procedures first check if the addresses of the two terms are the same
and succeed if that is the case.

Another difference is whether the infinite loops are interruptable and
whether they consume memory (due to recursion in unification).
Running out of space in the unification stack causes user-unfriendly
behaviour in many systems.  It occurs less often if the unification
procedure is tail recursive, so it doesn't consume memory if an
infinite (sub)term is always in the "last" argument of a term (last
with respect to the unification order of course).  Assuming the
system uses tail recursion, this is another way to find out the
unification order!

	lee

ok@quintus.uucp (Richard A. O'Keefe) (11/08/88)

In article <8210@burdvax.PRC.Unisys.COM> lang@zeta.PRC.Unisys.COM (Francois-Michel Lang) writes:
>> In article <8192@burdvax.PRC.Unisys.COM> lang@zeta.PRC.Unisys.COM (Francois-Michel Lang) writes:
>> >Is it possible to determine (by writing a Prolog program)
>> >which of these two orders is used for given a Prolog system,
>if we assume that the Prolog system one is running
>(1) does indeed use left -> right or right -> left order,
>(2) uses it consistently, and
>(3) can't handle cyclic terms.

Why not simply assume that it uses right->left order consistently and make
the problem _really_ easy?  My earlier reply did not exhaust all the things
that a Prolog system might sensibly do instead.  Imagine a Prolog system
where the compiler replaces all variables in the head by new distinct
variables, and then at the end unifies repeated variables pairwise, e.g.
	p([H|T], L, [H|R]) :-
=>	p([H|T], L, [H'|R]) :- H = H'
If the goal is acyclic, the (simplified) head unification is order-
independent; what matters is the order in which the repeated variables
are handled.  And that might quite plausibly depend on the spelling of
the source variables.

I tell you plainly that Quintus Prolog violates Lang's assumptions.

lang@zeta.PRC.Unisys.COM (Francois-Michel Lang) (11/09/88)

As the instigator of what I had intended as nothing more than
a fairly simple and informal puzzle, I apologize for having
roused the infamous O'Keefe dragon and prompted him to unleash
yet another onslaught of his matchless prose.  (Actually, 
in defense of Richard, this onslaught wasn't nearly so bad
as many I've witnessed before, but I still think OK
might have fit in nicely on George Bush's campaign staff -:) )

I stand corrected in my claim that LR and RL argument evaluation
are the only two "reasonable" orders.  Lee and Richard,
would you perhaps grant that LR and RL are "the two most
obvious or most commonly used" orders, which is perhaps
what I should have said to begin with?

Thanks to Lee for posting a response to my original question
which correctly answers it in the *spirit* in which it was asked!


----------------------------------------------------------------------------
Francois-Michel Lang
Paoli Research Center, Unisys Corporation lang@prc.unisys.com (215) 648-7256
Dept of Comp & Info Science, U of PA      lang@cis.upenn.edu  (215) 898-9511

ok@quintus.uucp (Richard A. O'Keefe) (11/10/88)

In article <8249@burdvax.PRC.Unisys.COM> lang@zeta.PRC.Unisys.COM (Francois-Michel Lang) writes:
>As the instigator of what I had intended as nothing more than
>a fairly simple and informal puzzle, I apologize for having
>roused the infamous O'Keefe dragon and prompted him to unleash
>yet another onslaught of his matchless prose.

Checking a dictionary, I find
    infamous a. 1. notorious 2. shocking
    notorious a. 1. known for something bad ...

>I stand corrected in my claim that LR and RL argument evaluation
>are the only two "reasonable" orders.  Lee and Richard,
>would you perhaps grant that LR and RL are "the two most
>obvious or most commonly used" orders, which is perhaps
>what I should have said to begin with?

The most obvious, certainly.  The most commonly used?  I have no idea.
What I can be reasonably confident about is that few if any WAM-based
compilers will use either order.  "Usually LR" is what you'll get.
I don't know about PopLog, but it *used* to use predominantly RL,
but it would break off chunks of the head and hand them to a general
unification routine if the head was too complex.  Early versions of
Prolog-X were strictly LR, mainly to make decompilation easier.

>Thanks to Lee for posting a response to my original question
>which correctly answers it in the *spirit* in which it was asked!

As I pointed out, if one has read the relevant paper (and I have),
the intended answer was obvious, so I _couldn't_ solve the puzzle.
All I could do was point out that the obvious answer is wrong.
I'm sorry if that seems like an onslaught (=attack).

lindsay@comp.vuw.ac.nz (Lindsay Groves) (11/11/88)

In article <629@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>...  So f(X,Y+Z,W) might have its arguments
>processed in a different order from f(X,W,Y+Z).

I have often wondered about this.  It would seem to be a fairly simple 
optimisation in a Prolog compiler to look at the arguments in a call and
process them in an order that does the easiest things first.  For example:
  - check constant arguments first; these are easy checks to perform, and if
    one fails, that saves a lot of other work.
  - next check functors on compound terms; same reason as above, but it takes 
    a bit more work to do.
  - next handle first occurrences of variables; these only require bindings to
    be created.
  - finally handle second and subsequent occurrences of variables; this is the
    only place that the full unification algorithm needs to be invoked, so it
    shoulb de put off as long as possible.

This is bit simplified; e.g. it overlooks the question of when to process 
arguments of compound terms (that may depend on the representation used, and
how much effort is required to get at components), but I think the general
idea is clear enough.

Has anyone investigated this kind of thing?  What kind of improvement can be
expected?

Do any of the available (or unavailable, for that matter) Prolog compilers do
this?

-- 

	Lindsay Groves

	Logic programmers' theme song: "The first cut is the deepest"

micha@ecrcvax.UUCP (Micha Meier) (11/17/88)

In article <14359@comp.vuw.ac.nz> lindsay@comp.vuw.ac.nz (Lindsay Groves) writes:
>  - check constant arguments first; these are easy checks to perform, and if
>    one fails, that saves a lot of other work.
>  - next check functors on compound terms; same reason as above, but it takes 
>    a bit more work to do.
>  - next handle first occurrences of variables; these only require bindings to
>    be created.
>  - finally handle second and subsequent occurrences of variables; this is the
>    only place that the full unification algorithm needs to be invoked, so it
>    shoulb de put off as long as possible.

	This is basically right, you only have to precisely know why do you
	choose a specific order: if the unification succeeds, it does
	not matter in which order the arguments are unified. If it fails,
	however, we want to find it out as early as possible so that:
		
		- we didn't make too many (now redundant) operations
		  before the failure
		- the number of actions necessary to undo the failed
		  unification is minimized

	This means that first occurrences of variables should be handled
	at the end, because they cannot cause failure. 
	The order depends as well on possible mode declarations (input
	before output) and on the size or "groundness" of compound terms.
	The (unavailable :-)) compiler of Sepia makes these optimizations.


--Micha