[comp.lang.prolog] bagof & setof

grosse@lan.informatik.tu-muenchen.dbp.de (Malte Grosse) (11/21/89)

A few weeks ago there was a request for bagof & setof, since 
it is not build-in in every PROLOG. With the help of previous
announcements I was able to implement bagof and setof. But I 
have still some trouble to find out the "free_variables" (these
are those which are neither in the template nor in a existential
quantifier). 

I have the following questions:

	- is
		    bagof(X, (Y^fact1(X,Y),Z^fact2(X,Z)), L)
	  equal to
		    bagof(X, Y^Z^(fact1(X,Y),fact2(X,Z)), L)
	  (I assume it)


	- is
		    bagof(X, (Y^fact1(X,Y,Z),Z^fact2(X,Z)), L)
	  equal to
		    bagof(X, Y^Z^(fact1(X,Y,Z),fact2(X,Z)), L)
	  (I assume it aswell)


	- is
		    bagof(X,Y^(fact(X,Y),!),L)
	  the same as
		    bagof(X,(fact(X,Y),!),L)
	  the same as
		    fact(X,_),L=[X],!
	  (quite useless I know)


	- where does
		    bagof(X, (Y^fact1(X,Y,Z);Z^fact2(X,Z)), L)
	  yield to?

	  to
		    bagof(X, Y^Z^(fact1(X,Y,Z);fact2(X,Z)), L)
 	  or to
		    bagof(X, Y^Znew^(fact1(X,Y,Z);fact2(X,Znew)), L)
			(Does anyone know how to handle this?)
	  or 
	     does it give ALWAYS only one instance (the first),
	     since the 'fail' to get the solution is a level higher
	     than the two facts, because of the 'or'.


I really interested in receiving the opinions from other PROLOG programmers.
Or if possible references to literature.

Malte

please reply to:
	grosse%lan.informatik.tu-muenchen.dbp.de@relay.cs.net
---------------------------------------------------------------------------

TTTTTTT  U      U  M       M       	  TECHNICAL UNIVERSITY OF MUNICH
   T	 U      U  MM     MM	   	  INSTITUT FUER INFORMATIK
   T     U      U  M M   M M 	   	  ORLEANSSTRASSE 34
   T     U      U  M  M M  M	   D-8000 MUENCHEN 80
   T     U      U  M   M   M 
   T      UUUUUU   M       M 	   	  WEST-GERMANY

ok@mudla.cs.mu.OZ.AU (Richard O'Keefe) (11/23/89)

In article <2177@infovax.lan.informatik.tu-muenchen.dbp.de>, grosse@lan.informatik.tu-muenchen.dbp.de (Malte Grosse) writes:
> A few weeks ago there was a request for bagof/3 & setof/3, since 
> it is not built into every Prolog.  With the help of previous
> announcements I was able to implement bagof/3 and setof/3.

Sorry, I seem to have missed your posting.  There is a public-domain
implementation of setof/3 and bagof/3 adapted from the DEC-10 Prolog
system; it includes a predicate free_variables/4.  See below.

> 	- is
> 		    bagof(X, (Y^fact1(X,Y),Z^fact2(X,Z)), L)
> 	  [equivalent] to
> 		    bagof(X, Y^Z^(fact1(X,Y),fact2(X,Z)), L)
> 	  (I assume it)

It is in the public-domain implementation.  However, you should be aware
that there are some Prologs which disagree (which is why we need a standard
that will standardise it instead of dropping it).  In particular, quite a
few take the easy way out and say that in a term like
	X^Y^Z^(....)
(which is parsed as ^(X, ^(Y, ^(Z, (...)) ))	)
the things bound by the top-level existential quantifiers are bound but
once you come to something which is not _^_ you stop looking for quantifiers.

> 	- is
> 		    bagof(X, (Y^fact1(X,Y,Z),Z^fact2(X,Z)), L)
> 	  [equivalent] to
> 		    bagof(X, Y^Z^(fact1(X,Y,Z),fact2(X,Z)), L)
> 	  (I assume it as well)

Neither in the public-domain version, nor in the "quick and dirty" versions
mentioned above, nor in logic would this be true.  If we took logic seriously,
the first would be equivalent to
		bagof(X, (Y1^fact1(X,Y,Z), Z1^fact2(X,Z)), L)
and the second would be equivalent to
		bagof(X, Y1^Z1(fact1(X,Y1,Z1), fact2(X,Z1)), L)

Note that the answer is different for the first question because pushing
the quantifiers Y, Z further out does not result in bogus capture of free
variables.  Here it does.

> 	- is
> 		    bagof(X,Y^(fact(X,Y),!),L)
> 	  the same as
> 		    bagof(X,(fact(X,Y),!),L)

no way.  The second one will return a binding for Y.  The first one
won't.  Why should a cut change anything?

> 	  the same as
> 		    fact(X,_),L=[X],!

Again, no.  The rule is that setof/3 and bagof/3 find bindings for the
FREE variables, and a variable is only free if it appears NEITHER within
the scope of an existential quantifier (the point you seem to be missing
in an earlier question is the "within the scope" bit) NOR within the
template (the first argument).  So provided X does not occur in L already,
bagof(X, ..., L) will not instantiate X.  But the query
		fact(X, _), !, L = [X]
*will* instantiate X.  

> 	- [what] does
> 		    bagof(X, (Y^fact1(X,Y,Z);Z^fact2(X,Z)), L)
> 	  [transform] to?
> 	  to
> 		    bagof(X, Y^Z^(fact1(X,Y,Z);fact2(X,Z)), L)
>  	  or to
> 		    bagof(X, Y^Znew^(fact1(X,Y,Z);fact2(X,Znew)), L)

There are again at least three answers:

(a) the logical answer.
    Think of W^(...W...) as (exists W)(...W...).
    (Y^fact1(X, Y, Z) ; Z^fact2(X, Z))
    thus stands for the logical formula
    (exists Y)fact1(X, Y, Z) or (exists Z)fact2(X, Z)
    which, by renaming bound variables, is equivalent to
    (exists %1)fact1(X, %1, Z) or (exists %2)fact2(X, %2)

    Neither of your suggested translations matches the logical reading.

(b) what the public-domain implementation does.
    It scans over the goal collecting variables which
    -- do not occur in the template (so X would be ignored)
    -- are not within the scope of an existential quantifier binding them
       (so Y would be ignored and the second occurrence of Z would be
       ignored)
    but it does not *rename* the existentially quantified variables.
    So it would see the first occurrence of Z, and would treat this query
    as
	bagof(X, Y^(fact1(X,Y,Z) ; fact2(X,Z)), L)

    This is a MISTAKE in the public-domain implementation; the first person
    to point it out to me was David S. Warren.  Note that this has nothing
    to do with whether it is an "and then" or an "or else" between the two
    goals.

(c) The quick-and-dirty method used by several Prolog systems says that
    since there are no OUTERMOST existential quantifiers, none of the
    variables are existentially quantified.

> 	  or 
> 	     does it give ALWAYS only one instance (the first),
> 	     since the 'fail' to get the solution is a level higher
> 	     than the two facts, because of the 'or'.

I don't understand this.  Level?  The basic outline of bagof/3 is

	bagof(Template, Goal, Bag) :-
		free_variables(Template, Goal, [], Vars),
		(   Vars = [] ->
		    findall(Template, Goal, Bag)
		;   findall(Vars-Template, Goal, RawSolns),
		    <magic>
		    keysort(RawSolns, OrdSolns),
		    group_with_matching_keys(OrdSolns, Vars, Bag)
		).

where
	findall(Template, Goal, List) :-
		(   push special marker on stack,
		    call(Goal),
		    push Template instance on stack,
		    fail
		;   collect Template instances from stack as List
		).

findall/3 is completely blind to the structure of the Goal.  bagof/3 cares
only to the extent that it influences the scan for existential quantifiers.
The POINT of findall/3 and bagof/3 is to find ALL the solutions, not to
cut off at some bizarre place.

I am of the opinion that the "right" reading for the existential quantifiers
is the logical one, and a good way to implement that is to treat them as if
they were local variables and construct a new term with those variables
renamed.  For example,
	(X^p(X,Y), Y^p(X,Y))
should be treated returned as
	Goal' = (p(_1,Y), p(X,_2)), Free Vars = [X,Y]

[By the way, this will be my last posting for a couple of months.]

Here is the public-domain code for free_variables/4.  I warn you, the mistake
is present in this version.  I did think about writing a new version with the
mistake corrected and broadcasting it, but since WG17 Prolog hasn't got
setof/3 or bagof/3 it didn't seem worth the trouble.

%   In order to handle variables properly, we have to find all the
%   universally quantified variables in the Generator.  All variables
%   as yet unbound are universally quantified, unless
%       a)  they occur in the template
%       b)  they are bound by X^P, setof, or bagof
%   free_variables(Generator, Template, OldList, NewList)
%   finds this set, using OldList as an accumulator.

free_variables(Term, Bound, VarList, [Term|VarList]) :-
	var(Term),
	term_is_free_of(Bound, Term),
	list_is_free_of(VarList, Term),
	!.
free_variables(Term, _, VarList, VarList) :-
	var(Term),
	!.
free_variables(Term, Bound, OldList, NewList) :-
	explicit_binding(Term, Bound, NewTerm, NewBound),
	!,
	free_variables(NewTerm, NewBound, OldList, NewList).
free_variables(Term, Bound, OldList, NewList) :-
	functor(Term, _, N),
	free_variables(N, Term, Bound, OldList, NewList).


free_variabels(N, Term, Bound, OldList, NewList) :-
	(   N =:= 0 -> NewList = OldList
	;   arg(N, Term, Argument),
	    free_variables(Argument, Bound, OldList, MidList),
	    M is N-1,
	    free_variables(M, Term, Bound, MidList, NewList)
	).

 
%   explicit_binding checks for goals known to existentially quantify
%   one or more variables.  In particular \+ is quite common.  It is
%   known that the first argument is instantiated.
 
explicit_binding(\+ _,                 Bound, fail,     Bound    ).
explicit_binding(Var^Goal,             Bound, Goal,     Bound+Var).
explicit_binding(setof(Var,Goal,Set),  Bound, Goal-Set, Bound+Var).
explicit_binding(bagof(Var,Goal,Bag),  Bound, Goal-Bag, Bound+Var).
explicit_binding(findall(_,_,Bag),     Bound, Bag,      Bound    ).
 
 
term_is_free_of(Term, Var) :-
	(   var(Term) ->
	    Term \== Var
	;   functor(Term, _, N),
	    term_is_free_of(N, Term, Var)
	).

term_is_free_of(N, Term, Var) :-
	(   N =:= 0 -> true
	;   arg(N, Term, Argument),
	    term_is_free_of(Argument, Var),
	    M is N-1,
	    term_is_free_of(M, Term, Var)
	).


list_is_free_of([], _).
list_is_free_of([Head|Tail], Var) :-
	Head \== Var,
	list_is_free_of(Tail, Var).

grosse@lan.informatik.tu-muenchen.dbp.de (Malte Grosse) (11/30/89)

In article <2783@munnari.oz.au> Richard O'Keefe writes:

>
>Sorry, I seem to have missed your posting.  There is a public-domain
>implementation of setof/3 and bagof/3 adapted from the DEC-10 Prolog
>system; it includes a predicate free_variables/4.  See below.
>
It was not my posting, it was from Mazda
	mazda%shako.sk.tsukuba.ac.jp@cunyvm.cuny.edu

>> 	- is
>> 		    bagof(X, (Y^fact1(X,Y,Z),Z^fact2(X,Z)), L)
>> 	  [equivalent] to
>> 		    bagof(X, Y^Z^(fact1(X,Y,Z),fact2(X,Z)), L)
>> 	  (I assume it as well)
>
>Neither in the public-domain version, nor in the "quick and dirty" versions
>mentioned above, nor in logic would this be true.  If we took logic seriously,
>the first would be equivalent to
>		bagof(X, (Y1^fact1(X,Y,Z), Z1^fact2(X,Z)), L)

assuming you mean:
		bagof(X, (Y1^fact1(X,Y1,Z),Z1^fact2(X,Z1)),L)

but this cuts off the coreference of 'Z' to fact1. Is this alright?
It is the logical way ok, but this is not the way PROLOG works.

>and the second would be equivalent to
>		bagof(X, Y1^Z1(fact1(X,Y1,Z1), fact2(X,Z1)), L)
>
>Note that the answer is different for the first question because pushing
>the quantifiers Y, Z further out does not result in bogus capture of free
>variables.  Here it does.
>

>> 	- is
>> 		    bagof(X,Y^(fact(X,Y),!),L)
>> 	  the same as
>> 		    bagof(X,(fact(X,Y),!),L)
>
>no way.  The second one will return a binding for Y.  The first one
>won't.  Why should a cut change anything?
>
>> 	  the same as
>> 		    fact(X,_),L=[X],!
>
>Again, no.  The rule is that setof/3 and bagof/3 find bindings for the
>FREE variables, and a variable is only free if it appears NEITHER within
>the scope of an existential quantifier (the point you seem to be missing
>in an earlier question is the "within the scope" bit) NOR within the
>template (the first argument).  So provided X does not occur in L already,
>bagof(X, ..., L) will not instantiate X.  But the query
>		fact(X, _), !, L = [X]
>*will* instantiate X.  
>


You are right, I have not thought about the instantiation of X and Y.
I was looking at the result of the list L.
		bagof(X,Y^(fact(X,Y),!),L).
is the same as
		findall(X,(fact(X,Y),!),L).
But this will fail according to the !,fail structure, 
(depending on the implementation)
which is coming into findall.
--> see my comments on this

To my opinion this findall should give for 'L' the same as 
fact(_1,_),L=[_1],!,
which I tried to express.

>> 	- [what] does
>> 		    bagof(X, (Y^fact1(X,Y,Z);Z^fact2(X,Z)), L)
>> 	  [transform] to?
>> 	  to
>> 		    bagof(X, Y^Z^(fact1(X,Y,Z);fact2(X,Z)), L)
>>  	  or to
>> 		    bagof(X, Y^Znew^(fact1(X,Y,Z);fact2(X,Znew)), L)
>
>There are again at least three answers:
>
>(a) the logical answer.
>    Think of W^(...W...) as (exists W)(...W...).
>    (Y^fact1(X, Y, Z) ; Z^fact2(X, Z))
>    thus stands for the logical formula
>    (exists Y)fact1(X, Y, Z) or (exists Z)fact2(X, Z)
>    which, by renaming bound variables, is equivalent to
>    (exists %1)fact1(X, %1, Z) or (exists %2)fact2(X, %2)
>
>    Neither of your suggested translations matches the logical reading.

I think my third translation is equivalent. But anyway I do agree.


>...
>where
>	findall(Template, Goal, List) :-
>		(   push special marker on stack,
>		    call(Goal),
>		    push Template instance on stack,
>		    fail
>		;   collect Template instances from stack as List
>		).
>
>findall/3 is completely blind to the structure of the Goal.  bagof/3 cares
>only to the extent that it influences the scan for existential quantifiers.
>The POINT of findall/3 and bagof/3 is to find ALL the solutions, not to
>cut off at some bizarre place.
>

Exactly! But with this realisation it can cause serious problems! 
I have the opinion that 'call' should be 'call_as_subgoal'
which is: 

call_as_subgoal(X):-call(X).

Because this remains stable in case there is a cut in the goal;
your version - which is maybe the version of us all (?!?)-
will finally fail findall and remains the stack changed!

*** I have just found in a CProlog manual under call(X):
"... the goal call(X) is executed exactly as if that term appeared textually
in place of the call(X), except that any CUT occuring in X will remove
only those choice points in X. ..."
***

It depends now how your 'call' is implemented. Our findall is broken,
maybe because of another implementation of 'call'. In Clocksin&Mellish
it is just a replacement.

This was the reason of my second question, although I did not 
take into consideration that the FREEVARIABLES should stay
uninstantiated.

********************************* session follows *********************

?- listing.

yes
?- asserta(we(7)).

yes
?- listing.
we(7).


yes
?- findall(Q,we(Q),L).

Q	= _636
L	= [7];

no
?- listing.
we(7).


yes
?- findall(Q,(we(Q),!),L).

no
?- listing.
$found(7,_716).
$found(_712,$).

we(7).


yes

************************************* end of session *****************

I really like to hear from others how their bagof or findall is behaving
in these cases. 

Malte

please reply to:
	grosse%lan.informatik.tu-muenchen.dbp.de@relay.cs.net
---------------------------------------------------------------------------

TTTTTTT  U      U  M       M       	  TECHNICAL UNIVERSITY OF MUNICH
   T	 U      U  MM     MM	   	  INSTITUT FUER INFORMATIK
   T     U      U  M M   M M 	   	  ORLEANSSTRASSE 34
   T     U      U  M  M M  M	   D-8000 MUENCHEN 80
   T     U      U  M   M   M 
   T      UUUUUU   M       M 	   	  WEST-GERMANY