[comp.lang.prolog] error in bagof/3?

torbenm@diku.dk (Torben [gidius Mogensen) (04/09/90)

I recently had a very elusive bug in a prolog program I wrote with a
college. After lengthy searching, we tracked the bug to an unexpected
behaviour of bagof/3. We simplified the problem to this program:



q(X) :-
	G = [Y, [1,2]],
	bagof(W, (G = [W,L],member(W,L)),X).

member(X,[X|_]).
member(X,[_|L]) :- member(X,L).


We would expect "?- q(X)." to return the single solution "X = [1,2]",
but to our surprise we got the to solutions "X = [1]" and "X = [2]".
We tried this both in Sicstus prolog and C-prolog, with identical
results. Is this the correct behaviour? 

If we insert a "copy_term(G,G1)" in the beginning of the inner goal,
and use "G1 = [W,L]", we get the same result, so the problem is not
that we change binding of variables outside the "bagof". If we move
either "Y" or the list "[1,2]" inside the "bagof", we get the expected
solution.

To solve our immediate problem, we programmed our own version of
"bagof", using "recordz" etc. After this, our program not only
executed correctly, but also ran 10 times faster in the cases where
the earlier version also worked (this was in Sicstus prolog). Our
"bagof" was specialized to our specific use, but still it was
surprising to see such a large difference.

regards,

Torben Mogensen (torbenm@diku.dk)

ok@goanna.oz.au (Richard O'keefe) (04/10/90)

In article <1990Apr9.105812.168@diku.dk>, torbenm@diku.dk (Torben [gidius Mogensen) writes:
> q(X) :-
> 	G = [Y, [1,2]],
> 	bagof(W, (G = [W,L],member(W,L)),X).

What does this query mean?  It says
	please, Mr Bagof, ENUMERATE SOLUTIONS FOR Y,L,X such that
		(G = [W,L], member(W,L))
	has at least one solution, and such that X is the list of
	all the values of W thus found for those particular values
	of Y and L.

The hairy thing here is the variable Y occurring in G.
The goal ([Y,[1,2]] = [W,L], member(W, L)) finds *TWO* solutions
for Y, namely Y = 1 and Y = 2.  Therefore, if it is operating
correctly, bagof/3 **MUST** report two solutions, one binding Y=1
and one binding Y=2.  If you type the query
	?- G = [Y,[1,2]],
	   bagof(W, (G = [W,L], member(W, L)), X).
at top level, you *MUST* get the two answers

	X = [1],
	L = [1, 2],
	W = W,
	Y = 1,
	G = [1, [1, 2]] ;

	X = [2],
	L = [1, 2],
	W = W,
	Y = 2,
	G = [2, [1, 2]]

if your implementation of bagof/3 has any pretence to correct operation.
(I got these answers from NU Prolog.)  The order of the answers is not a
defined aspect of bagof/3, but that there must be two answers in this
case IS.

If you want the bindings for the variable Y to be ignored, you must
explicitly say so:
	.. bagof(W, Y^(G = [W,L], member(W, L)), X) ..

When the Generator of an all solutions predicate has the form (P,Q) where
P is determinate, it is usually better to move P outside the call, e.g.
	bagof(T, (P,Q), L)
=>	P, bagof(T, Q, L)
{Note: the fact that bagof/3 fails when there are no solutions is one of
the reasons why this transformation works.}  Simplifying the code thus
often reduces problems with variables.  In this case we would get
	G = [Y,[1,2]],
	G = [W,L], bagof(W, member(W, L), X)
which has the single solution
	X = [1, 2],
	L = [1, 2],
	W = W,
	Y = W,
	G = [W, [1, 2]]
as found by NU Prolog.

> We would expect "?- q(X)." to return the single solution "X = [1,2]",
> but to our surprise we got the to solutions "X = [1]" and "X = [2]".
> We tried this both in Sicstus prolog and C-prolog, with identical
> results.  Is this the correct behaviour? 

It is the only correct behaviour.  You should have expected q(X) to
produce two answers.

> To solve our immediate problem, we programmed our own version of
> "bagof", using "recordz" etc.

I hope you mean record<<a>> not record<<z>>.  It sounds as though what
you wanted in this case was findall/3.

> [a comment on a fairly large speedup]

bagof/3 has to parse the generator in order to locate the free variables
(the variables for which substitutions are to be enumerated; in this case
Y and L were free variables in the generator).  If there aren't many
solutions and the solutions are small, the overhead can be appreciable,
especially if you have lots and lots of free variables.  It is impossible
to evaluate the reported 10-fold speedup without knowing how large the
solutions were, how many there were, and how many free variables there
were in the generators.