matsc@sics.se (Mats Carlsson) (09/07/88)
This note is about the operational semantics of bagof/3 and setof/3. As we all know, a goal setof(Y, p(X,Y), L) yields alternative solutions corresponding to different values of X. But just what is a "different value"? The public domain code for bagof(Term, Goal, Solutions) uses the following method: 1. compute a tuple K of those variables which are global in Goal, 2. compute L1 as findall(K-Term, Goal, L1), * 3. for each member K1-Term1 in L1, if an argument of K1 is unbound, bind it to the corresponding argument of the original tuple K, 4. keysort the K1-Term1 pairs and group them by identical K1 into a list Groups of pairs K1-TermList, 5. member(K-Solutions, Groups) yields a solution to bagof/3. This is all well as long as a global variable is either bound to a ground term or not bound at all in a solution. But when bound to a non-ground term, I find the result counterintuitive. Consider the program p(X, a) :- p(X). p(X, b) :- p(X). p(_). p(f(_)). p(f(_,_)). p(f(X,X)). :- setof(Y, p(X,Y), L). which yields X = _356, L = [a,b] ; X = f(_608), L = [b] ; X = f(_712), L = [a] ; X = f(_554,_554), L = [b] ; X = f(_581,_582), L = [b] ; X = f(_658,_658), L = [a] ; X = f(_685,_686), L = [a] ; Thus the values of X were considered equivalent for both values of Y only when X was not bound at all. I would like to argue for another criterion for treating two instances of the global variables as equivalent: treat them as equivalent if they are identical, _modulo renaming of variables_. This relation between terms is usually called "variant". Thus 'p(_10,_11)' is a variant of 'p(_12,_13)' but not of 'p(_14,_14)'. Using this criterion, the solutions to the above query becomes X = _49, L = [a,b] ; X = f(_141), L = [a,b] ; X = f(_141,_141), L = [a,b] ; X = f(_165,_141), L = [a,b] ; an answer which I find rather less surprising. It takes a change in step 3 of the above algorithm to enforce this behaviour. Admittedly, it makes the algorithm more expensive. I have an implementation which is O(m*n**2) in time and O(n) in space, where m is the number of proofs of Goal and n is the average number of variable occurrences in an instance of the "global variables tuple". I'm sure there are more clever algorithms. Are there any other snags? Needless to say, this discussion is only relevant for unsound uses of bagof/3 and setof/3.