OKeefe.R.A.%EDXA@sri-unix.UUCP (10/19/83)
From: Richard HPS (on ERCC DEC-10) <OKeefe.R.A.@EDXA> A certain puzzle solution in this Digest contained the following rather odd goal: bagof(M, M^(x(M, Z), nonvar(Z), Z = N), L) [BTW: that program is an *excellent* debugging tool for Prolog implementors. If your assert and retract implementations stand up to amazing thrashing this program gives them, they'll stand up to any amount of normal use. End of insult. End of tip.] One of the odd things is just the x(M,Z), nonvar(Z), Z=N bit itself. N is always instantiated to an integer, so x(M,Z), Z==N would be exactly what is wanted. However, the x(M,Z) facts in the data base always have M an integer, and Z either an integer or a variable. Being a variable indicates "not yet assigned", and writing x(M,unbound) for that case would (a) eliminate the var/nonvar tests in the program and (b) let this goal be written as just x(M,N). But back to bagof. If you haven't got a copy of the 15-Dec-81 manual, here are pages 51 & 52, retyped: setof(X, P, S) Read this as "S is the set of all instances of X such that P is provable, where that set is non-empty". The term P specifies a goal or goals as in call(P). S is a set of terms represented as a list of those terms, without duplicates, in the standard order for terms [this is defined elsewhere in the manual, and I'm not going to type that in]. If there are no instances of X such that P is provable the predicate *fails*. The variables appearing in the term X should not appear anywhere else in the clause except within the term P. Obviously, the set to be enumerated should be finite, and should be enumerable by Prolog in finite time. It is possible for the provable instances to contain variables, but in this case the list S will only provide an imperfect representation of what is in reality an infinite set. [There is no real need for the variables in X to be absent from the rest of the clause. If a variable is bound, all that means is that its binding will be copied into the instances in the list S. If a variable is not bound, it will still be unbound after the call to setof, just like 'not'. And I don't know that I'd call a non-ground term an "imperfect representation" of an infinite set.] [*** This is the key paragraph ***] If there are uninstantiated variables in P which do not also appear in X, then a call to this evaluable predicate may **backtrack**, generating alternative values for S corresponding to different instantiations of the free variables of P. (It is to cater for such usage that the set S is constrained to be non-empty.) For example, the call ?- setof(X, X likes Y, S). might produce two alternative solutions via backtracking: Y = beer, S = [dick, harry, tom] Y = cider, S = [bill, jan, tom] (X remains unbound in both cases). The call ?- setof((Y,S), setof(X, X likes Y, S), SS). would then produce SS = [(beer,[dick,harry,tom]), (cider,[bill,jan,tom])] [*** This paragraph explains ^ ***] Variables occurring in P will not be treated as free if they are explicitly bound within P by an existential quantifier. An existential quantification is written Y^Q meaning "there exists a Y such that Q is true", where Y is some Prolog variable. For example, ?- setof(X, Y^(X likes Y), S). would produce the single result X = [bill, dick, harry, jan, tom] in contrast to the earlier example. bagof(X, P, Bag) This is exactly the same as setof except that the list (or alternative lists) returned will not be ordered, and may contain duplicates. The effect of this relaxation is to save considerable time and space in execution. [This may be misleading. There are three components to the cost of setof, and two to bagof. Both of them enumerate all solutions like findall. That is one component. Both then do a sort to bring together solutions belonging to the same alternative (bindings for free variables in P). That is the second component. setof then does another sort for each list of alternatives. So you save O(N.lgN) time and space by using bagof instead of setof. But that is less than 50% of the cost of bagof. When there are NO free variables in P, bagof and setof don't do the first sort, so the saving for this common case can be considerable. Also bear in mind that the cost of sorting is strongly implementation dependent; it could take O(N) space.] The interpreter recognises this as meaning "there exists an X such that P is true", and treats it as equivalent to call(P). The use of this explicit existential quantifier outside the setof and bagof constructs is superfluous. [End of manual extract.] What is odd about bagof(M, M^(x(M, Z), nonvar(Z), Z=N), L) is thus that a variable which ought NOT to be explicitly quantified IS (M), and that a variable which SHOULD be (Z) ISN'T. How come it works ? To see why quantifying a variable that appears in the template doesn't confuse Prolog at all while it confuses people badly, you'll have to look at the code I published here recently. To see why not quantifying Z does no harm, you have only to realise that all instances of Z are unified with the same *integer* N, so that bagof does a findall and a sort, and then discovers that all of the solutions have the same value for Z. If the goal had been instead bagof(M, Z^(x(M, Z), nonvar(Z), Z=N), L) bagof would just have done a findall. Now I don't mean to imply that the author of the program in question doesn't understand the X^P notation perfectly well. This may be just the kind of typing mistake we all make, which just increased the running time without producing wrong answers, and could easily go unnoticed for 20 years. But there seem to be too many Prologs that lack bagof, and I wouldn't want anyone to take this particular goal as an example to be imitated. By the same token, there could well be similar mishaps in some of my utilities, and I would be grateful if someone spotted them while I still have access to this net. One thing which the manual should mention and doesn't is that when you compile a DEC-10 Prolog program that uses setof, bagof, or for that matter findall, it often stops working. The problem is that the interface between compiled and interpreted code is mostly one-way: compiled code can get at any interpreted code, either by using call or just by calling a predicate that happens not to be interpeted. But the interpreter can only get at compiled code which has :-public declarations. And call is the interpreter. So the generators in setof(Template, Generator, Set) bagof(Template, Generator, Bag) findall(Template, Generator, List) and the goals in \+ Goal not Goal once(Goal) % once(X) :- call(X), !. forall(Generator, Test) % forall(G,T) :- \+ (G, \+ T). are handled by the *interpreter*, and MUST have :-public declarations. Omitting these declarations is a very common mistake. You don't learn to avoid it, but you learn to look for it when your compiled program doesn't work but it did when interpreted. If you haven't got a compiler, the problem doesn't arise, and there are a couple of other compilers than DEC-10 Prolog which use a different interface between compiled and interpreted code (in fact they are just incremental compilers), and again the problem doesn't arise.