ok@quintus.UUCP (Richard A. O'Keefe) (05/18/88)
Contents. 1. Apology. 2. Restatement of problem. 3. Response to Saumya Debray. 4. Response to Lee Naish. 5. "Nondeterminism." 1. Apology. I did two things that seem to have proven confusing. First, in my haste to find a short "subject" line I picked "clause fusion". As Saumya Debray pointed out, the problem is not with fusing the clauses as such, but with combining the two calls to pop_stack/4. A much better name for this would be "factoring", the optimisation in question is fundamentally an application of the distributive law P & Q | P & R <=> P & (Q | R) and the snag is that the corresponding transform for Prolog P, Q ; P, R --> P, (Q ; R) [&->] is not valid. The second confusing thing I did was to show a (slightly modified) version of the actual case, instead of boiling it down to [&->]. While the actual code has lots of cuts, for once the cuts have nothing to do with the problem. 2. Restatement of problem. The problem is that the distributive laws of logic: (P & Q) | (P & R) <=> P & (Q | R) (P | Q) & (P | R) <=> P | (Q & R) correspond to obvious rewrite rules for transforming Prolog code: [&->] P, Q ; P, R --> P, (Q ; R) [&<-] P, (Q ; R) --> P, Q ; P, R [|->] (P;Q), (P;R) --> P ; (Q, R) [|<-] P; (Q, R) --> (P;Q), (P;R) each of which has uses in source-to-source optimisation and none of which is valid. [Note that the importance of distinguishing clearly between the logical operators "&" and "|" and the control structures "," and ";" is my reason for regarding the use of "&" and "|" in Prolog dialects as a detestable confusion rather than a feature, and that explicitly includes the use of "|" in DEC-10 Prolog.] My earlier message showed why it is necessary that P should not be side-effective and why (in the presence of cuts) P should be determinate. But here is an even simpler example. p(a). p(b). r(b). p(c). p(d). q(d). The query p(X), q(X) ; p(X), r(X) has solutions X=d and X=b, and they are found in that order. The query p(X), (q(X) ; r(X)), however, has the solutions X=b and X=d, the same solutions as the former query, but found in the opposite order. [Would Paul Voda care to comment on the interaction of this snag with the first-solution construct of Trilogy? Presumably the Trilogy compiler is able to determine which predicates are in the dynamic scope of such constructs and inhibit optimisations like this for them? ] 3. Response to Saumya Debray. In his first message, he says: > STEP 2: Assume that association/4 is always called with its > last argument free. This allows us to delay the unification > "Result = present" in the second disjunct, ... Unfortunately, this isn't true. Recall that we have association(Stack, Key, Val, Result) :- ... ; Result = present, pop_stack(Key1, Val1, Stack, Stack1), ... and he wants to delay the unification. This is only valid if we know that Result is not a variable appearing in Stack, or that if it is, pop_stack/4 doesn't depend on it. Plaisted and Mellish have both published algorithms for estimating variable sharing, so it is plausible that a really good compiler might have the information which legitimates this transformation. > STEP 3: Assume pop_stack/4 is free of side effects. This allows us to > factor the literal for pop_stack/4 ... As I showed above, and as I thought I made clear in my original message, this isn't true. pop_stack/4 *as called* has to be determinate. > STEP 4: Richard assumed, in addition, that association/4 is always called > with its first three arguments ground terms. Looking at the code, it seems > to me that the predicate is intended to search a (given) stack for a value > associated with a (given) key, so I think it's more reasonable to assume > that the third argument to association/4 is also free in each call to it. Well, I only made that assumption for the sake of simplifying the discussion. In fact Debray's assumption must be wrong, the code not only looks for the key but checks to see whether the associated value matches (present) or clashes with (conflict) the GIVEN Val. That unification *must*not* be delayed, because it conveys important control information. 'conflict' is intended to mean "there IS an entry for Key in Stack, but Val didn't match." [Admittedly, that isn't what the original version of the code did, but the author of the original code also proposed the revised version, and seemed to think that it matched his intentions.] Suamya Debray's second message drew the distinction between clause fusion and other stuff, and pointed out that I wasn't talking about clause fusion as such. For this I have apologised above. In his third message, he warned about "assert". I have been told that the program in which the original code appeared does no asserts. 4. Response to Lee Naish. Could you tell us more about what declaring a predicate "pure" in NU-Prolog means and does? How much is assumed, how much checked? 5. "Nondeterminism." We saw in section 2 that even when the transforms preserve the logical reading of a formula, they may not preserve the order in which solutions are found. If the caller of such a formula does not care what order the results are generated in, then these rules can be applied even to formulas with multiple solutions, as long as they are free of side effects. P, Q, and R may make use of cuts and meta-logical operations such as var/1. If we accept this, then we have a new kind of "don't care" nondeterminism. [Actually, we already had it: the order in which built-in predicates such as current_atom/1, current_predicate/2, and current_op/3 enumerate their results has never been defined.] This is, however, _bounded_ nondeterminism, so standard methods for thinking about such programs as programs should apply, and the order was never supposed to matter in the logic. -- --- ARPANET: quintus!ok@SUN.COM | Logic programming: UUCP: ok@quintus.uucp | Prolog is only the beginning. ...!{sun,pyramid}!quintus!ok | Elegance is not an option!