PROLOG-REQUEST@SU-SCORE.ARPA (Chuck Restivo, The Moderator) (08/15/86)
PROLOG Digest            Friday, 15 Aug 1986       Volume 4 : Issue 38

Today's Topics:
                      Puzzle - Lemmas & Searches,
                    Implementation - Parsing `->'
----------------------------------------------------------------------

Date: 13 Aug 86 12:45:36 GMT
From: David Sherman <mnetor!lsuc!dave@seismo.css.gov>
Subject: Lemmas

Mario O. Bourgoin writes:
> If Prolog refused to solve for goals already encountered,
> this problem would not exist.  A hash table of calls would
> provide an efficient solution to the problem of whether
> a state has been met already given a consistent encoding of
> variable names.  Can someone provide me with a reason other
> than efficiency for why this is not done?

I'd be interested in hearing the answer to this too.  I believe this
is what Kowalski describes as "lemmas" in Logic for Problem Solving -
that is, whenever a goal is resolved (to either yes or no), that fact
can be recorded.

I suppose it might be possible to program this into Prolog without
modifying Prolog itself - i.e., with predicates.  Has anyone done
this?  (It's of interest to me for the income tax analysis system I'm
working on.)

One obvious problem that I can see is what to do with assert and
retract.  For the lemma system to be correct, it would have to be
able to figure out the implications of calls to assert and retract,
*including* how these affect predicates several calls away.  For
example:

    tired(today) :- toomuch(netnews, yesterday).

    toomuch(netnews, Day) :-
        read(netnews, Minutes, Day),
        Minutes > 120.

    read(netnews, 180, yesterday).

    ?- tired(today).
    yes.

    ?- retract(read(netnews, 180, yesterday)).

    ?- tired(today).

If the implementation of retract has to search the entire database
and start making changes to what's been put into the lemma hash
table, the whole point of the system (improving efficiency) could be
defeated.

Comments?

-- David Sherman
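[ A sketch of the kind of lemma mechanism David asks about, written
  with ordinary predicates and assert.  The names prove/1 and lemma/1
  are illustrative only, not from Kowalski or any existing system,
  and the code records only goals that succeed; recording failures,
  and keeping the lemma table honest across assert and retract, is
  exactly the hard part he points to. ]

    % prove(Goal): like call(Goal), but remember each goal that has
    % been proved as a lemma/1 fact, so a repeated call is answered
    % by a database lookup instead of a fresh proof.
    :- dynamic lemma/1.

    prove(Goal) :-
        lemma(Goal),             % already proved once: reuse it
        !.
    prove(Goal) :-
        call(Goal),              % prove it the ordinary way
        assertz(lemma(Goal)).    % and record the result

    % ?- prove(tired(today)) succeeds and asserts lemma(tired(today)).
    % A later retract(read(netnews, 180, yesterday)) does NOT touch
    % that lemma, so the table is now wrong -- the consistency
    % problem described above.

[ The cut also means prove/1 commits to the first recorded answer for
  a non-ground goal, so this is only sensible for goals expected to
  have a single solution. ]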
------------------------------

Date: Thu, 14 Aug 86 15:15:38 pdt
From: Allen VanGelder <avg@diablo>
Subject: Exponential searches in the Farmer problem

Mario Bourgoin made a very interesting observation in his summary of
the Farmer problem:

> The number of repetitions has an upper bound of 2^58 as near
> as I can calculate.  The 58 was obtained by counting the
> number of legal second states produced by Prolog using
> linked/2 and subtracting the total number of legal
> states from the result.  Can someone substantiate this answer?

This is not correct.  2^58 is roughly the number of nodes in a
balanced BINARY tree of DEPTH 58.  The search tree for "Farmer" is
neither balanced nor binary, nor is its depth 58.  However, the size
of the search tree IS exponential in the number of legal states.  The
analysis is a bit involved.  Many readers may wish to skip to
"-------" at this point.

The general rule is that if the search has b branches at each node
and (always) goes to depth d, then there are about b^d nodes.  In
these path/state problems, the branching factor b corresponds to the
number of states adjacent to a given state, and the depth d
corresponds to the path length, which is bounded by the number of
legal states if the search is restricted to simple paths.  Mario
appears to be estimating b = 58, so 58 should be in the base, not
the exponent.  However, not all searches go to the same depth, so it
is not clear what the correct estimate of d is.  Taking the average
is WRONG: consider a search tree that looks like a long list of
length d.  Then b = 2 and the average depth is d/2, but the number
of nodes is 2d - 1, which is linear, and NOT the exponential
2^(d/2).

In any event, assume there are 10 DISTINCT legal states, but 58
apparent states due to redundancy in the way they are specified.  If
the program allows only simple paths, no path is longer than 9, and
we get 58^9 as a crude upper bound.  That can be improved to
43 * 18^9 by assuming the redundancy factor is always 6.  (See
below.)  Well, that number is still quite unacceptable, and shows
the high price of redundancy.

Without any redundancy in the statement of the 10 legal states,
8 * 3^9 looks like an upper bound, assuming the (naive) rules

    canGo(Goal, Goal, _).
    canGo(S1, Goal, Visited) :-
        legal(S2),
        linked(S1, S2),
        not member(S2, Visited),
        canGo(S2, Goal, [S2 | Visited]).

(These rules assume "legal" is a "generator" and "linked" is a
"tester".  If "linked" can be formulated as a "generator", we may be
able to do better.  See below.)

I got the 8 * 3^9 estimate as follows: from each internal node in
the search, there are 10 branches to try -- one per legal state.  In
this problem, exactly 4 are linked to S1, the current state.  Thus 7
branches fail immediately (i.e. within the rule): 6 because they are
not linked to the current state, and (at least) 1 because it goes
back to the state visited just before S1.  (Notice how all the
"domain knowledge" enters the analysis; we are using the knowledge
that the links are symmetric.)  So charge the node 8: 1 for itself
and 1 for each of its 7 predictable failures.  The "true" branching
factor is at most 3.  If legal states are specified redundantly by a
factor of 6, then we multiply both the 7 and the 3 by 6, and get
43 * 18^9, which is about 6^10 times the non-redundant figure.

Actually, Mario's final program is even better than (1+7) * 3^9,
because he is able to make "linked" a "generator" and eliminate the
"legal" subgoal.  Thus the estimate is (1+1) * 3^9, because he
generates 4 LINKED states and fails on (at least) 1.  Careful
programming saves a factor of 4 right there.  While the search tree
is actually much smaller because of the regular structure of
"linked", the bound of 2 * 3^9 is about 40,000.

-------

This illustrates the danger that search algorithms will get out of
hand.  Recall that if there are d states and the branching factor is
b, we get an estimate of b^d nodes.  This is huge for moderate-sized
d, even if b is only 2 or 3.

To achieve a quantum jump in (guaranteed) efficiency, we need to
switch to a depth-first search (or breadth-first search) that
remembers ALL the visited nodes, not just the ones on the currently
successful path.  While this can easily be done (for bounded-degree
graphs) in linear time in a language with arrays and assignments, in
Prolog it is not so easy, and may take quadratic time.  If you use
"assert" to remember, it depends on how fast the interpreter can
retrieve an asserted fact, something over which the user has no
control.  Even so, d^3 is less than 2^d for problems beyond "toy"
size.
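[ For concreteness, here is one way to write the search that
  remembers ALL visited states, using assert as suggested.  The
  names reach/2, dfs/2 and seen/1 are illustrative; linked/2 is the
  problem's adjacency relation as in the programs above. ]

    % reach(Start, Goal): depth-first search that marks every state
    % it reaches with a seen/1 fact.  The marks survive backtracking,
    % so no state is ever expanded twice.
    :- dynamic seen/1.

    reach(Start, Goal) :-
        retractall(seen(_)),
        assertz(seen(Start)),
        dfs(Start, Goal).

    dfs(Goal, Goal).
    dfs(S1, Goal) :-
        linked(S1, S2),
        \+ seen(S2),            % global check, not just the current path
        assertz(seen(S2)),      % remember it even if this branch fails
        dfs(S2, Goal).

[ With bounded degree, each state is expanded at most once, so the
  number of dfs/2 calls is linear in the number of states; whatever
  extra cost remains comes from the seen/1 lookups themselves, which
  is the part the user cannot control. ]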
------------------------------

Date: 13 Aug 1986 07:20-EST
From: Saumya Debray <debray%suny-sb@csnet-relay>
Subject: Parsing Prolog's ->

I got the following message from Fernando Pereira, which I think is
a reasonable defense for the way Prolog's -> is currently parsed:

From: Fernando Pereira <pereira%sri-stinson.arpa>
Subject: Manipulating Programs with `->'

The explanation in terms of cut was just a "crutch".  Maybe a better
explanation is in terms of a guarded command.  The goal

    ( C1 -> G1 ; ... ; Cn -> Gn )

is analogous to the Dijkstra guarded command

    [ C1 -> G1 [] ... [] Cn -> Gn ]

and also to the Lisp COND

    (COND ((C1) G1) ... ((Cn) Gn))

The alternative syntax makes the goal analogous to

    [ C1 -> G1 [] true -> [ C2 -> G2 [] true -> ... [ Cn -> Gn ] ... ] ]

which is a different beast altogether.  The latter reading is fatally
sequential, whereas the former has at least a potential for parallel
evaluation of the guards.

-- Fernando

------------------------------

End of PROLOG Digest
********************
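[ A small illustration of the committed-choice reading Fernando
  describes: the predicate sign/2 below is invented for the example
  and appears nowhere in the digest. ]

    % sign(X, S): the goal ( C1 -> G1 ; C2 -> G2 ; G3 ) commits to
    % the first guard that succeeds, like a guarded command whose
    % guards happen to be tried in textual order.
    sign(X, S) :-
        (   X < 0   -> S = negative
        ;   X =:= 0 -> S = zero
        ;   S = positive
        ).

    % ?- sign(-3, S).   gives S = negative, with no backtracking
    % into the other guards.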