[comp.lang.prolog] Propositional Theorem Prover

anv@bnr-di.UUCP (Andre Vellino) (05/06/89)

About a year ago there was some discussion on the net which left the 
impression that Prolog was not as good a language as C for writing 
resolution theorem provers.  This impression was due to the 
benchmark results of two implementations (one in C and one in 
Prolog) of the same "set-of-support strategy" resolution algorithm.  
On a Sun-3, some contradictions took the C version about 1.4 seconds 
to refute whereas the Prolog version took between 30 seconds and 2 
minutes.  Some optimizations were done on the Prolog code by a 
number of people and the discrepancy was reduced by a factor of 5, 
but the resulting code was still about an order of magnitude slower 
than the C version.  For another contradiction, the propositional 
formulas that express the idea that 5 objects can't fit into 4 holes, 
which is known to belong to a class of problems that are hard for 
*every* resolution proving method (Haken 1985), the C version took 
between 10 and 15 minutes on a Sun-3.  The Prolog version took too 
long to be worth measuring.

The moral of the story drawn by the implementors was that since 
the C and Prolog versions of this algorithm performed the same 
computations, something about Prolog, viz. the lack of destructive 
assignment, was probably at fault.

The purpose of the exercise, of course, was not to write the best 
theorem prover in each language but to compare similar operations 
in two languages that have radically different programming models.  
However, I would suggests that another conclusion be drawn: 
different programming paradigms demand different algorithms.  
While this is perhaps obvious to some, I thought that the following 6-
line theorem prover in (pure) Prolog nicely illustrates the point.

For the examples that were used in the benchmark exercise, this 
analytic tableau theorem prover is several orders of magnitude 
faster than even the 'C' version of the set-of- support resolution 
prover.  However, it is not necessarily a better theorem proving 
method.  I know that there are examples which are exponentially  
harder for Analytic Tableau than for Resolution (by which I mean 
that the number of lines in the resolution proof, as opposed to the 
number of computations needed to find it, is vastly shorter than the 
size of the shortest analytic tableau).  So I wouldn't at all be 
surprized if there are some examples that defeat this analytic 
tableau program but are feasibly provable using set-of-support 
resolution.

The idea behind this theorem prover is a variation on the Analytic 
Tableau method (Smullyan 1968).  Since it is the variation on this 
method that makes this program relatively efficient, it deserves 
some explanation.  

A conjunction of disjunctive formulas is satisfiable if each disjunction 
can be satisfied.  We represent each disjunction as a list of literals 
and the conjunction as a list of disjunctions. For each disjunctive 
formula an attempt is made to provide a satisfying assignment to at 
least one propositional variable in that formula.  If we represent 
each propositional variable by a (first-order) logic variable in Prolog, 
this truth-value assignment is propagated to all the occurrences of 
this propositional variable in every formula simply by unification.

A formula is disjunction (list) of literals of the form              
literal(NegOrPos,Variable) where NegOrPos indicates the presence or 
absence of a negation sign in front of the Variable.  For example the 
clause ((a v ~b) & (b v ~c) is represented as         
[[literal(true,A),literal(false,B)],[literal(true,B),literal(false,C)]].   A 
formula is satisfied if a variable can be unified with its prefix (true 
or false). 

The first two lines are easy.  A conjunction (list) of formulas is 
satisfiable if each formula can be satisfied.

satisfiable([]).
satisfiable([F|Formulas]) :- satisfied(F),satisfiable(Formulas).

The naive program to check that a disjunctive formula is satisfied is 
analogous to member/2:

satisfied([literal(T,T)|_]).         % the first disjunct is satisfied
satisfied([_|R]) :- satisfied(R).    % or one of the rest is

(This program, incidentally, is equivalent in complexity to 
L. Pereira's theorem prover in "Prolog by Example" by Coelho and 
Cotta (1988) a version of which is reproduced at the end of this 
article.)

There are two major problems with this program:
(1) it will explore unnecessary truth-value assignments for clauses 
that are already true in virtue of a previous truth-value assignment 
to another literal.
(2) it does not take advantage of the fact that once it has shown that 
a truth-value assignment for a literal in a clause cannot be used to 
satisfy the other clauses then only its negation can.

The beauty of it is that *both* these optimizations are 
implemented immediately by the following modifications:

satisfied([literal(T,T)|_]).
satisfied([literal(TV,V) | Literals]):- negation(TV,V),satisfied(Literals).

negation(true,false).
negation(false,true).

If the attempt to satisfy a disjunctive formula (the first clause for 
satisfied/1) fails then the literal is falsified (the truth-value 
reversed) and another literal in that formula must be satisfied.  Thus, 
if the second clause to satisfied/1 is called with a variable already 
bound to a satisfying assignment, then looking at the others is 
unnecessary, which is guaranteed by the failure of the guard 
negation/2 (optimization 1).  And if satisfied/1 was originally called 
with an unbound variable and backtracking causes the second clause 
for satisfied to be called, then the literal can have its truth-value 
reversed by the call to negation/2 (optimization 2).

Prolog systems with indexing may benefit further by folding 
in negation/2 into the head of satisfy/1, 

satisfied([literal(T,T)|_]).
satisfied([literal(true,false) | Literals]):- satisfied(Literals).
satisfied([literal(false,true) | Literals]):- satisfied(Literals).

making it a 5 line program.

It turns out that, for minimally inconsistent sets of formulas like the 
5 objects in 4 holes problem, this program is a couple of orders of 
magnitude faster (about 10 seconds on a Mac-II using a 5K-lips 
Prolog) than the C program that implements the set-of-support 
resolution prover on a Sun-3.  Of course, they _are_ different 
algorithms and this one is not optimized to handle non-minimally 
inconsistent sets.

What this may illustrate is that while a procedural language like C 
probably lends itself well to deterministic algorithms like set-of-
support resolution, Prolog is better suited to non-deterministic 
semantic techniques.

BIBLIOGRAPHY

Smullyan, R., (1968). _First Order Logic_ Springer-Verlag, New 
York,.Haken, A. (1985).  "The Intractability of Resolution" 
_Theoretical Computer Science_ 39 pp.297-308.

Andre Vellino
Computing Research Laborartory           TEL: (613) 763-7514
Bell-Northern Research                   FAX: (613) 763-4222
P.O. BOX 3511 Station C                  BITNET: vellino@bnr.ca
Ottawa, Ontario                          UUCP: utzoo!bnr-vpa!bnr-di!anv
Canada K1Y 4H7                                 @ ai.utoronto

% --------------------------- cut here ------------
satisfiable( [] ).
satisfiable([F|Formulas]) :- satisfied(F), satisfiable(Formulas).

satisfied([literal(T,T)|_]).
satisfied([literal(TV,V) | Literals]):-negation(TV,V),satisfied(Literals).

negation(true,false).
negation(false,true).

%%%    5 objects can't fit into 4 holes !
formulas( [ 
  [literal(true,_a),literal(true,_b),literal(true,_c),literal(true,_d)],
  [literal(true,_e),literal(true,_f),literal(true,_g),literal(true,_h)],
  [literal(true,_i),literal(true,_j),literal(true,_k),literal(true,_l)],
  [literal(true,_m),literal(true,_n),literal(true,_o),literal(true,_p)],
  [literal(true,_q),literal(true,_r),literal(true,_s),literal(true,_t)],
  [literal(false, _a), literal(false, _e)],
  [literal(false, _a), literal(false, _i)],
  [literal(false, _a), literal(false, _m)],
  [literal(false, _a), literal(false, _q)],
  [literal(false, _b), literal(false, _f)],
  [literal(false, _b), literal(false, _j)],
  [literal(false, _b), literal(false, _n)],
  [literal(false, _b), literal(false, _r)],
  [literal(false, _c), literal(false, _g)],
  [literal(false, _c), literal(false, _k)],
  [literal(false, _c), literal(false, _o)],
  [literal(false, _c), literal(false, _s)],
  [literal(false, _d), literal(false, _h)],
  [literal(false, _d), literal(false, _l)],
  [literal(false, _d), literal(false, _p)],
  [literal(false, _d), literal(false, _t)],
  [literal(false, _e), literal(false, _i)],
  [literal(false, _e), literal(false, _m)],
  [literal(false, _e), literal(false, _q)],
  [literal(false, _f), literal(false, _j)],
  [literal(false, _f), literal(false, _n)],
  [literal(false, _f), literal(false, _r)],
  [literal(false, _g), literal(false, _k)],
  [literal(false, _g), literal(false, _o)],
  [literal(false, _g), literal(false, _s)],
  [literal(false, _h), literal(false, _l)],
  [literal(false, _h), literal(false, _p)],
  [literal(false, _h), literal(false, _t)],
  [literal(false, _i), literal(false, _m)],
  [literal(false, _i), literal(false, _q)],
  [literal(false, _j), literal(false, _n)],
  [literal(false, _j), literal(false, _r)],
  [literal(false, _k), literal(false, _o)],
  [literal(false, _k), literal(false, _s)],
  [literal(false, _l), literal(false, _p)],
  [literal(false, _l), literal(false, _t)],
  [literal(false, _m), literal(false, _q)],
  [literal(false, _n), literal(false, _r)],
  [literal(false, _o), literal(false, _s)],
  [literal(false, _p), literal(false, _t)]   ] ). 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Truth-table technique used in L. Pereira's program (1976)

sat((A & B)) :-    sat(A), sat(B).
sat((A ; B)) :-    sat(A) ; sat(B).

sat(pos(true)).
sat(neg(false)).

bimbart@kulcs.uucp (Bart Demoen) (05/19/89)

To my surprise, I have not yet seen any reply to the contribution of
Andre Vellino  <157@bnr-di.UUCP> (perhaps because American people do not
bother to distribute their contribution to Europe)

Let me quote Andre Vellino's conclusion:

> What this may illustrate is that while a procedural language like C 
> probably lends itself well to deterministic algorithms like set-of-
> support resolution, Prolog is better suited to non-deterministic 
> semantic techniques.

If 'better suited' means that you can't write a faster C program implementing
the same non-deterministic algorithm,  then this is wrong: I have a 60 line C
program without tricks that is 5 times faster than the Prolog program presented
by Andre Vellino. I must admit though that my knowledge of what the WAM does to
the Prolog program, helped me in writing the C version.

If 'better suited' just means that it is easier to program in Prolog than in C,
then the article was an overkill: to arrive at this conclusion, there was no
need to have a non-deterministic theoremprover in Prolog that could perform a
certain proof FASTER than deterministic theoremprover in C !

I think the conclusion should have been that deterministic algorithms perform
better on some problems and non-deterministic algorithms perform better on
other problems ... which isn't new either.

The Prolog program can be sped up by a factor of 2 easely by changing the
representation of the formulas and by a small folding exercise.

bimbart@kulcs

anv@bnr-di.UUCP (Andre Vellino) (05/24/89)

Bart Demoen is quite right: my conclusion wasn't strong enough.  
What I should have concluded was that the C vs. Prolog debate had 
not been fair to Prolog.  Correct me if I am wrong, but I believe the C 
program was written first and the algorithm that it implemented, 
data-structures and all, were then translated into Prolog (with a 
not-terribly surprising drop in performance).  This didn't seem to me 
to be the right approach for comparing languages: Prolog is bound to 
lose in this sort of comparison if only because of the extra overhead.  

It seems to me that if you are going to compare two programming 
languages so as to provide reasons for preferring one to the other, 
you should measure how they compare at solving the same 
*problems*, not which performs faster with the same algorithm and 
the same data structures.  This is especially true when comparing 
two programming languages with different conceptual models.

In fact, what my little program really shows is that the improved 
analytic tableau method has a smaller search space (for some 
examples anyway) than set-of-support resolution (which spends a lot 
of its time doing useless subsumption checking).  But the point is, by 
Bart Demoen's own admission, one is more likely to discover this 
tableau method by thinking about how best to solve the problem in 
Prolog (and come up with a 6-line program) than one is to come up 
with a (60-line) C program that does the same thing 5 times faster.

I should add that I have nothing against the Prolog set-of-support 
theorem prover as a benchmark for comparing Prolog systems 
against each other.  I just wanted to contest the claim that the 
problem with Prolog is that it doesn't have a destructive assignment.
---------------------------------------------------------------------
Andre Vellino
Computing Research Laborartory           TEL: (613) 763-7514
Bell-Northern Research                   FAX: (613) 763-4222
P.O. BOX 3511 Station C                  BITNET: vellino@bnr.ca
Ottawa, Ontario                          UUCP: utzoo!bnr-vpa!bnr-di!anv
Canada K1Y 4H7                                 @ ai.utoronto
-----------------------------------------------------------------------
Disclaimer: the views expressed in this article are not necessarily endorsed
	    by BNR, ... etc.

mccaugh@s.cs.uiuc.edu (05/27/89)

 Since the consensus is that Prolog is "less efficient" than C, why consider a
 program written in Prolog any more acceptable a solution than BubbleSort is,
 vs. QuickSort? From my experience in "software shops", I can assure you that
 efficiency is a fundamental issue. A more salient point to take up would
 seem to be: how long does it take a programmer to craft a solution in Prolog
 vs. one in C? Probably a lot less time (at the risk of belaboring the obvious)
 and if architectures ever evolve to the Japanese plans for the 5th Generation,  then whither efficiency?