[comp.lang.prolog] An example from "Knowledge Systems & Prolog"

ok@quintus (Richard A. O'Keefe) (07/01/88)

Yesterday I bought a copy of
	[A Logical Approach to Expert Systems and Natural Language Processing]
	Knowledge Systems and Prolog
	Adrian Walker, Michael McCord, John Sowa, & Walter Wilson
	Addison-Wesley 1987
	ISBN 0-201-09044-9

I haven't had time to read much of it yet.
There's some rather good advice in it, and it is easily the most
powerful argument I have ever seen for Edinburgh syntax.

For now I'd just like to comment on one particular example, found
found on page 413.  I'll transliterate it into Edinburgh syntax.

%   most_general_terms(Terms, GeneralTerms)
%   is true when GeneralTerms is the set of terms in Terms which are
%   subsumed by no other term in Terms.

most_general_terms(Terms, GeneralTerms) :-
	qsort(Terms, TermsInOrder),
	most_general_terms_in_order(TermsInOrder, GeneralTerms).

%   most_general_terms_in_order(TermsInOrder, GeneralTerms)
%   is just like most_general_terms/2, except that less general terms
%   precede more general terms in TermsInOrder (that is, if X and Y
%   are both in TermsInOrder and X subsumes Y, X must follow Y).  The
%   order of terms in the result is the same as the order in the input.

most_general_terms_in_order([], []).
most_general_terms_in_order([Term|Terms], GeneralTerms) :-
	member(MoreGeneral, Terms),
	subsumes(MoreGeneral, Term),
	!,
	most_general_terms_in_order(Terms, GeneralTerms).
most_general_terms_in_order([Term|Terms], [Term|GeneralTerms]) :-
	most_general_terms_in_order(Terms, GeneralTerms).

%   This version of quicksort is supposed to order its input so that
%   if X and Y are both given and X subsumes Y, X will follow Y in
%   the output.

qsort(Terms, TermsInOrder) :-
	qsort(Terms, TermsInOrder, []).

qsort([], L, L).
qsort([Term|Terms], L0, L) :-
	partition(Terms, Term, Before, After),
	qsort(Before, L0, [Term|L1]),
	qsort(After, L1, L).

partition([], _, [], []).
partition([Term|Terms], Pivot, Before, [Term|After]) :-
	subsumes(Term, Pivot),
	!,
	partition(Terms, Pivot, Before, After).
partition([Term|Terms], Pivot, [Term|Before], After) :-
	partition(Terms, Pivot, Before, After).

%---- end of first version

Now, my antipathy to (falsely so-called) "quicksort" is well known by now.
But in this case its quadratic worst case hardly matters, because if there
are N terms initially, most_general_terms_in_order/2 will do O(N**2) calls
to subsumes/2 anyway.  So we might as well do

most_general_terms(Terms, GeneralTerms) :-
	most_general_terms(Terms, [], GeneralTerms).

%   At each call of most_general_terms(T, G, X)
%   there is a list L such that append(L, T, Terms) and
%   most_general_terms(L, G).

most_general_terms([], GeneralTerms, GeneralTerms).
most_general_terms([Term|Terms], GeneralTerms0, GeneralTerms) :-
	knock_out(GeneralTerms0, Term, GeneralTerms1),
	most_general_terms(Terms, GeneralTerms1, GeneralTerms).

knock_out([], Pivot, [Pivot]).
knock_out([Term|Terms], Pivot, GeneralTerms) :-
	subsumes(Pivot, Term),
	!,
	knock_out(Terms, Pivot, GeneralTerms).
knock_out([Term|Terms], Pivot, [Term|Terms]) :-
	subsumes(Term, Pivot),
	!.
knock_out([Term|Terms], Pivot, [Term|GeneralTerms]) :-
	knock_out(Terms, Pivot, GeneralTerms).

%---- end of second version

This has the nice property that the order of terms in GeneralTerms is
exactly the same as the order of terms in the original Terms list,
apart from the deletion of subsumed terms.  It does at most N(N-1)
calls to subsumes/2, and while the version of Walker et alia does
at most (1/2)N(N-1) such calls in most_general_terms_in_order/2,
it may do a similar number in partition/4.  Indeed, because subsumes/2
is a partial order (not a total order), it is likely to fail rather more
often than it succeeds, so partition/4 is likely to put most of its
first argument into the Before list, and quadratic behaviour is very
likely.  (In fact, whenever subsumes(Term, Pivot) succeeds, that tells
us that Pivot will not be in the final result, so we might as well drop
it.)

Actual measurements show that the two versions do about the same number
of calls to subsumes/2: both tend to be close to N(N-1) calls.  Sometimes
the "unsorted" method does much better than the "sorted" one, sometimes it
does a little worse.


This is actually a very interesting problem.  It crops up in all sorts of
guises.  I currently have an algorithm which does at most N(N-1) calls to
subsumes/2 and reduces to at most 2(N-1) when subsumes/2 happens to be
total.  If anyone knows of a better algorithm I would be _very_ interested
to hear of it.

micha@ecrcvax.UUCP (Micha Meier) (07/07/88)

In article <149@quintus.UUCP> ok@quintus (Richard A. O'Keefe) writes:
>This is actually a very interesting problem.  It crops up in all sorts of
>guises.  I currently have an algorithm which does at most N(N-1) calls to
>subsumes/2 and reduces to at most 2(N-1) when subsumes/2 happens to be
>total.  If anyone knows of a better algorithm I would be _very_ interested
>to hear of it.

	I actually have no better algorithm, but I've implemented
	a predicate compare_instances(Order, Term1, Term2) which is
	similar to compare/3, and this makes it possible to get the
	relation between two terms using only one call (which makes
	only one pass through the terms). This obsoletes any sorting
	of the initial list. I'd be interested as well in an algorithm
	that performs better than the blind one.


--Micha