[comp.lang.prolog] Novice RETRACT question

ssmith@ncdel.enet.dec.com (Sheldon E. Smith) (07/31/90)

Hello.  I'm teaching myself Prolog (Why? Because I *want* to.   8^}   ) 
I'm playing with a program that generates a *load* of facts.  Many of
the facts are "time-stamped", so quite a few facts are duplicate or obsolete. 

While I have the _Clocksin & Mellish, 2nd ed._ book, I haven't found a
*really good* tutorial.  What I'd *really* like is a book of Prolog
algorithms, but I'll defer *that* to another article.

In any event, I have facts of the form
	object(Time,X,Y)
where X and Y are locations, and Time is the time-stamp when the object
was observed.  That is, given the facts
	object(1,12,34).
	object(2,12,34).
	object(3,12,34).
I'd like to retract "object(1,12,34)" and "object(2,12,34)".

I'd have thought that something like
reduce_objects :- object(I,X,Y),object(J,X,Y),
		I < J,
		retract(object(I,X,Y)).

What am I doing wrong?  Am I even on the right track?

Thanks in advance for any help,
------------------------------------------------------------------------
---------
Sheldon E. Smith			!Email:
Digital Equipment Corporation		! ssmith%ncdel.enet.dec@decwrl.dec.com
Minneapolis, Minnesota			! ...!decwrl!ncdel.enet.dec.com!ssmith
========================================================================
=========

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (07/31/90)

In article <1939@engage.enet.dec.com>, ssmith@ncdel.enet.dec.com (Sheldon E. Smith) writes:
> What I'd *really* like is a book of Prolog
> algorithms, but I'll defer *that* to another article.
(Hmm.  What would you pay for it?  What do you want to be in it?)

> In any event, I have facts of the form
> 	object(Time,X,Y)
> where X and Y are locations, and Time is the time-stamp when the object
> was observed.  That is, given the facts
> 	object(1,12,34).
> 	object(2,12,34).
> 	object(3,12,34).
> I'd like to retract "object(1,12,34)" and "object(2,12,34)".

This is where data base references come in extremely handy, if
your Prolog system supports them.  Suppose you want to retract
everything before a particular Time:

	retract_objects_before(Time) :-
		forall((
		    clause(object(T,_,_), _, Ref),
		    T < Time
		), (
		    erase(Ref)
		)).

If your Prolog lacks the forall/2 structure, you might do

	retract_objects_before(Time) :-
		\+ (
		    clause(object(T,_,_), _, Ref),
		    T < Time,
		    \+ erase(Ref)
		).

which comes to the same thing.  (The jargon for this kind of structure
is a ``failure-driven loop''.)

Something marginally cleaner employs an auxiliary predicate:

	erase_refs([]).
	erase_refs([Ref|Refs]) :-
		erase(Ref),
		erase_refs(Refs).

which you only have to write once, and then uses a call to setof/3:

	retract_objects_before(Time) :-
		setof(Ref,
		    X^Y^B^( clause(object(T,X,Y), B, Ref), T < Time ),
		    Refs),
		erase_refs(Refs).

Another approach might be to write an erase_all/2 looking rather like
findall/3 but erasing things instead of returning them.  (Algol 60
fans:  do you recognise Jensen's device?)

	erase_all(Ref, Generator) :-
		forall(Generator, erase(Ref)).

and then call

	erase_all(Ref, /* where */
		( clause(object(T,_,_), _, Ref), T < Time ))

> I'd have thought that something like
> reduce_objects :- object(I,X,Y),object(J,X,Y),
> 		I < J,
> 		retract(object(I,X,Y)).

> What am I doing wrong?

Not a lot, actually, except for ugly layout.  The main problem
is that retract/1 removes *ONE* clause each time it succeeds; here
you want to remove ALL the old clauses.  Your code could be
reshaped a bit to

	retract_old_objects :-
		object(J, X, Y),	% for each timestamped object
		object(I, X, Y),	% for each object at that same place
		I < J,			% if I is older than J
		retract(object(I,X,Y),	% delete it
		fail			% and look for some more
	    ;	true.			% succeed when we've run out.

I'm a wee bit unhappy about this.  Surely an object ought to have an
identity of its own, other than its location at some time.  How do you
say that the object which is at (27,33) at time 98 is the same object
as the object which is at (28,32) at time 99?

A rule of thumb for checking that your program makes sense is that
whenever you have a call to retract/1
    - either it is in a loop
    - or you have identified a UNIQUE clause you want to remove
-- 
Science is all about asking the right questions.  | ok@goanna.cs.rmit.oz.au
I'm afraid you just asked one of the wrong ones.  | (quote from Playfair)

milgr@teapot.prime.COM (Marc Milgram) (08/01/90)

In article <1939@engage.enet.dec.com>, ssmith@ncdel.enet.dec.com
(Sheldon E. Smith) writes:
|>I'm playing with a program that generates a *load* of facts.  Many of
|>the facts are "time-stamped", so quite a few facts are duplicate or
obsolete. 

|>reduce_objects :- object(I,X,Y),object(J,X,Y),
|>		I < J,
|>		retract(object(I,X,Y)).


Alternately, you could do this at assert time with:

assert_unique_object(object(I, X, Y)):-
	not(( retract(object(_, X, Y), fail)),
	assert(object(I, X, Y)).

|>------------------------------------------------------------------------
|>---------
|>Sheldon E. Smith			!Email:
|>Digital Equipment Corporation		! ssmith%ncdel.enet.dec@decwrl.dec.com
|>Minneapolis, Minnesota			! ...!decwrl!ncdel.enet.dec.com!ssmith
|>========================================================================
|>=========
                                

Marc Milgram		Email: milgr@teapot.prime.com
Prime Computer
Framingham, MA

umace03@doc.ic.ac.uk (M Y Ben Gershon) (08/01/90)

> Hello.  I'm teaching myself Prolog ...
>   ...
> While I have the _Clocksin & Mellish, 2nd ed._ book, I haven't found a
> *really good* tutorial.

Why not try 'The Art of Prolog', by Leon Sterling and Ehud Shapiro,
published by MIT Press.  Very good, sets a high standard whilst remaining
readable, and has lots of good examples.

Michael Ben-Gershon
umace03@doc.ic.ac.uk

ssmith@ncdel.enet.dec.com (Sheldon E. Smith) (08/02/90)

Thanks to all who've responded (both here and via E-mail).   Yes, it
occurred to me that evening (after posting) that I'd also need a
backtracking loop.  I checked the following morning, and, sure enough,
it *had* worked; I simply had too many facts to see that one was gone. 
BTW, I'm using the Edinburgh C-Prolog interpreter.
I have it running now.  FWIW, I have a *large number* of Object rules,
of which about 20% are obsolete.  It's taking about two hours of CPU
time on a 8820 to reduce.  Would the usage of SETOF/BAGOF be faster?

A couple of you mentioned the use of SETOF or BAGOF as an alternative
method of locating the redundant facts.  How can I put this: "That went
*way* over my head", or to summarize, "Huh??"  I'm still trying to
understand SETOF/BAGOF.  Perhaps I've had too many years of structured
(3rd generation) programming, or maybe it's just that I can be *dense*!   8^)
One proffered example (that I *almost* understand) was from Bart Demoen:
	reduce_objects :- bagof(object(I,X,Y), J^(object(I,X,Y), object(J,X,Y),
I<J), L),
			delete_objects_in_list(L).
	delete_objects_in_list([]).
	delete_objects_in_list([Obj|T]) :- retract(Obj), delete_objects_in_list(T).
I *do* understand what "delete_objects_in_list" is doing.   8^)
Would some *kind* soul translate/comment *profusely* the BAGOF call?

Thanks in advance,
------------------------------------------------------------------------
---------
Sheldon E. Smith			!Email:
Digital Equipment Corporation		! ssmith%ncdel.enet.dec@decwrl.dec.com
Minneapolis, Minnesota			! ...!decwrl!ncdel.enet.dec.com!ssmith
========================================================================
=========

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (08/03/90)

> I'm still trying to understand SETOF/BAGOF.

It will help youu a lot if you stop writing the names as if they
were variable names.  The operations are setof/3 and bagof/3.

> Perhaps I've had too many years of structured
> (3rd generation) programming,

What a strange use of the word "structured".  setof/3 and bagof/3 are
just loops that collect a result.  Having a loop that accumulates some
kind of result is just about the commonest thing there is in old-style
languages (apart from assignment statements and mistakes).

Start with findall/3:

	findall(Template, Generator, ListOfInstances) :-
		for each *proof* of Generator,
		    save away the corresponding instance of Template
		when no more proofs of Generator can be found,
		    pick up all the saved instances of Template
		    as a list, the elements occurring in the order
		    in which they were found,
		    and unify ListOfInstances with the result.

That's the operational end.
Or start with setof/3:

	setof(Template, Generator, Set) :-
		Set = {T | T is an instance of Template
			   which corresponds to a solution of Generator }
		and Set is not {}

That's the (nearly) logical end.
bagof/3 is a rather peculiar hybrid of findall/3 and setof/3 which should
only be used when you know that each proof of Generator is going to yield
a DIFFERENT instance of Template.

> One proffered example was from Bart Demoen:
> 	reduce_objects :-
>		bagof(object(I,X,Y),
>			J^( object(I,X,Y), object(J,X,Y), I < J ),
>			ListOfObjectInstances),
>		delete_objects_in_list(ListOfObjectInstances).
>
> 	delete_objects_in_list([]).
> 	delete_objects_in_list([Obj|T]) :-
>		retract(Obj),
>		delete_objects_in_list(T).

Given that you are using C Prolog, you should *NOT* use
delete_objects_in_list/1.  I'm sure it works *perfectly* in BIM Prolog,
but I can tell you that it will be very inefficient in C Prolog.

(1) When you call retract(object(137,12,10)), for example, C Prolog does
    no indexing at all, so C Prolog will do a LINEAR search through the
    whole table of object/3 facts looking for this one.

(2) What's more, because it doesn't do any indexing, C Prolog doesn't
    notice that there is only one fact that looks like that.  (Indeed,
    you have given _us_ no reason to believe that there is only one
    such fact.  You may possibly _intend_ that there should be only
    one such fact, but without seeing your program we can't be sure
    that it doesn't mistakenly create duplicate facts.)  So C Prolog
    leaves a choice point behind.  For EVERY fact that you delete this
    way, C Prolog is going to leave a choice point behind, ready to
    resume searching for other similar facts to delete.  You can fix
    that by doing

	retract_facts([]).
	retract_facts([Fact|Facts]) :-
		retract_first(Fact),
		retract_facts(Facts).

	retract_first(Fact) :-
		retract(Fact),		% *MUST* set up a choice point
		!.			% which *we* know isn't needed.

(3) Even if C Prolog _did_ do perfect indexing, and _did_ notice that
    there was only one matching fact, it would STILL have to leave a
    choice point behind, ready to resume its search and destroy task.
    Why?  Because of the semantics of assert/1 and retract/1 in DEC-10
    Prolog and C Prolog.  In those dialects, it is DEFINED that
	if you have a call to a dynamic predicate
	and that predicate is changed by assert/1 or retract/1
	then the predicate will notice the change AT ONCE
    So, if I do

	q(a).

	p :-
		retract(q(a)),
		write(retracted), nl,
		assert(q(a)),
		fail.

    and call

	?- p.

    DEC-10 Prolog and C Prolog are REQUIRED write "retracted" TWICE.
    The call to retract/1 is going to sit there and wait IN CASE another
    matching clause is SUBSEQUENTLY added.  So in C Prolog, *every*
    successful call to retract/1 *must* leave a choice point behind,
    however many potential matches for its argument happen to remain
    right at the moment.

(4) There's worse.  I posted in this newsgroup a solution similar to
    Bart Demoen's, except that I used erase/1.  There is an extremely
    good reason for that.

    DEC-10 Prolog and C Prolog use an implementation method called
    "structure sharing".  That means that when you pick up an instance
    of a clause (by calling the predicate, or by calling clause/[2,3]
    on it, OR BY CALLING retract/1 ON IT) the data structure which
    represents this instance is a mixture of information held on the
    stacks AND information in the clause itself (a so-called skeleton).
    Because part of the representation of this instance comes from the
    clause itself,

	in a structure-sharing implementation of Prolog,
	the space for a clause cannot be reclaimed as long as
	there is a term in the stacks which refers to it

    and

	retract/1 always creates a reference to the clause it is
	marking as unwanted

    so

	in DEC-10 Prolog and C Prolog, the success of

		retract(Fact)

	means that the clause which unified with Fact is *NAILED DOWN*
	and the space CANNOT be reclaimed until this call is failed back
	over.  This is because finding the clause to be marked involves
	UNIFYING Fact with the clause, and such a unification creates a
	reference to the clause.

    This means that if you use delete_objects_in_list/1 in C Prolog
							===========
    you are GUARANTEED that the space for these clauses CAN'T be
    reclaimed until you fail back over the call to delete_objects_in_list/1.

    So here is how to code that part of it so that it doesn't leave behind
    any useless choicepoints AND doesn't leave the clauses nailed down so
    that they can't be reclaimed:

	retract_facts([]).
	retract_facts([Fact|_]) :-		% the "fail" here means no
		retract(Fact),			% choice point is left AND
		fail.				% the clause be be reclaimed
	retract_facts([_|Facts]) :-
		retract_facts(Facts).

We're left with the top level call:

> 	reduce_objects :-
>		bagof(object(I,X,Y),
>			J^( object(I,X,Y), object(J,X,Y), I < J ),
>			ListOfObjectInstances),
>		retract_facts(ListOfObjectInstances).

Here we observe an interesting point.  Suppose the object/3 table
contains
	object(1, 0, 0).
	object(2, 0, 0).
	object(3, 0, 0).

Let's find ALL the solutions to the query

	object(I, X, Y), object(J, X, Y), I < J

a.	I = 1, X = 0, Y = 0, J = 2
b.	I = 1, X = 0, Y = 0, J = 3
c.	I = 2, X = 0, Y = 0, J = 3

Basically, when you put J^( ... ) around that goal in the call to bagof/3,
you are telling bagof/3 to ignore bindings for that variable.  So the
bindings bagof/3 is going to pay heed to are those for I, X, and Y.  We get

	bindings		ignored		instance
a.	I = 1, X = 0, Y = 0	J = 2	     {	object(1,0,0)
b.	I = 1, X = 0, Y = 0	J = 3	     {	object(1,0,0)
c.	I = 2, X = 0, Y = 0	J = 3	     {	object(2,0,0)
                                             /
bagof/3 collects these answers--------------- and puts them in a list,
so the final answer we get from the call to bagof/3 is that
	I, J, X, Y are left unbound
	ListOfObjectInstances = [object(1,0,0),object(1,0,0),object(2,0,0)]

Notice something about that?  object(1,0,0) occurs twice in it.  (A copy of)
each object is going to occur as many times in the list as there are
more recent objects at the same place; if we have
	object(1, x, y).  object(2, x, y). ... object(N, x, y).
we are going to get (N-1) + (N-2) + ... + 0 = (1/2).N.(N-1) copies of
object(?, x, y) terms in the list.

Now, suppose we call
	?- delete_objects_in_list(object(1,0,0),object(1,0,0),object(2,0,0)]
That's going to do
	retract(object(1,0,0))	-- succeeds and removes fact
	retract(object(1,0,0))	-- FAILS as the fact has already gone!
	retract(object(2,0,0))	-- never reached
So if the data base can look like
	object(1, 0, 0).
	object(2, 0, 0).
	object(3, 0, 0).
(and we have been given no reason to suppose that it *can't*) then the
code using bagof/3 WILL NOT WORK because bagof/3 allows duplicate terms
in the list of answers.

The right thing to do is to use setof/3:

	reduce_objects :-
		setof(object(I,X,Y),
			J^( object(I,X,Y), object(J,X,Y), I < J ),
			SetOfObjectInstances),
		retract_facts(SetOfObjectInstances).

This will find the same solutions as bagof/3, but will then sort the
list to remove duplicates (sorting is the means, removing duplicates
is the purpose), so retract_facts/1 is not going to be given duplicate
facts to remove.  (As it happens, retract_facts/1 doesn't mind duplicate
facts, but that was good luck.)

This illustrates one of the themes of using setof/3 and bagof/3:

	IT IS VERY SELDOM A GOOD IDEA TO USE bagof/3.

Something which still makes me unhappy about the whole thing is the
high cost of finding the object/3 facts to be deleted in the first place.
If there are N object/3 facts, the search as written is like
	for i := 1 to N do
	    unify object(I,X,Y) with <object.3.table>[i]
	    for j := 1 to N do
		unify object(J,X,Y) with <object.3.table>[j]
		if unification succeeded then
		    if I < J then
			save a copy of object(I,X,Y)
which is clearly O(N**2) cost.

Let's try something else.  For each X,Y, let's find the set
	XYIs = {I | object(I,X,Y) is true}
Then every element of that set is to be deleted, except the maximum.

	reduce_objects :-
	    forall(
		setof(J, object(J,X,Y), [I|Is])
	    ,	delete_objects(Is, I, X, Y)
	    ).

	%   Use the next three clauses in a structure-sharing system
	%   or if an object/3 fact may be duplicated

	delete_objects([], _, _, _).	% maximum; leave it
	delete_objects([_|_], I, X, Y) :-
		retract(object(I,X,Y)),
		fail.
	delete_objects([I|Is], _, X, Y) :-
		delete_objects(Is, I, X, Y).

	%   Use the next two clauses in a structure-copying system
	%   when object/3 facts cannot be duplicated.

	delete_objects([], _, _, _).	% maximum; leave it
	delete_objects([I|Is], J, X, Y) :-
		retract(object(J,X,Y)),	% some Prologs index enough,
		!,			% but some don't,
		delete_objects(Is, I, X, Y).

How much does this cost?  Well, it picks up every object/3 clause once,
at a cost of O(N).  What it actually picks up is pairs (X,Y)-I.  It
then sorts these pairs using keysort/2, so that it can group them into
blocks with the same values of X and Y.  The sort costs O(N.lg N) time.
Grouping the sorted solutions into blocks X, Y, Is costs O(N) time, and
then deleting the unwanted clauses costs O(N) time, so the grand total
is O(N.lg N).  Yes, the sort dominates the cost here, but N.lg N is
quite a lot better than N**2.

-- 
Distinguishing between a work written in Hebrew and one written in Aramaic
when we have only a Latin version made from a Greek translation is not easy.
(D.J.Harrington, discussing pseudo-Philo)

crich@holly.axion.bt.co.uk (Clive Rich) (08/03/90)

> Hello.  I'm teaching myself Prolog ...
>   ...
> While I have the _Clocksin & Mellish, 2nd ed._ book, I haven't found a
> *really good* tutorial.

Another Book is well worth a look is Ivan Bratko
Proramming in Prolog

Also Clocksin & Mellish has had further editions since your copy.
It is now in fact a completely different animal. 

Regards Clive Rich

If dont believe in any so-called witty sayings