[net.lang.prolog] Some Thoughts On Assert & Retract

OKeefe.R.A.%EDXA@sri-unix.UUCP (09/30/83)

From:  Richard.HPS ( on ERCC DEC-10 ) <OKeefe.R.A.@EDXA> )

You may already have seen my article in SigPlan where I say how
horrible it is to use assert & retract and you really shouldn't do it.
[ Unless the program is explicitly supposed to maintain a data base,
in which case you have no real alternative.  That's my excuse anyway,
when people point the finger at records / recorded / erase in my
code. ] I thought it might be generally interesting if I pointed out
another pernicious effect of asserts and retracts on clauses.

It is a general principal of language design that if you don't
use a feature you shouldn't have to pay for it.  Arithmetic
expressions are like that; C Prolog version 1.2D.EDAI has 'succ' 
/ 2, and if you want to use a predicate that has some pretence of
reversibility, you can use succ / 2 instead of is / 2, and all you
pay is the code space for the expression evaluator.  So is the cut:
the information that '!' needs to have kept for it to work has to
be kept anyway, and if you haven't got any cuts you aren't paying
anything for them.

Assert and retract are not like that.  There is a substantial
overhead ( my guesstimate in C Prolog would be about 2-5% of the main
loop costs ) in protecting the interpreter ( or the compiled code's
state ) against the horrible things that assert and retract can do to
predicates that happen to be running.

1.   The local stack is reclaimed on determinate exit.  If you have
     TRO, you can reclaim a little bit earlier than that.  There is
     a lookahead technique which can spot determinacy much earlier:
     you label each clause with the principal functor of its first
     argument ( this might be an integer, or NIL for a variable ),
     and after you have found a clause that matches, you scan along
     the clause list for the next clause with the right first functor,
     and record that as the alternative instead of the next clause.
     This gives you the space saving benefits that clause indexing
     provides in the Dec-10 compiler, but not the time saving.

     However: if you are allowed to retract clauses in running
     predicates, you cannot use this technique.  The clause you
     noted as the alternative might be retract before you get there!
     To get around this, whenever we retracted a clause we would have
     to scan the entire local stack seeing whether this clause was in
     there as an alternative, and doing something clever if it was.

     Asserting also does nasty things.  The clause you assert might
     now be a valid alternative for a clause that has already
     proceeded on the assumption that it had no alternatives
     ( E.g. if it had had a cut the cut would have done the wrong
     thing ).  Micro-PROLOG has an even worse form of this problem,
     as it allows you to insert clauses in the middle of predicates,
     so that a clause which had an alternative might find the new
     clause popping up between itself and what it thought was its
     alternative.

     Thus the penalty you pay for allowing asserts and retracts on
     predicates that might be running is that determinacy is not
     spotted as soon as it might be, so that more local stack space
     is used than strictly necessary, so that backtracking has a bit
     more work to do detecting the hard way determinacy that should
     have been spotted earlier, and TRO is considerably less
     effective. { C Prolog currently lacks TRO.  It could be added,
     but I would prefer to waituntil this problem is solved. }

2.   What gets really nasty is when you retract a clause you happen
     to be running.  In a structure sharing system, this means Every
     retract, because when you do the pattern match that identifies
     the clause to be retracted you are almost certain to bind
     variables to molecules referencing skeletons in the retracted
     clause.  You simply Cannot afford to have a skeleton disappear
     from under you.  The solution, in both DEC-10 and C Prolog, is
     that when you use a clause for the first time, a pointer to it
     is put on the trail, and the clause is flagged as in use.  When
     you backtrack past this point, you can be sure that the clause
     is no longer in use, and only then can you reclaim the space.
     This means that in

                increment(Counter) :-
                        retract(counter(Counter,M)),
                        succ(M, N),
                        assert(counter(Counter,N)).

     the retracted clause is Still there in the data base, tying up
     space, until the "increment" call Fails.  This finally happens
     at the top level, because the top level tends to look something
     like


                repeat, read(X), answer(X), fail.
        answer(end_of_file) :- halt.

        answer(X) :- call(X).     %  with stuff for printing answers


     All the space you retracted is reclaimed at This fail, if no
     sooner.  This has puzzled a lot of people whose programs have
     run out of space when the total amount of space they were
     interested in hanging onto was really quite small.  ( The
     DEC-10 garbage collector collects garbage in the stacks, not in
     the heap. )

     So the possibility of a retracted clause still being in use
     means that more trail space is tied up ( most procedure calls
     end up putting an entry for the current clause on it, though of
     course in a highly recursive program each clause will appear on
     the trail at most once ), and that failing is more expensive
     because the code that resets the trail cannot trust every entry
     in the trail to be a pointer to a variable.  In C, instead of

                for (t = oldtr; t != tr; *(*t++) = NULL) ;

     you have to have

                for (t = oldtr; t != tr; )
                    if (is_a_variable(*t)) *(*t++) = NULL;
                    else maybe_reclaim_clause(*t++);

     [ This is Not taken from C Prolog ]

     A structure copying system can avoid some of this hassle.
     When you pattern match against the clause you are retracting,
     you no longer end up with pointers into that clause.  ( Even
     this isn't true in a good structure copying system which
     shares ground subterms. ) However, you still have to protect
     against retracting a clause which is running.  E.g.,

                p :-
                        clause(p,Body,Ref),
                        erase(Ref),
                        fail.
                p :-
                        normal_p.

     So you still have to mark and trail those clauses.  Not only can
     you not reclaim the space of such clauses, you can't remove them
     from the clause chain.  retract, for example, has to be able to
     backtrack through the entire clause chain, and the clause chain
     it sees has to remain intact even in the presence of other
     procedures retracting from the same predicate.  E.g.

                p :-
                        clause(p,Body,Ref),
                        erase(Ref),
                        q.
                p :-
                        write('Gotcha!'), nl.
                p :-
                        write('Hey, it worked!'), nl.

                q :-
                        clause(p,Body,Ref),
                        erase(Ref),
                        fail.

     The question ?- p is supposed to throw away the first two
     clauses and print "Hey, it worked!".

     So a clause which is erased but in use has to remain in the
     clause chain.  This means that the interpreter has to check
     Every clause it finds to see if it has been erased.  ( In
     compiled code you could maybe smash one of the instructions
     to be FAIL.  But you would still be running some of the code
     of erased clauses. )

The moral is that even if you don't use assert and retract at all,
you are paying a quite a high price for the possibility.
Version 1.2D.EDAI of C Prolog attains 1000 LIPS on a VAX 750 ( yes,
it is in C and C only ), my estimate is that another 5% speed increase
could be guaranteed if it were not for assert and retract.  ( Mind
you, I think that with 300 lines of assembler code I could get
another 20%, but that is another story. )  There would also be a
larger saving of stack space ( just how large depends on how
determinate your programs are ) and implementing TRO would be easier.


The question is, what can we do about it? There is some reason
for hope:


1.  Well-written programs don't change running predicates.  They may
    change tables which a structure-sharing system thinks of as
    "running", but having a special mechanism for changeable tables
     seems acceptable to me.

2.  There are other reasons for prohibiting changes to predicates.
    Most compilers ( DEC-10, POPLOG, Micro-Prolog ) are unable to
    change compiled predicates except by replacing them completely.
    The Prolog-X ( ZIP ) system can handle it, but at the price of
    only compiling clauses, and not compiling predicates as such.

3.  Throwing a predicate completely away does Not entail all these
    overheads.  We do have to detect which clauses are running, but
    if we forcibly reset their alternative pointer to NIL that makes
    as much sense as anything.

4.  Using the "recorded" data base in C Prolog entails some of these
    overheads but not all.  This is really 1. again.

5.  There are some good indexing methods for handling tables based on
    dynamic hashing.  See papers by John W. Lloyd et al from the
    University of Melbourne.  He has also done some work on indexing
    unit clauses with variables.  However, there isn't much point in
    applying these methods to " program-like " predicates.

I am very much a fan of DEC-10 Prolog, but its authors would agree
that unsurpassed though it is, it is not the last word in logic
programming. In particular, the assert / retract / records /
recorded / instance / erase stuff was never really designed, just
sort of grown, and there is no Prolog specification in existence
which says what it should do.

The data base stuff was debugged by eliminating surprises as they
showed up, this of course depends on what surprises whom as the
recent controversy about negation has shown.  ( It seems to boil
down to a question of what the scope rules are, rather than anything
substantial about negation. )  We can and Must find a cleaner way of
specifying changes to the data base.  I am so saturated in the way
things currently work that I can't think of anything much better.

If anybody out there has any good ideas, please broadcast them,
even if they're only half-baked.  There are N new Prolog
implementations, and too many of them have made the data base stuff
More complicated and Less logical.  What we need is something
Simpler that what DEC-10 and C Prolog provide.  ( E.g. no
data-base-references please. )

OKeefe.R.A.@EDXA@sri-unix.UUCP (09/30/83)

From:  Richard.HPS ( on ERCC DEC-10 ) <OKeefe.R.A.@EDXA> )

You may already have seen my article in SigPlan where I say how
horrible it is to use assert & retract and you really shouldn't do it.
[ Unless the program is explicitly supposed to maintain a data base,
in which case you have no real alternative.  That's my excuse anyway,
when people point the finger at records / recorded / erase in my
code. ] I thought it might be generally interesting if I pointed out
another pernicious effect of asserts and retracts on clauses.

It is a general principal of language design that if you don't
use a feature you shouldn't have to pay for it.  Arithmetic
expressions are like that; C Prolog version 1.2D.EDAI has 'succ' 
/ 2, and if you want to use a predicate that has some pretence of
reversibility, you can use succ / 2 instead of is / 2, and all you
pay is the code space for the expression evaluator.  So is the cut:
the information that '!' needs to have kept for it to work has to
be kept anyway, and if you haven't got any cuts you aren't paying
anything for them.

Assert and retract are not like that.  There is a substantial
overhead ( my guesstimate in C Prolog would be about 2-5% of the main
loop costs ) in protecting the interpreter ( or the compiled code's
state ) against the horrible things that assert and retract can do to
predicates that happen to be running.

1.   The local stack is reclaimed on determinate exit.  If you have
     TRO, you can reclaim a little bit earlier than that.  There is
     a lookahead technique which can spot determinacy much earlier:
     you label each clause with the principal functor of its first
     argument ( this might be an integer, or NIL for a variable ),
     and after you have found a clause that matches, you scan along
     the clause list for the next clause with the right first functor,
     and record that as the alternative instead of the next clause.
     This gives you the space saving benefits that clause indexing
     provides in the Dec-10 compiler, but not the time saving.

     However: if you are allowed to retract clauses in running
     predicates, you cannot use this technique.  The clause you
     noted as the alternative might be retract before you get there!
     To get around this, whenever we retracted a clause we would have
     to scan the entire local stack seeing whether this clause was in
     there as an alternative, and doing something clever if it was.

     Asserting also does nasty things.  The clause you assert might

***Sender closed connection***

=== Network Mail from host su-score on Fri Sep 23 20:16:32  ===