OKeefe.R.A.%EDXA@sri-unix.UUCP (09/30/83)
From: Richard.HPS ( on ERCC DEC-10 ) <OKeefe.R.A.@EDXA> )
You may already have seen my article in SigPlan where I say how
horrible it is to use assert & retract and you really shouldn't do it.
[ Unless the program is explicitly supposed to maintain a data base,
in which case you have no real alternative. That's my excuse anyway,
when people point the finger at records / recorded / erase in my
code. ] I thought it might be generally interesting if I pointed out
another pernicious effect of asserts and retracts on clauses.
It is a general principal of language design that if you don't
use a feature you shouldn't have to pay for it. Arithmetic
expressions are like that; C Prolog version 1.2D.EDAI has 'succ'
/ 2, and if you want to use a predicate that has some pretence of
reversibility, you can use succ / 2 instead of is / 2, and all you
pay is the code space for the expression evaluator. So is the cut:
the information that '!' needs to have kept for it to work has to
be kept anyway, and if you haven't got any cuts you aren't paying
anything for them.
Assert and retract are not like that. There is a substantial
overhead ( my guesstimate in C Prolog would be about 2-5% of the main
loop costs ) in protecting the interpreter ( or the compiled code's
state ) against the horrible things that assert and retract can do to
predicates that happen to be running.
1. The local stack is reclaimed on determinate exit. If you have
TRO, you can reclaim a little bit earlier than that. There is
a lookahead technique which can spot determinacy much earlier:
you label each clause with the principal functor of its first
argument ( this might be an integer, or NIL for a variable ),
and after you have found a clause that matches, you scan along
the clause list for the next clause with the right first functor,
and record that as the alternative instead of the next clause.
This gives you the space saving benefits that clause indexing
provides in the Dec-10 compiler, but not the time saving.
However: if you are allowed to retract clauses in running
predicates, you cannot use this technique. The clause you
noted as the alternative might be retract before you get there!
To get around this, whenever we retracted a clause we would have
to scan the entire local stack seeing whether this clause was in
there as an alternative, and doing something clever if it was.
Asserting also does nasty things. The clause you assert might
now be a valid alternative for a clause that has already
proceeded on the assumption that it had no alternatives
( E.g. if it had had a cut the cut would have done the wrong
thing ). Micro-PROLOG has an even worse form of this problem,
as it allows you to insert clauses in the middle of predicates,
so that a clause which had an alternative might find the new
clause popping up between itself and what it thought was its
alternative.
Thus the penalty you pay for allowing asserts and retracts on
predicates that might be running is that determinacy is not
spotted as soon as it might be, so that more local stack space
is used than strictly necessary, so that backtracking has a bit
more work to do detecting the hard way determinacy that should
have been spotted earlier, and TRO is considerably less
effective. { C Prolog currently lacks TRO. It could be added,
but I would prefer to waituntil this problem is solved. }
2. What gets really nasty is when you retract a clause you happen
to be running. In a structure sharing system, this means Every
retract, because when you do the pattern match that identifies
the clause to be retracted you are almost certain to bind
variables to molecules referencing skeletons in the retracted
clause. You simply Cannot afford to have a skeleton disappear
from under you. The solution, in both DEC-10 and C Prolog, is
that when you use a clause for the first time, a pointer to it
is put on the trail, and the clause is flagged as in use. When
you backtrack past this point, you can be sure that the clause
is no longer in use, and only then can you reclaim the space.
This means that in
increment(Counter) :-
retract(counter(Counter,M)),
succ(M, N),
assert(counter(Counter,N)).
the retracted clause is Still there in the data base, tying up
space, until the "increment" call Fails. This finally happens
at the top level, because the top level tends to look something
like
repeat, read(X), answer(X), fail.
answer(end_of_file) :- halt.
answer(X) :- call(X). % with stuff for printing answers
All the space you retracted is reclaimed at This fail, if no
sooner. This has puzzled a lot of people whose programs have
run out of space when the total amount of space they were
interested in hanging onto was really quite small. ( The
DEC-10 garbage collector collects garbage in the stacks, not in
the heap. )
So the possibility of a retracted clause still being in use
means that more trail space is tied up ( most procedure calls
end up putting an entry for the current clause on it, though of
course in a highly recursive program each clause will appear on
the trail at most once ), and that failing is more expensive
because the code that resets the trail cannot trust every entry
in the trail to be a pointer to a variable. In C, instead of
for (t = oldtr; t != tr; *(*t++) = NULL) ;
you have to have
for (t = oldtr; t != tr; )
if (is_a_variable(*t)) *(*t++) = NULL;
else maybe_reclaim_clause(*t++);
[ This is Not taken from C Prolog ]
A structure copying system can avoid some of this hassle.
When you pattern match against the clause you are retracting,
you no longer end up with pointers into that clause. ( Even
this isn't true in a good structure copying system which
shares ground subterms. ) However, you still have to protect
against retracting a clause which is running. E.g.,
p :-
clause(p,Body,Ref),
erase(Ref),
fail.
p :-
normal_p.
So you still have to mark and trail those clauses. Not only can
you not reclaim the space of such clauses, you can't remove them
from the clause chain. retract, for example, has to be able to
backtrack through the entire clause chain, and the clause chain
it sees has to remain intact even in the presence of other
procedures retracting from the same predicate. E.g.
p :-
clause(p,Body,Ref),
erase(Ref),
q.
p :-
write('Gotcha!'), nl.
p :-
write('Hey, it worked!'), nl.
q :-
clause(p,Body,Ref),
erase(Ref),
fail.
The question ?- p is supposed to throw away the first two
clauses and print "Hey, it worked!".
So a clause which is erased but in use has to remain in the
clause chain. This means that the interpreter has to check
Every clause it finds to see if it has been erased. ( In
compiled code you could maybe smash one of the instructions
to be FAIL. But you would still be running some of the code
of erased clauses. )
The moral is that even if you don't use assert and retract at all,
you are paying a quite a high price for the possibility.
Version 1.2D.EDAI of C Prolog attains 1000 LIPS on a VAX 750 ( yes,
it is in C and C only ), my estimate is that another 5% speed increase
could be guaranteed if it were not for assert and retract. ( Mind
you, I think that with 300 lines of assembler code I could get
another 20%, but that is another story. ) There would also be a
larger saving of stack space ( just how large depends on how
determinate your programs are ) and implementing TRO would be easier.
The question is, what can we do about it? There is some reason
for hope:
1. Well-written programs don't change running predicates. They may
change tables which a structure-sharing system thinks of as
"running", but having a special mechanism for changeable tables
seems acceptable to me.
2. There are other reasons for prohibiting changes to predicates.
Most compilers ( DEC-10, POPLOG, Micro-Prolog ) are unable to
change compiled predicates except by replacing them completely.
The Prolog-X ( ZIP ) system can handle it, but at the price of
only compiling clauses, and not compiling predicates as such.
3. Throwing a predicate completely away does Not entail all these
overheads. We do have to detect which clauses are running, but
if we forcibly reset their alternative pointer to NIL that makes
as much sense as anything.
4. Using the "recorded" data base in C Prolog entails some of these
overheads but not all. This is really 1. again.
5. There are some good indexing methods for handling tables based on
dynamic hashing. See papers by John W. Lloyd et al from the
University of Melbourne. He has also done some work on indexing
unit clauses with variables. However, there isn't much point in
applying these methods to " program-like " predicates.
I am very much a fan of DEC-10 Prolog, but its authors would agree
that unsurpassed though it is, it is not the last word in logic
programming. In particular, the assert / retract / records /
recorded / instance / erase stuff was never really designed, just
sort of grown, and there is no Prolog specification in existence
which says what it should do.
The data base stuff was debugged by eliminating surprises as they
showed up, this of course depends on what surprises whom as the
recent controversy about negation has shown. ( It seems to boil
down to a question of what the scope rules are, rather than anything
substantial about negation. ) We can and Must find a cleaner way of
specifying changes to the data base. I am so saturated in the way
things currently work that I can't think of anything much better.
If anybody out there has any good ideas, please broadcast them,
even if they're only half-baked. There are N new Prolog
implementations, and too many of them have made the data base stuff
More complicated and Less logical. What we need is something
Simpler that what DEC-10 and C Prolog provide. ( E.g. no
data-base-references please. )OKeefe.R.A.@EDXA@sri-unix.UUCP (09/30/83)
From: Richard.HPS ( on ERCC DEC-10 ) <OKeefe.R.A.@EDXA> )
You may already have seen my article in SigPlan where I say how
horrible it is to use assert & retract and you really shouldn't do it.
[ Unless the program is explicitly supposed to maintain a data base,
in which case you have no real alternative. That's my excuse anyway,
when people point the finger at records / recorded / erase in my
code. ] I thought it might be generally interesting if I pointed out
another pernicious effect of asserts and retracts on clauses.
It is a general principal of language design that if you don't
use a feature you shouldn't have to pay for it. Arithmetic
expressions are like that; C Prolog version 1.2D.EDAI has 'succ'
/ 2, and if you want to use a predicate that has some pretence of
reversibility, you can use succ / 2 instead of is / 2, and all you
pay is the code space for the expression evaluator. So is the cut:
the information that '!' needs to have kept for it to work has to
be kept anyway, and if you haven't got any cuts you aren't paying
anything for them.
Assert and retract are not like that. There is a substantial
overhead ( my guesstimate in C Prolog would be about 2-5% of the main
loop costs ) in protecting the interpreter ( or the compiled code's
state ) against the horrible things that assert and retract can do to
predicates that happen to be running.
1. The local stack is reclaimed on determinate exit. If you have
TRO, you can reclaim a little bit earlier than that. There is
a lookahead technique which can spot determinacy much earlier:
you label each clause with the principal functor of its first
argument ( this might be an integer, or NIL for a variable ),
and after you have found a clause that matches, you scan along
the clause list for the next clause with the right first functor,
and record that as the alternative instead of the next clause.
This gives you the space saving benefits that clause indexing
provides in the Dec-10 compiler, but not the time saving.
However: if you are allowed to retract clauses in running
predicates, you cannot use this technique. The clause you
noted as the alternative might be retract before you get there!
To get around this, whenever we retracted a clause we would have
to scan the entire local stack seeing whether this clause was in
there as an alternative, and doing something clever if it was.
Asserting also does nasty things. The clause you assert might
***Sender closed connection***
=== Network Mail from host su-score on Fri Sep 23 20:16:32 ===