OKeefe.R.A.%EDXA@sri-unix.UUCP (09/30/83)
From: Richard.HPS ( on ERCC DEC-10 ) <OKeefe.R.A.@EDXA> ) You may already have seen my article in SigPlan where I say how horrible it is to use assert & retract and you really shouldn't do it. [ Unless the program is explicitly supposed to maintain a data base, in which case you have no real alternative. That's my excuse anyway, when people point the finger at records / recorded / erase in my code. ] I thought it might be generally interesting if I pointed out another pernicious effect of asserts and retracts on clauses. It is a general principal of language design that if you don't use a feature you shouldn't have to pay for it. Arithmetic expressions are like that; C Prolog version 1.2D.EDAI has 'succ' / 2, and if you want to use a predicate that has some pretence of reversibility, you can use succ / 2 instead of is / 2, and all you pay is the code space for the expression evaluator. So is the cut: the information that '!' needs to have kept for it to work has to be kept anyway, and if you haven't got any cuts you aren't paying anything for them. Assert and retract are not like that. There is a substantial overhead ( my guesstimate in C Prolog would be about 2-5% of the main loop costs ) in protecting the interpreter ( or the compiled code's state ) against the horrible things that assert and retract can do to predicates that happen to be running. 1. The local stack is reclaimed on determinate exit. If you have TRO, you can reclaim a little bit earlier than that. There is a lookahead technique which can spot determinacy much earlier: you label each clause with the principal functor of its first argument ( this might be an integer, or NIL for a variable ), and after you have found a clause that matches, you scan along the clause list for the next clause with the right first functor, and record that as the alternative instead of the next clause. This gives you the space saving benefits that clause indexing provides in the Dec-10 compiler, but not the time saving. However: if you are allowed to retract clauses in running predicates, you cannot use this technique. The clause you noted as the alternative might be retract before you get there! To get around this, whenever we retracted a clause we would have to scan the entire local stack seeing whether this clause was in there as an alternative, and doing something clever if it was. Asserting also does nasty things. The clause you assert might now be a valid alternative for a clause that has already proceeded on the assumption that it had no alternatives ( E.g. if it had had a cut the cut would have done the wrong thing ). Micro-PROLOG has an even worse form of this problem, as it allows you to insert clauses in the middle of predicates, so that a clause which had an alternative might find the new clause popping up between itself and what it thought was its alternative. Thus the penalty you pay for allowing asserts and retracts on predicates that might be running is that determinacy is not spotted as soon as it might be, so that more local stack space is used than strictly necessary, so that backtracking has a bit more work to do detecting the hard way determinacy that should have been spotted earlier, and TRO is considerably less effective. { C Prolog currently lacks TRO. It could be added, but I would prefer to waituntil this problem is solved. } 2. What gets really nasty is when you retract a clause you happen to be running. In a structure sharing system, this means Every retract, because when you do the pattern match that identifies the clause to be retracted you are almost certain to bind variables to molecules referencing skeletons in the retracted clause. You simply Cannot afford to have a skeleton disappear from under you. The solution, in both DEC-10 and C Prolog, is that when you use a clause for the first time, a pointer to it is put on the trail, and the clause is flagged as in use. When you backtrack past this point, you can be sure that the clause is no longer in use, and only then can you reclaim the space. This means that in increment(Counter) :- retract(counter(Counter,M)), succ(M, N), assert(counter(Counter,N)). the retracted clause is Still there in the data base, tying up space, until the "increment" call Fails. This finally happens at the top level, because the top level tends to look something like repeat, read(X), answer(X), fail. answer(end_of_file) :- halt. answer(X) :- call(X). % with stuff for printing answers All the space you retracted is reclaimed at This fail, if no sooner. This has puzzled a lot of people whose programs have run out of space when the total amount of space they were interested in hanging onto was really quite small. ( The DEC-10 garbage collector collects garbage in the stacks, not in the heap. ) So the possibility of a retracted clause still being in use means that more trail space is tied up ( most procedure calls end up putting an entry for the current clause on it, though of course in a highly recursive program each clause will appear on the trail at most once ), and that failing is more expensive because the code that resets the trail cannot trust every entry in the trail to be a pointer to a variable. In C, instead of for (t = oldtr; t != tr; *(*t++) = NULL) ; you have to have for (t = oldtr; t != tr; ) if (is_a_variable(*t)) *(*t++) = NULL; else maybe_reclaim_clause(*t++); [ This is Not taken from C Prolog ] A structure copying system can avoid some of this hassle. When you pattern match against the clause you are retracting, you no longer end up with pointers into that clause. ( Even this isn't true in a good structure copying system which shares ground subterms. ) However, you still have to protect against retracting a clause which is running. E.g., p :- clause(p,Body,Ref), erase(Ref), fail. p :- normal_p. So you still have to mark and trail those clauses. Not only can you not reclaim the space of such clauses, you can't remove them from the clause chain. retract, for example, has to be able to backtrack through the entire clause chain, and the clause chain it sees has to remain intact even in the presence of other procedures retracting from the same predicate. E.g. p :- clause(p,Body,Ref), erase(Ref), q. p :- write('Gotcha!'), nl. p :- write('Hey, it worked!'), nl. q :- clause(p,Body,Ref), erase(Ref), fail. The question ?- p is supposed to throw away the first two clauses and print "Hey, it worked!". So a clause which is erased but in use has to remain in the clause chain. This means that the interpreter has to check Every clause it finds to see if it has been erased. ( In compiled code you could maybe smash one of the instructions to be FAIL. But you would still be running some of the code of erased clauses. ) The moral is that even if you don't use assert and retract at all, you are paying a quite a high price for the possibility. Version 1.2D.EDAI of C Prolog attains 1000 LIPS on a VAX 750 ( yes, it is in C and C only ), my estimate is that another 5% speed increase could be guaranteed if it were not for assert and retract. ( Mind you, I think that with 300 lines of assembler code I could get another 20%, but that is another story. ) There would also be a larger saving of stack space ( just how large depends on how determinate your programs are ) and implementing TRO would be easier. The question is, what can we do about it? There is some reason for hope: 1. Well-written programs don't change running predicates. They may change tables which a structure-sharing system thinks of as "running", but having a special mechanism for changeable tables seems acceptable to me. 2. There are other reasons for prohibiting changes to predicates. Most compilers ( DEC-10, POPLOG, Micro-Prolog ) are unable to change compiled predicates except by replacing them completely. The Prolog-X ( ZIP ) system can handle it, but at the price of only compiling clauses, and not compiling predicates as such. 3. Throwing a predicate completely away does Not entail all these overheads. We do have to detect which clauses are running, but if we forcibly reset their alternative pointer to NIL that makes as much sense as anything. 4. Using the "recorded" data base in C Prolog entails some of these overheads but not all. This is really 1. again. 5. There are some good indexing methods for handling tables based on dynamic hashing. See papers by John W. Lloyd et al from the University of Melbourne. He has also done some work on indexing unit clauses with variables. However, there isn't much point in applying these methods to " program-like " predicates. I am very much a fan of DEC-10 Prolog, but its authors would agree that unsurpassed though it is, it is not the last word in logic programming. In particular, the assert / retract / records / recorded / instance / erase stuff was never really designed, just sort of grown, and there is no Prolog specification in existence which says what it should do. The data base stuff was debugged by eliminating surprises as they showed up, this of course depends on what surprises whom as the recent controversy about negation has shown. ( It seems to boil down to a question of what the scope rules are, rather than anything substantial about negation. ) We can and Must find a cleaner way of specifying changes to the data base. I am so saturated in the way things currently work that I can't think of anything much better. If anybody out there has any good ideas, please broadcast them, even if they're only half-baked. There are N new Prolog implementations, and too many of them have made the data base stuff More complicated and Less logical. What we need is something Simpler that what DEC-10 and C Prolog provide. ( E.g. no data-base-references please. )
OKeefe.R.A.@EDXA@sri-unix.UUCP (09/30/83)
From: Richard.HPS ( on ERCC DEC-10 ) <OKeefe.R.A.@EDXA> ) You may already have seen my article in SigPlan where I say how horrible it is to use assert & retract and you really shouldn't do it. [ Unless the program is explicitly supposed to maintain a data base, in which case you have no real alternative. That's my excuse anyway, when people point the finger at records / recorded / erase in my code. ] I thought it might be generally interesting if I pointed out another pernicious effect of asserts and retracts on clauses. It is a general principal of language design that if you don't use a feature you shouldn't have to pay for it. Arithmetic expressions are like that; C Prolog version 1.2D.EDAI has 'succ' / 2, and if you want to use a predicate that has some pretence of reversibility, you can use succ / 2 instead of is / 2, and all you pay is the code space for the expression evaluator. So is the cut: the information that '!' needs to have kept for it to work has to be kept anyway, and if you haven't got any cuts you aren't paying anything for them. Assert and retract are not like that. There is a substantial overhead ( my guesstimate in C Prolog would be about 2-5% of the main loop costs ) in protecting the interpreter ( or the compiled code's state ) against the horrible things that assert and retract can do to predicates that happen to be running. 1. The local stack is reclaimed on determinate exit. If you have TRO, you can reclaim a little bit earlier than that. There is a lookahead technique which can spot determinacy much earlier: you label each clause with the principal functor of its first argument ( this might be an integer, or NIL for a variable ), and after you have found a clause that matches, you scan along the clause list for the next clause with the right first functor, and record that as the alternative instead of the next clause. This gives you the space saving benefits that clause indexing provides in the Dec-10 compiler, but not the time saving. However: if you are allowed to retract clauses in running predicates, you cannot use this technique. The clause you noted as the alternative might be retract before you get there! To get around this, whenever we retracted a clause we would have to scan the entire local stack seeing whether this clause was in there as an alternative, and doing something clever if it was. Asserting also does nasty things. The clause you assert might ***Sender closed connection*** === Network Mail from host su-score on Fri Sep 23 20:16:32 ===