[net.lang.prolog] "assert" considered harmful?

dave@lsuc.UUCP (David Sherman) (06/01/86)

Saumya Debray mentioned recently on the net (in <126@sbcs.UUCP>)
that good Prolog programmers don't make much use of assert and retract.
Although my exposure to Prolog has been limited, I've always felt that
somehow this must be true - assert and retract start mucking with the
very predicates that Prolog's trying to use. I can sort of imagine
the Dijkstras of the Prolog world intoning "Assert Considered Harmful"
and explaining why, like GOTO in conventional programming languages,
assert and retract really shouldn't be used much.

But now I wonder. I'm developing this Canadian income tax planning
system. I find that on even a simple set of facts it has to do
several thousand predicate calls (matches, logical inferences, whatever
you call them), and I'm nowhere near done implementing all the rules
I want to put in. When I look at the logic, I find it's doing the same
analysis over and over for certain legal conclusions that are really
"facts" for other rules to deal with. For example:
	related(Taxpayer1, Taxpayer2) :-
		tptype(Taxpayer2, corporation),
		controls(Taxpayer1, Taxpayer2).

Now, "controls" can be viewed as a fact when considering whether
T1 and T2 are related, but actually it's a predicate that takes a
whole lot of analysis (in its simplest incarnation, it looks for all
the outstanding common shares in T2, looks for the owners of those
shares to match T1, totals up the two numbers and checks to see if
T1's shares exceed 50% of the total).
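
A minimal sketch of that simplest incarnation, for concreteness. The
shares/3 facts, the taxpayer names, and the helper sum_shares/2 are
assumptions made up for illustration, and both arguments of controls/2
are assumed to be bound when it is called:

```prolog
% Hypothetical facts: shares(Owner, Corporation, NumberOfCommonShares).
shares(alice, acme, 60).
shares(bob,   acme, 25).
shares(carol, acme, 15).

tptype(acme, corporation).

% sum_shares(+Numbers, -Total): add up a list of share counts.
sum_shares([], 0).
sum_shares([N|Ns], Total) :-
    sum_shares(Ns, Rest),
    Total is Rest + N.

% controls(+T1, +T2): T1 holds more than 50% of T2's outstanding shares.
controls(T1, T2) :-
    findall(N, shares(_, T2, N), All),    % every outstanding holding in T2
    sum_shares(All, Total),
    findall(N, shares(T1, T2, N), Own),   % the holdings that belong to T1
    sum_shares(Own, Held),
    Held * 2 > Total.

related(Taxpayer1, Taxpayer2) :-
    tptype(Taxpayer2, corporation),
    controls(Taxpayer1, Taxpayer2).
```

Every call to related/2 repeats both findall/3 passes over the share
facts, which is the repeated analysis described above.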

Once I've determined that T1 controls T2, should I "asserta" that
as a fact, so it no longer needs to take much time? And having
done so, do I then "asserta" the fact that they are related? Many
of the rules which I'm implementing have an initial test of relatedness
or control, and obviously the analysis will be much more efficient if
the program can decide almost instantly whether to take a particular
analysis path or not.

There's a further complication, too. Most of the rules need to know
whether a given pair of taxpayers are related *at a particular point
in time*. So if I start using assert, I can imagine that I'll have to
run a set of asserts for each relevant time period during the several
transactions which the system would be analysing (since control will
change due to the transactions in corporate reorganizations, for example).

Comments?

Dave Sherman
The Law Society of Upper Canada
Toronto
416 947 3466
-- 
{ ihnp4!utzoo  pesnta  utcs  hcr  decvax!utcsri  } !lsuc!dave

lamy@utai.UUCP (Jean-Francois Lamy) (06/01/86)

In article <1229@lsuc.UUCP> dave@lsuc.UUCP (David Sherman) writes:
>[...] When I look at the logic, I find it's doing the same
>analysis over and over for certain legal conclusions that are really
>"facts" for other rules to deal with. For example:
>	related(Taxpayer1, Taxpayer2) :-
>		tptype(Taxpayer2, corporation),
>		controls(Taxpayer1, Taxpayer2).
>
>Now, "controls" can be viewed as a fact when considering whether
>T1 and T2 are related, but actually it's a predicate that takes a
>whole lot of analysis (in its simplest incarnation, it looks for all

Using assert and retract as a caching mechanism for inferences has
far-reaching implications.  What you really want to say is: in this fiscal
year, A controls B, but you are saying it with a predicate (assert) that
really means "It is a theorem that A controls B".

Under the logical interpretation, what you have asserted in one execution
should be present in the next execution of your program.  This would break
under your use of 'assert', because what is true in one fiscal year may not be
true in the next.

You may know that all information is related to only one fiscal year and,
under that assumption, you may convince yourself that no undesirable inference
will occur because of your extra assertions.  But I consider this to be
'programming' if the assumptions made (about time, say) are not or cannot be
put as axioms in the knowledge base. Furthermore, your reasoning probably
requires knowledge of the underlying implementation of 'assert'.

Happy new June!
-- 

Jean-Francois Lamy	        CSNet: lamy@ai.toronto.edu
Department of Computer Science 	EAN:   lamy@ai.toronto.cdn
University of Toronto          	ARPA:  lamy%ai.toronto.edu@csnet-relay
Toronto, Ontario	       	UUCP:  lamy@utai.uucp
M5S 1A4                        	       {ihnp4,decvax,decwrl}!utcsri!utai!lamy

tim@druhi.UUCP (MorrisseyTJ) (06/03/86)

In article <1229@lsuc.UUCP>, dave@lsuc.UUCP writes:
> Saumya Debray mentioned recently on the net (in <126@sbcs.UUCP>)
> that good Prolog programmers don't make much use of assert and retract.
> [text deleted]
>                   When I look at the logic, I find it's doing the same
> analysis over and over for certain legal conclusions that are really
> "facts" for other rules to deal with.
> [text deleted]
> Once I've determined that T1 controls T2, should I "asserta" that
> as a fact, so it no longer needs to take much time?
> [text deleted]

Is this an example of lemmas?

I would like to believe that a formal mechanism for managing and
applying previously proved goals could significantly improve the speed
of large programs.  However, I do see many issues like:
	- knowing when the lemmas are no longer valid
	- making the time cost cheap enough to cause overall speedup
	- keeping the space cost "low enough"
	- knowing when *not* to store lemmas (I/O, external conditions)
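
The first issue in the list above (knowing when lemmas are no longer
valid) can be sketched as follows. This is only an illustration: the
predicates majority/2, lemma/1 and record_transfer/2 are hypothetical
names, majority/2 stands in for an expensive analysis, and controls/2
is assumed to be called with both arguments bound.

```prolog
:- dynamic majority/2, lemma/1.

% Stand-in for the expensive analysis (hypothetical base fact).
majority(alice, acme).

% A stored lemma answers the call directly; otherwise prove the goal
% and remember it.
controls(T1, T2) :- lemma(controls(T1, T2)), !.
controls(T1, T2) :-
    majority(T1, T2),
    assertz(lemma(controls(T1, T2))).

% Whenever the base facts about Corp change, discard every lemma that
% was derived from them, so stale conclusions cannot be reused.
record_transfer(Corp, NewHolder) :-
    retractall(majority(_, Corp)),
    assertz(majority(NewHolder, Corp)),
    retractall(lemma(controls(_, Corp))).
```

The cost of this approach is that every update to the base facts has
to know which lemmas depend on it.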

---------------

Although it is easy to misuse assert and retract, I think Prolog is very
valuable as a database language.  Databases for practical applications
can easily have dozens of relations and change very often.

Something I find sorely missing from Prolog is a uniform semantics for
side-effects.  It seems "ugly" that applications and even Prolog support code
use predicates like:

	set_xxx(Value)
	get_xxx(Value)

		or

	add_xxx(Key,Value)
	find_xxx(Key,Value)
	del_xxx(Key)


Just some food for thought.


Tim Morrissey

debray@sbcs.UUCP (Saumya Debray) (06/04/86)

> When I look at the logic [of an income tax planning system], I find
> it's doing the same analysis over and over for certain legal conclusions
> that are really "facts" for other rules to deal with. 
	...
> Once I've determined that T1 controls T2, should I "asserta" that
> as a fact, so it no longer needs to take much time? 

As Jean-Francois Lamy mentioned, "assert" is overkill if all one
wants is the ability to remember what's already been proved.  One of
the features in the Prolog system we're developing at Stony Brook is the
ability to declare that certain predicates should have "extension tables"
or "recall tables" maintained.  This is basically a table where each entry
is of the form < Call, [Return_1, ..., Return_k] >.  Any call to that
predicate is first looked up in the table: if a return value is already
present, the call can return immediately with the appropriate answer without
having to recompute it; otherwise, the call is made, and if/when it returns,
this <call, return> pair is entered in the table for later use.  The idea
is similar to that of "memo functions" (though the implementation is
quite a bit more complex).  While the implementation of the extension table
facility uses assert, this is of no concern to the programmer, who can
continue to write pure code.
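
A rough user-level sketch of the table lookup described above. It is
not the Stony Brook implementation: the names lookup_or_call/1,
table_entry/2, expensive_controls/2, majority/2 and evaluations/1 are
invented for illustration, and the variant test =@=/2 and member/2 are
assumed to be available as in SWI-Prolog. It does not handle recursive
calls to a predicate whose table is still being filled.

```prolog
:- dynamic table_entry/2, evaluations/1.

evaluations(0).
majority(alice, acme).

% Stand-in for an expensive predicate; counts how often it really runs.
expensive_controls(T1, T2) :-
    retract(evaluations(N)),
    N1 is N + 1,
    assertz(evaluations(N1)),
    majority(T1, T2).

% lookup_or_call(Goal): each entry is table_entry(Call, Returns).
% If a variant of this call is already in the table, reuse its answers;
% otherwise compute all answers once and enter the pair in the table.
lookup_or_call(Goal) :-
    copy_term(Goal, Call),
    (   table_entry(Stored, Returns),
        Stored =@= Call
    ->  true
    ;   findall(Goal, Goal, Returns),
        assertz(table_entry(Call, Returns))
    ),
    member(Goal, Returns).
```

A call that has no answers is stored with an empty return list, so a
repeated failing call also avoids recomputation.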


-- 
Saumya Debray
SUNY at Stony Brook

	uucp: {allegra, philabs, ogcvax} !sbcs!debray
	arpa: debray%suny-sb.csnet@csnet-relay.arpa
	CSNet: debray@sbcs.csnet

andrews@ubc-cs.UUCP (Jamie Andrews) (06/06/86)

     I would say that if your only interest in Prolog is as a programming
language with neat features like backtracking and logic variables, then
by all means go ahead and use "assert" and "retract".  Just be aware of
how they affect the control flow in the program.

     It should be of little concern to applications programmers that
these features destroy the declarative reading of the program, unless they
have to prove to their bosses that their applications are rigorously
correct.  However, there are ways of implementing global variables and
assert-and-retract-like behaviour in a completely declarative setting;
see Shapiro's papers about concurrent Prolog and message-passing.  (The
object-oriented approach Shapiro advocates also explains I/O nicely.)

--Jamie.
...!seismo!ubc-vision!ubc-cs!andrews
"I believe in Santa Claus, and the DoD believes in Ada" -D.Parnas

dave@lsuc.UUCP (06/12/86)

(I must say this interaction with knowledgeable people on the net, from
the point of view of someone merely trying to design an application, is
fascinating. Thanks to everyone who's contributed so far.)

In article <269@ubc-cs.UUCP> andrews@ubc-cs.UUCP (Jamie Andrews) writes:
>
>     It should be of little concern to applications programmers that
>these features destroy the declarative reading of the program, unless they
>have to prove to their bosses that their applications are rigorously
>correct. 

Interesting issue. What happens if I succeed in designing a tool
which can be used for corporate tax planning, and I want to make
it a commercial product? Should it be "provably" correct before
lawyers use it?

What I've done with the problem, incidentally, is do some
initial "setup" work which analyses all of the relationships
stated in the facts and asserts the things it determines to
be true. So, for example, I have the definition of control
(>50% of voting power, etc.) in a predicate called controls_rule,
and the predicate "controls" either exists as a stated fact (set
up by the user) or is asserted at setup time. Thereafter all tests
refer simply to "controls". Same thing for "related_rule" and
"related", which incidentally solves the thorny problem of
circularity caused by definitions which define related
taxpayers in terms of other related taxpayers - I simply call
the setup routine enough times to make sure all derivative
relationships are asserted. (Hope that isn't too obscure.)
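
For readers trying to picture the setup pass, here is a simplified
sketch of the idea. The facts and the second related_rule clause are
made-up placeholders rather than actual tax rules, and controls/2 is
shown only as stated facts. Instead of calling the routine a fixed
number of times, this version repeats until no new relationship can be
derived:

```prolog
:- dynamic related/2.

% Hypothetical stated facts.
controls(alice, acme).
controls(acme, subco).

% related_rule/2 is defined partly in terms of the asserted related/2,
% which is where the circularity comes from.
related_rule(A, B) :- controls(A, B).
related_rule(A, C) :- related(A, B), controls(B, C).

% setup: assert one newly derivable relationship and try again; stop
% when every derivable relationship is already asserted.
setup :-
    (   related_rule(A, B),
        \+ related(A, B)
    ->  assertz(related(A, B)),
        setup
    ;   true
    ).
```

After setup, related/2 holds for alice and acme, acme and subco, and
(derived through the second clause) alice and subco.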

Of course, the thing will get a lot more difficult when I
properly implement time, which is crucial to the system. But
I think it's still reasonable to determine what each time
interval is that's relevant, and make the assertions relative
to those time intervals.

As long as I limit my assertions by restricting the time interval,
then I don't really have to worry about the difference between
"assert" and "assume" which was mentioned by Anand Rao, do I?

Dave Sherman
The Law Society of Upper Canada
Toronto
-- 
{ ihnp4!utzoo  pesnta  utcs  hcr  decvax!utcsri  } !lsuc!dave

bobd@zaphod.UUCP (Bob Dalgleish) (06/21/86)

In the suggested application, this is the way that I would use assert:

controls(T1, T2) :- known_controls(T1, T2).
controls(T1, T2) :- figure_out_controlling_interests(T1,T2),
		    assert(known_controls(T1,T2)).

You now have a "program" portion (controls), and a data caching portion
(known_controls) which are separated.  Certainly, we need to separate
the two issues of program purity (required to support compilation), and
academic purity (required to support ivory towers).
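
One caveat with the two-clause form: if a later goal fails and Prolog
backtracks into controls after a cache hit, the second clause runs the
analysis again and asserts a duplicate fact. A hedged variant that
avoids this, assuming controls is always called with both arguments
bound (the counter analyses/1 and the stub analysis are illustration
only):

```prolog
:- dynamic known_controls/2, analyses/1.

analyses(0).

% Stub for the real analysis; it counts how often it is run.
figure_out_controlling_interests(alice, acme) :-
    retract(analyses(N)),
    N1 is N + 1,
    assertz(analyses(N1)).

% Commit to the cached answer when there is one, so that backtracking
% cannot fall through to the expensive branch.
controls(T1, T2) :-
    (   known_controls(T1, T2)
    ->  true
    ;   figure_out_controlling_interests(T1, T2),
        assertz(known_controls(T1, T2))
    ).
```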

Tax-planning is a very good application for an expert system, and using
standard computational science methods to make the implementation viable
is in all of our best interests.

Adding time into the database should not be that difficult, since it is
expressed as:

known_controls(Time,Taxpayer1,Taxpayer2) ...

When the time is unknown or irrelevant for the period of interest,
express it as a construct that matches all time (i.e., a variable).
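
A small illustration, with made-up taxpayers and years standing in for
whatever representation of time is chosen:

```prolog
% Control known to hold at one specific time.
known_controls(1985, alice, acme).

% Time unknown or irrelevant: the anonymous variable matches any time.
known_controls(_, bob, subco).

% ?- known_controls(1986, bob, subco).    succeeds
% ?- known_controls(1986, alice, acme).   fails: only 1985 is recorded
```
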
-- 
[Forgive me, Father, for I have signed ...]
Bob Dalgleish		...ihnp4!{alberta!}sask!zaphod!bobd
(My mother has disclaimed any knowledge of me)

dave@lsuc.UUCP (David Sherman) (06/29/86)

In article <561@zaphod.UUCP> bobd@zaphod.UUCP (Bob Dalgleish) writes:
>Adding time into the database should not be that difficult, since it is
>expressed as:
>
>known_controls(Time,Taxpayer1,Taxpayer2) ...
>
>When the time is unknown or irrelevant for the period of interest,
>express it as a construct that matches all time (i.e., a variable).

That's fine for a single statement. But for the definition of the
various rules which apply to corporate reorganizations, we need to
know whether a particular fact is true at a given time, which is unlikely
to coincide with any specific time specified as a fact. Getting this
to work requires something along the lines of Kowalski's "holds"
formalism, as best I can figure out.  (That is, make the statement
holds(fact(..., ...), timeN), and assume that unless it's been terminated,
the fact is true at any time after timeN.)
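
A rough sketch of that reading, not necessarily Kowalski's own
formulation. The predicates terminated/2 and holds_at/2, the facts,
and the use of years as times are all assumptions for illustration:

```prolog
% holds(Fact, Time): Fact became true at Time.
holds(controls(alice, acme), 1980).
holds(controls(alice, acme), 1990).

% terminated(Fact, Time): Fact stopped being true at Time.
terminated(controls(alice, acme), 1984).

% holds_at(+Fact, +T): Fact started at or before T and has not been
% terminated between that start and T.
holds_at(Fact, T) :-
    holds(Fact, Start),
    Start =< T,
    \+ ( terminated(Fact, End),
         Start < End,
         End =< T ).
```

With these facts, control holds at 1982, does not hold at 1985 (after
the 1984 termination), and holds again at 1991 because of the second
holds/2 fact.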

Dave Sherman
The Law Society of Upper Canada
Toronto
-- 
{ ihnp4!utzoo  pesnta  utcs  hcr  decvax!utcsri  } !lsuc!dave