[comp.lang.prolog] Logic of Database Updates, Destructive assignments

voda@ubc-cs (Paul Voda) (03/21/88)

There are two situations in logic programming where we need a good
logical explanation to the in place modification (destructive changes)
of data structures:

     a) change of states
     b) updates to data-bases

The following note discusses the logic of these as well as the solution
adopted in the programming language Trilogy.

The first situation occurs when a data structure encoding a state is
modified by a predicate:

  one_step(Old_state,New_state,....) <- ....

Quite often the state is encoded by a list, but even if it is a functor
or a tree there is a significant amount of copying to be done in order
to create a new state.

It seems to be only natural to create the new state by the in place
modification of the old state. This can be done with reassignable
variables without destroying the declarative semantics.
We have to adopt within the framework of logic programming
the technique Rick Hehner and Tony Hoare have used to explicate
the procedural programs.

I have used this technique
to integrate the procedural with the logic programming in the programming
language Trilogy. As the result
Trilogy does not have a single extra-logical feature.

The above predicate can be written and implemented in Trilogy
as a one place predicate using an input-output variable of type T:

   pred One_step(state :. T) iff ... & state := f(state,...)

This is very fast because the change is done by a destructive assignment
to the input-output variable "state".

At the same time, the declarative reading is not lost because
logically the predicate One_step is just an abreviation for
a two place-predicate

   One_step(state_in, state_out) iff ... & state_out = f(state_in,...)

Note that this technique of explaining modifications of state variables
requires that after the new state has been computed the old
state is not accessible anymore.

This situation is recognized by Trilogy because of the modes attached
to the predicates. Theoretically it might be possible for a clever
Prolog processor (there are no such now and will not be for long time)
to recognize the copying situation within the setting of explicit old-state
and new-state variables. The processor could determine that the old-state
is not needeed anymore after the new-state has been computed and reuse
the space. It is however doubtful whether this can be done in a sufficiently
general setting.

But even assuming that it might be possible it seems that some loss of
readability occurs. The explicit use of old and new state variables doubles
the number of arguments to predicates and, in a presence of two or three such
variables, the resulting predicate is very hard to read. But even without
this, the notation

         a(i) := v

changing the array "a" in place is much more readable then its declarative
reading

         replacearray(a_in,a_out,i,v).

The point I am trying to make here is that from my own experience
(with Trilogy and otherwise) only about
five percent of programmer's time (if at all) is spent on the reasoning
about a program where the explicit old-state new-state variables are easier
to work with. The compact procedural notation of not having to use explicitly
pairs of variables is much easier to work with the remaining
95% of time when the programmer designs and modifies his code.

Now, there are many Prolog systems which permit the in place modification
of data structures generally by means of a hack. The problem with such hacks
is that they do not pay any attention to the logic-destroying situations
such as

   y := x & P(x,y)

where P is defined as follows:

   P(a,b) iff a := c & b := d

If the destructive assignment just sets the variable y to point to x, the
logic is lost because the change to x within P changes y as well. In other
words, P does not have its declarative meaning

   P(a_in,a_out,b_in,b_out) iff a_out = c & b_out = d

Trilogy is careful to preserve the logic in similar situation by copying out the
value of y into the variable x. One has to be also careful with the
alliasing situations as in the call P(x,x) where the single variable
x is modified twice by P. Probably the simplest solution to the aliasing is to
disallow it by the language processor. This is by no means a simple task and
the current version of Trilogy while preserving the logic in the situations
of the first kind achieves the declarative reading only in the absence of
aliasing. By the end of the year we hope to have a fool-proof aliasing checker
built into Trilogy.

The predicates of Trilogy defined with the modifier "pred" as in

   pred P(....) iff ....

correspond to the non-deterministic predicates of Prolog. Any change to an
input-output variable within P must be trailed and undone upon
backtracking. The Trilogy solution of the Triangle puzzle was such.

On the other hand the declaration

   proc P(....) iff ....

specifies a deterministic predicate which can never backtrack and the native
code compiler of Trilogy is able to check this. Consequently, there is
no need to trail input-output variables in the procedures of Trilogy. Note
that it is impossible to do in Prolog.

The propositional theorem prover of Ross Overbeek is an excellent example
of a non-backtracking code which is hopelessly slow without the input-output
variables. I think that this example should convince the logic programmers
that the pure Prolog is constitutionally unable to deal with such
procedural programs. Any attempt to fix the situation by introducing
hacks into Prolog can only make the situation worse.

Trilogy, on the other hand, was built as an integrated logic-procedural
language right from the beginning and I do not see why the Ross's theorem
prover should not execute in two seconds (on an AT) in Trilogy.
I plan to program it in Trilogy as soon as I find time to do this
(and also introduce the bit operations to Trilogy).

By the way, the Prolog version of the program of Ross is an excellent
example of the considerable loss of readability due to the constant need of
using pairs of variables (old_state new_state). Moreover, the Prolog code
needed for the access and update of simulated arrays via the operator 'arg'
is distressingly pitiful.

Another area where updates to data structures occur are the file operations.
A data-base (whether internal, or external) is a predicate.
Any change to the data base (by asserts, retracts,
or by writes) changes a predicate. The technique of Bowen can be of course
used to explicate the situation. This technique can be presented in a quite
nice way by using a kind of dynamic logic as the
contribution of Sanjay Manchanda to this conference demonstrates.

Trilogy does not require the concept of worlds to to allow the
updates to the data bases. The input-output variables are used with the
advantage here.

Consider the external database Flight containing the triples
"flightno,origin,destination". Trilogy has built_in data base operations
and the query "Flight(x,Jfk,Ohare)" lists all New-York to Chicago flights
from and to the above airports. In other words, an external database
behaves as a predicate and can queried and joined as an ordinary predicate.

An update to the database means technically a creation of a new predicate.
One can either use a high order logic, or else use the goedel numbers
of predicates when passing them to other predicates as arguments.

Trilogy is based on the first order theory of pairs (S-expressions) and
uses the goedel-numbers of predicates. Goedel numbers of data-bases can
be of course given by lists since every data base is finite.

In other words a Trilogy data base D of type

   file T

where T is an arbitrary Trilogy type (generally a tuple) can be
explicitly queried as a predicate "D(t)" in a normal predicative format.

On the other hand, it can be passed as an argument to a deterministic procedure
P of Trilogy in the form P(D). This time the Goedel number of D, i.e. a
data structure of type "list T" is passed to the procedure P. A conversion
of a file to a list is used only for the logical explanation of updates.
The Trilogy implementation, of course passes a file handle to the predicate
P instead. The predicate P must be declared as follows:

  proc P(f:. file T) iff ..... { f is an input-output variable}

Note two things:

  1) A file argument f is an input-output variable of type file T
     Consequently the file f can be changed in place by
     using the same techniques as discussed with the state variables.

  2) The predicate P must be defined as a procedure, where the in place
     modifications to the file f are never trailed.

Trilogy supplies a set of file-operations for the processing of files
as arguments.  I mention just three of these here:
"Get(f,r)", Put(f,r), and Rewind(f). Get gets the current record from
the file f into the variable r and advances to the next one. Put puts
the record r into the current position of f and advances past it.
Rewind sets the file to the beginning.

The headings of these are as follows:

    proc Get(f:. file T, r:> T) iff ...  {r is output, i.e. returned}
    proc Put(f:. file T, r:> T) iff ...  {r is input, i.e. given}
    proc Rewind(f:. file T) iff ...

Since the current position in the file f matters, the logical explanation
of file values is actually as pairs of lists:

    (list T,list T)

the first list is the "processed part", whereas the second one, is the
"unprocessed part". The current record (if any) is at the head of the second
list. Thus the pair (p,Nil) represents a file at the EOF position.

The following gives the declarative reading to the above predicates:

   proc Get( (p_in,u_in),(p_out,u_out), r ) iff
     u_in = r,u_out & Append(p_in,(r,Nil),p_out)

   proc Put( (p_in,u_in),(p_out,u_out), r ) iff
     u_out = u_in & Append(p_in,(r,Nil),p_out)

   proc Rewind( (p_in,u_in),(p_out,u_out) ) iff
     p_out = Nil & Append(p_in,u_in,u_out) &

ok@quintus.UUCP (Richard A. O'Keefe) (03/23/88)

In article <1897@ubc-cs.UUCP>, voda@ubc-cs (Paul Voda) writes:
{ criticising Prolog }
> Quite often the state is encoded by a list, but even if it is a functor
> or a tree there is a significant amount of copying to be done in order
> to create a new state.

{ praising Trilogy }
> Trilogy is careful to preserve the logic in similar situation by copying out the
> value of y into the variable x.				   ^^^^^^^

When does Trilogy require copying?  Does it use reference counting,
as SETL used to, or is it a compile time analysis?

> Trilogy does not require the concept of worlds to to allow the
> updates to the data bases. The input-output variables are used with the
> advantage here.

But the Bowen approach permits the manipulation of several worlds at once.
(Something which isn't possible in vanilla Prolog either.)