[comp.lang.prolog] PROLOG Digest V5 #72

PROLOG-REQUEST@SUSHI.STANFORD.EDU (Chuck Restivo, The Moderator) (10/11/87)
PROLOG Digest            Monday, 12 Oct 1987       Volume 5 : Issue 72

Today's Topics:
         Query - Algebraic simplification & Prolog Machines,
                Implementation - Asthetics & findall/3
----------------------------------------------------------------------

Date: 2 Oct 87 10:07:36 GMT
From: Eddy Szulcm <cvax!dlt1!szulc@uunet.uu.net>
Subject: Algebraic simplification

I'm looking for an algorythm, preferably in Prolog, for simplifying
an algebraic expression. My first attempts can handle expressions of
the form 2+a+3 => 5+a, but gets lost with 2+a+3+b+4. I know that I
could flatten the trees, sort them and then re-construct them, but I
need something that works for all sorts of operators and that can be
defined by simple rules.

------------------------------

Date: 6 Oct 87 14:48:00 GMT
From: iuvax!silver!bondc@rutgers.edu
Subject: Re: Algebraic simplification algorythm


A-L-G-O-R-I-T-H-M

------------------------------

Date: 6 Oct 87 02:53:14 GMT
From:Mark Thomsen <trwrb!trwspf!thomsen@ucbvax.Berkeley.EDU>
Subject: Looking For Prolog Machines

Help me on a search if you can.

In our laboratory we have considerable interest in running things
quickly, and have developed and bought a number of fast non-general
processors for implementing certain functions or languages quickly.
We are preparing to work with Prolog for some embedded expert and
deductive processing functions, and I am not familiar with what can be
purchased to run Prolog programs fast.  If left in our current state
of ignorance, we will probably buy Modula-Prolog, put it on MC68020's
(where we have an excellent Modula-2 compiler), and do our own speed
enhancements.

What is out there that runs Prolog fast?  I have heard a little about
the ICOT effort on PSI, but have not technical article or description.
I have also seen articles in the past on extracting concurrency,
Concurrent Prolog, and the like -- are there concurrent Prolog
machines?

I would like any references or descriptions on Prolog machines that
you guys know of.  I will submit a summary of received data in a few
weeks.  Thanks!

-- Mark R. Thomsen

------------------------------

Date: 9 Oct 87 13:35:11 
Gcvax!dartvax!tedc@ucbvax.Berkeley.EDU  (Ted Cooley)
Subject: Re: Looking For Prolog Machines

There may also be the Xenologic Processor.  This machine is to be
an add-on to the SUN workstations.  Claims of up to 300Klips are claimed.

Xenologic is in Berkeley, CA.


------------------------------

Date: 9 Oct 87 13:38:05 GMT
From: mit-vax!jouvelot@eddie.mit.edu  (Pierre Jouvelot)
Subject: Re: Looking For Prolog Machines

This followup is from a friend (Akihiko Konagaya)

-- Pierre

It is true that PSI is rather slower. But it's not unfair to compare a
toy hardware and a workstation with a full programming environment.
Processor performance is considerablly diminished by various traps,
which are required for the environment.

As for performance, new PSI achieves 200-250 KLIPS. This is fairly
good because the performance of most useful application programs are
determined by the performance of built-in predicates but "append
program".  Futhermore, we know that 2 Mega LIPS is not so difficult,
if we develop a custom Prolog chip with 20 MHz.

The fastest Prolog machine currently available would be CHI-II
developed at NEC. It achieved 500KLIPS (See Habata etal, "Co-operative
High Performance Sequential Inference Machine" in ICCD '87 New York
for the details).  The full programming environment that includes
multi-window interface, mutiple process environment as well as
ordinary programming tools such as an optimizing compiler, an
interpreter with incremental compiler, would be available at the end
of 1987.

CHI-II provides an extended prolog, named SUPLOG, which is powerful
enough to implement whole system programs on CHI-II. It supports not
only whole prolog language features but also supports multiple name
space, interprocess communication facilities and various data types
(characters, arrays, streams etc). The subset of SUPLOG will be
available on a UNIX workstation soon.

The prototype CHI, oridinally named HPM, achieved 280 KLIPS. Its
architecture and an programming environment are reported in:

   Nakazaki etal, "Design of a high-speed prolog machine HPM",
   in Proc. of Computer Architecture, 1985.

   Konagaya etal, "A Co-Operative Programming Environment for a Back
   -End Type Sequential Inference Machine CHI", in Proc. of
   Parallel Algorithms and Architecture, 1987, Akademie-Verlag Berlin.

Copies are available until Jan. 1988 in the following address:

   Akihiko Konagaya
   MIT Laboratory for Computer Science,
   545 Technology Square, room 252,
   Cambridge, MA 02139

In case of after Jan. 1988, try to access:

   Akihiko Konagaya
   Computer System Laboratory,
   C&C Systems Laboratories,
   NEC Coporation, 1-1 Miyazaki, 4-chome, Miyamae,
   Kawasaki, Kanagawa, 213 Japan

------------------------------

Date: Thu, 1 Oct 87 19:46:40 PDT
From: quintus!cresswell!ok@Sun.COM (Richard A. O'Keefe)
Subject: findall/3


    This is really very embarrassing.  Lagache's library is such a
wonderful source of examples that I keep writing about it.  I am so
very sorry about this.  Here is a public-spirited sort of fellow, no
doubt expecting nothing but thanks, and here I am saying all sorts
of critical things about his code.  But what a wonderful source of
examples!  There are so many things to criticise, and everyone who
gets the Prolog Digest will know what I am talking about.

    This time, it isn't really his fault.  Lagache's version of
findall/3 has a serious bug, but this bug was present in the code
in Clocksin & Mellish which he copied, and in his copy he has at
least tried to reduce the chance of the bug biting.

    Here is the code from Clocksin & Mellish:

        findall(Template, Generator, List) :-
                asserta(found(mark)),
                call(Generator),
                asserta(found(Template)),
                fail.
        findall(Template, Generator, List) :-
                collect_found([], Matches),
                !,
                List = Matches.

        collect_found(Matches0, Matches) :-
                getnext(Match),
                !,
                collect_found([Match|Matches0], Matches).
        collect_found(Matches, Matches).

        getnext(Match) :-
                retract(found(Match)),
                !,
                Match \== mark.

    The bug, of course, is that Template musn't ever be bound to 'mark'.
The query
        ?- findall(X, member(X,[tom,dick,harry]), L).
works, but the query
        ?- findall(X, member(X,[matthew,mark,luke,john]), L).
doesn't work at all.

    Lagache has improved this code by using a much less plausible atom.
But what a pity that he didn't take the trouble to eliminate the bug
entirely.

    As a matter of fact, it isn't Clocksin & Mellish's fault either,
really.  The bug is present in DEC-10 Prolog.  Which is why I care
about the bug.  You see, I was bitten by it.  In all innocence, I
wrote a DEC-10 Prolog program which did something perfectly
sensible, but it broke mysteriously.  I just could not figure out
what was going wrong.  Fernando Pereira took one look at it, and saw
the problem at once.  The moral of the story is that there is no atom
so bizarre that a programmer may not encounter it quite innocently.

    How can we avoid the bug?  By systematic design in the first place.
There are two conditions we want to represent on the stack:
        - there is a marker here
        - there is a solution X here
How do you do that?  Always, Always, Always, the right way to represent
cases in Prolog is to have a different constructor function for each.
(You may have to deal with "anything else" cases in the input of your
programs, but you shouldn't design your own data structures that way.)
These two cases map DIRECTLY onto

        - marker
        - solution(X)

    Another point, which isn't exactly a bug, is that it is rather
regrettable to take such an obviously useful predicate name as
found/1 away from the user.  We don't need to do that.  (It is
far easier for a user to avoid predicate names with spaces in them
than to avoid an atom he isn't supposed to know about.)

    Accordingly, we can code findall/3 like this:

        findall(Template, Generator, List) :-
                asserta('findall stack'(marker)),
                call(Generator),
                asserta('findall stack'(solution(Template))),
                fail.
        findall(Template, Generator, List) :-
                'findall collect'([], List).

        'findall collect'(List0, List) :-
                retract('findall stack'(Item)),
                !,
                'findall collect'(Item, List0, List).

        'findall collect'(marker, List, List).
        'findall collect'(solution(X), List0, List) :-
                'findall collect'([X|List0], List).

This has one cut, not three, and no instances of \==, and not only
is it less ugly, it doesn't suffer from the bug.  In fact, if you
define another library predicate:

        retract_first(Clause) :-
                retract(Clause),
                !.

which is generally useful, we can see that the one surviving cut
hasn't really got anything to do with findall/3 as such.

    Much may be forgiven the pioneers of a field.  I do not fault
the designers of DEC-10 Prolog for this bug.  (They went on to
produce Quintus Prolog, which naturally has NOT got this bug.)
However, it has always been a bug, not a feature.

                +-------------------+
                |READ THIS PARAGRAPH|
                |EVEN IF NAUGHT ELSE|
                +-------------------+
    What is really important here is this systematic approach to
designing data structures.  If you want to represent situations
        s1 with data d11,...,d1k1
        ...
        sn with data dn1,...,dnkn
use terms
        s1(D11,...,D1k1)
        ...
        sn(Dn1,...,Dnkn)
to do it, so that you can tell later on what you meant by looking
at the principal functor.  If there is a most common case, it is
very tempting to ex-Lisp programmers to make that the "else" case
of an if-then-else (or a 'caseq') and avoid the "inefficiency" of
putting a wrapper around that case.  DON'T DO IT.  It will actually
make your programs LESS efficient (because there will be a 'variable'
case in your predicates, which will need cuts in all the preceding
clauses, which take time to execute, and you'll probably get
temporary choice-points you can do without).


    Lagache's code for findall/3 is a straightforward improvement
of Clocksin & Mellish's.  But his documentation is his own.  And
that is very strange.  Here is that documentation, retyped rather
sloppily so that you won't have to look it up:

        findall(Variable_atom, Predicate, Binding_list)

        generates a list of all values of Variable_atom
        that satisfy Predicate.

        findall is the classic predicate for collecting all
        the values of Variable_atom that are true of Predicate.
        The values are returned on list Binding_list.
        Both Binding_list and Variable_atom should be unbounded.

        Prolog code is based on Clocksin and Mellish p 162.
        The variable Variable_atom must be some atom in the
        Predicate call.  For example, to build up the list of
        integers between 1 and 10 the following call will
        do the trick:
                findall(X, between(1,10,X), Numberlist)


(a) What is a 'variable atom'?
    The first argument is usually a variable.  But it may be
    ANY term, and it is very useful to pass a skeletal term
    (a compound term with unbound arguments).  It even makes
    sense to pass a constant, sometimes.

    The argument need not be an atom "in the Predicate call".
    Presumably he means "the Generator must bind the Template
    to an atom".  In fact, in his own example, the Template
    gets bound to integers, not atoms.

    It is true that findall/3 doesn't make much sense if the
    Generator doesn't bind the Template to a ground term.
    But if you want to call
        findall(Name, (member(X,[fred,jim]),name(X,Name)), Names)
    go right ahead.  It will work.

(b) The second argument is a Goal, not a Predicate.
    Anything that is acceptable to call/1 is acceptable here,
    which is to say that anything which is acceptable as the
    body of a clause is acceptable.

(c) Is 1 really "true of" between(1,10,X) ?

(d) Presumably he means "unbound", not "unbounded".  In fact,
    there isn't the slightest need for the third argument to
    be unbound.  This is one of the few predicates where
    Lagache warns you about the need for output arguments to
    be unbound, and the irony is that findall/3 is steadfast!
    Suppose you wonder whether there are at least two proofs
    for a goal G.  It makes perfect sense to call
        findall(*, G, [*,*|_]).
    Go ahead and do it.  findall/3 won't mind.  It'll even work!


Lagache goes on to say in his documentation:
    LIMITATIONS: None.
        There are certainly fewer limitations than he thought.
    BUGS: No Known Bugs.
        Well, now it's known.


    To summarise this section, there is a serious problem with the
code that Lagache copied, but that isn't his fault.

    Lagache also offers a predicate countmatch/2.  There are
two things that
        countmatch(Goal, Count)
might have meant:
        how many SOLUTIONS are there?
        how many PROOFS could Prolog find?
His documentation makes it perfectly clear that he means the
latter.  There is also a sum_match/3.

    Let's look at his code.  (By the way, I can't thole run-on
words.  The underscore is there.  Let's use it.  The Melbourne
Mob prefer capitalisation, as in countMatch.  See Arthur Sale's
article in SP&E, or any good book on general programming style.
"ADA in Practice" has some good tips.)

        count_match(Generator, _) :-
                asserta(current_count(0)),
                call(Generator),
                increment_current_count,
                fail.
        count_match(_, Count) :-
                current_count(Count),
                retract(current_count(Count)).

        increment_current_count :-
                current_count(M),
                N is M+1,
                retract(current_count(M)),
                asserta(current_count(N)),
                !.

    Is there anything wrong with this?  Yes, it is neither
steadfast nor backwards correct.  If I ask whether a goal G
has exactly 2 proofs by calling
        count_match(G, 2)
and it turns out to have 3, what happens?  The call
current_count(2) in the second clause of count_match/2 fails,
and the counter is never retracted.  Is that so bad?  Yes,
because if this call to count_match/2 is dynamically nested
inside another call to count_match/2 (and why should it not
be?) the counter call will start using the inner call's
counter.  OH DEAR.

    Using the predicate retract_first/1 defined earlier, we
can code count_match/2 thus:

        count_match(Generator, _) :-
                asserta('count match'(0)),
                call(Generator),
                retract_first('count match'(Sum0)),
                Sum1 is Sum0+1,
                asserta('count match'(Sum1)),
                fail.
        count_match(_, Count) :-
                retract_first('count match'(Total)),
                Count = Total.

    To the best of my belief, this is a correct implementation of
the operation that Lagache documented.

    His sum_match/3 has a very serious bug:  he uses assertz/1
rather than asserta/1, which means that calls to sum_match/3
cannot be nested.  Since he got count_match/2 very nearly right,
this is probably a slip rather than a systematic error.  There
is a detail which is good: he checks that the number found by
the generator IS a number before trying to add it to the Sum.
But sum_match/3 has the same problem as count_match/2:

        zero_sum(Template, Generator) :-
                sum_match(Template, Generator, 0)

will do terrible things if the sum is NOT 0.  Again, recoding as

        sum_match(Template, Generator, _) :-
                asserta('sum match'(0)),
                call(Generator),
                integer(Template),
                retract_first('sum match'(Sum0)),
                Sum1 is Sum0+Template,
                asserta('sum match'(Sum1)),
                fail.
        sum_match(_, _, Sum) :-
                retract_first('sum match'(Total)),
                Sum = Total.

fixes the bug, and we can see by inspection that count_match(G, C)
is equivalent to sum_match(1, G, C).

    However, neither count_match/2 nor sum_match/3 is what we
usually want.

    The basic problem is the distinction between SOLUTIONS and PROOFS.
We typically want solutions, and if we want to build logical relations
solutions are what we need, but what Prolog finds is proofs.  Normally,
we couldn't care less how many proofs there are.

    Let's look at a very simple case.  We have a relation
        phone(Office, PhoneNumber)
and we'd like to be able to count the number of phones in an office.
Modulo the justified criticisms of setof/3, the right way to do this
is
        phone_count(Office, Count) :-
                setof(Phone, phone(Office,Phone), Phones),
                length(Phones, Count).

Given a particular phone, we can ask about how many phones it has.
Given a count, we can ask which offices have that many phones.
Or, if phone/2 can be enumerated, Prolog can enumerate pairs of
Offices and Counts.

    Now suppose phone/2 is defined like this:

        phone(Office, Number) :-
                occupant(Office, Person),
                extension(Person, Number).

Fine.  But if an office has two occupants,
        count_match(phone(Office,_), Count)
will return 2 when it should have returned 1.

    This is not a criticism of Lagache or his library.  He never
claimed that count_match/2 would be useful in a case like this.

    setof/3 and bagof/3 have a property which some people find
strange (who have never fully realised the difficulties implied
by the connection between setof(X, G, []) and ~G), namely that
they will never return an empty list.  There are useful hybrids
between {setof,bagof}/3 and findall/3:

        set_of_all(Template, Generator, Set) :-
                'sound query'(Generator, Template, Goal),
                findall(Template, Goal, List),
                sort(List, Set).

        bag_of_all(Template, Generator, Bag) :-
                'sound query'(Generator, Template, Goal),
                findall(Template, Goal, Bag).

        'sound query'(Exvar^Generator, Template, Goal) :- !,
                'sound query'(Generator, Exvar^Template, Goal).
        'sound query'(Goal, Template, Goal) :-
                numbervars(Template, 0, N1),
                numbervars(Goal, N1, N2),
                N2 > N1,
                !,
                /* report the error and only then */
                fail.
        'sound query'(Goal, _, Goal).

This checks that there are no variables in the Generator
which are not captured by the Template or existential quantifiers.
In that case, and ONLY in that case, it would be sound for
the routines to return an empty list of solutions.

        phone_count(Office, Count) :-
                set_of_all(Phone, phone(Office,Phone), Phones),
                length(Phones, Count).

will give the correct results, provided that Office is instantiated,
and if Office is not ground at the time of all, it will report an
error.  If you want to know how many phones are associated with
offices,
        set_of_all(P, Office^phone(Office,P), Phones)
will give you the set (list in standard order with no duplicates)
of such phones.

    It is very hard to find examples where count_match/3 or
sum_match/3 would be useful.  I have generally found it to be the
case that I wanted some other information about the results as
well.  I would be interested to see examples where some more
general operation would not be more useful.

    By the way, the Quintus library contains a predicate
aggregate/3, so that to find the mean area of the offices in
each department one could write

        mean_office_area(Dept, MeanArea) :-
            aggregate(sum(A)/sum(1),
                Office^(dept_office(Dept,Office), area(Office,A)),
                TotalArea/Count),
             MeanArea is TotalArea/Count.

This will of course enumerate departments with their mean areas
as well as finding the mean area of a given department.  It is
very easy to build such a predicate on top of setof/3, and for
operations like finding the average, minimum, or maximum,
refusing to yield an empty list is just what the doctor ordered.
It would be interesting to discuss what the aggregate/3 operation
should look like:  the present version is satisfactory but there
is more that it could do.

------------------------------

End of PROLOG Digest
********************