[net.lang.prolog] PROLOG Digest V4 #43

PROLOG-REQUEST@SU-SCORE.ARPA (Chuck Restivo, The Moderator) (08/26/86)
PROLOG Digest           Wednesday, 27 Aug 1986     Volume 4 : Issue 43

Today's Topics:
                         LP Library - Update,
              Performance - Benchmarking Systems & Part1
----------------------------------------------------------------------

Date: Mon 25 Aug 86 20:51:07-PDT
From: Chuck Restivo (The Moderator) <Restivo@SU-SCORE.ARPA>
Subject: LP Library

[cwr]

I am planning to post a suite of logic programming performance
benchmark programs for this issue and the next two consecutive
issues of the Digest.

My apologies to those that are offended by such large transmissions.

-- Chuck

------------------------------

Date: Tue, 24 Jun 86 15:47:43 -0100
From: Jean Claude Syre <jclaude%ecrcvax.uucp@germany.csnet>
Subject: Benchmarking Prolog Systems (part 1)


         ***********************************************
         *** BENCHMARK PROGRAMS FOR PROLOG SYSTEMS   ***
         ***    (FINAL VERSION)                      ***
         ***********************************************
                       Part 1 (of 3)

-- J.C. SYRE
   ECRC (European Computer-Industry Research Center)
   Arabellastr. 17
   D-8000 MUNICH 81
   WEST GERMANY

This set of benchmark programs is a collective work done
by the Logic Programming Group and the Computer Architecture
Group of ECRC, the European Computer-Industry Research Centre,
in Munich, AS WELL AS BY OTHER PEOPLE HAVING RESPONDED TO OUR
FIRST PROPOSAL SOME TIME AGO. Many thanks to all people who
have helped us improving this benchmark.

For convenience, you can send messages to Hans Benker
(replace "jclaude" by "hans" in this net address), or
myself.

The designers of the benchmark programs are: H. Benker, J. Noye,
Micha Meier, S. Schmitz, J.C. Syre, and ****many others**** from
ECRC and from other places in the world who brought many useful
comments.

The first section deals with simple programs whose single purpose is
to evaluate a single feature of prolog execution.  The times we give
correspond to an interpretation using CProlog on a VAX11/785 under
UNIX BSD4.2, in a "quiet" environment (which may be obtained on a
sunny Sunday with all other logins prohibited). They are subject
to a 10 percent inaccuracy due to paging and timecounts.

The second section presents more complex programs, which may
still run on small Prolog systems. We are open to suggestions
to improve the significance of the benchmark. The programs
include many of the programs taken by the University of
Berkeley for their evaluation of the PLM1 prolog machine.

There should be a third section, that we have not fully built
yet, and we count on you to build it. Those programs should be
representative of real scale prototypes of large Prolog
applications. CHAT80 is an example of such programs for this
section. If you feel embarassed to propagate one of your programs
having properties that you are reluctant to make public, you may
think of modifying it or truncate it so that it becomes hardly
readable, or useless for actual use by others.

THIS IS OUR FINAL VERSION. We have taken into account a lot of
remarks made by those of you who responded to our proposal of
Prolog benchmark made two months ago. Not all of your
suggestions have been included (mainly to keep the benchmark
within a reasonable size), but many of them.

We are expecting a large number of answers from you. We can
also send the benchmark by normal mail if necessary.

IN ORDER TO KEEP THE POSSIBILITY OF COMPARING RESULTS FROM
VARIOUS SOURCES AND SYSTEMS, YOU ARE KINDLY ASKED NOT TO CHANGE
THE SOURCE PROGRAMS. SEND YOUR RESULTS TO OUR ADDRESS. WE WILL
MAKE A COMPILATION OF THEM, AND SEND THEM TO THE NET, UNLESS
YOU DO NOT WISH IT (IN THAT CASE, PLEASE MENTION IT
EXPLICITELY).

Have fun !



1. Simple Benchmark programs.

These simple (or simplistic) programs aim at evaluating a
single feature of the Prolog System. Here a Prolog System is
understood as either a pair <Prolog software, Host machine>,
where the Prolog software is an interpreter, a compiler, a
combination of both, and the Host machine is a conventional
machine (with its operating system and workload),
a simulator of a prolog processor, or a real piece of
Prolog hardware (direct interpreter, or PLM processor, or
anything else).

The "single feature" mentioned above means that the performance
results will show how well the Prolog System can handle a
particular characteristic of the language.

The phenomena we measure are:

 o   calls (section 1.1)
         program: boresea(N))
 o   non-determinism (section 1.2)
         programs: choice_point(N),
         choice_point0ar(N), baktrak1(N), baktrak2(N)
 o   handling of environments (section 1.3)
         programs: envir(N), envir0ar(N)
 o   indexing (section 1.4)
         program: index(N)
 o   unification (section 1.5)
         programs: construct_list(N), match_list(N),
         construct_structure(N), match_structure(N),
         match_nested_structure(N), general_unification(N)
 o   dereferencing (section 1.6)
         program: deref(N)
 o   cut (section 1.7)
         program: cuttest(N)

There are many more which would be interesting to measure, e.g.
efficiency of built-ins, "assert" and "retract", I/O and tail
recursion optimisation. However for now the above 7 criteria
seemed to be the most interesting and maybe somebody on
the net can design benchmarks for the remaining features.

Measuring a single feature of a language is difficult. One single
execution of a tiny program testing a particular feature takes
not enough time to measure it precisely. To get a better precision
one has to execute the test program hundreds of times. There are
two ways to do this: Write down the test program as often as one
wants to execute it or include it in a loop. The first solution
implies that one has to write programs with hundreds of lines
of code, where each line does the same job.

This is not convenient and it is desirable to use loops. In the
case of our benchmark programs however, the time spent executing
the loop is not negligible, due to the very small size of our test
programs. Therefore we used a combination of both methods, i.e.
sequences of repeated code surrounded with a loop. In order to
minimise the effect of the loop, we actually run as well an "empty"
loop, without the benchmark program. We call this "compensation
loop" and subtract its execution time  from the execution time of
the loop including the benchmark program. This increases of course
the relative error on the time measurement, but we have decreased
the influence of the unavoidable loop.

The repeated code can be generated by your favorite editor.
How much repeated code you need to get a sufficient precision of
course depends on the implementation of your particular Prolog
system.  However we put as much repeated code into each benchmark
program as we think is apropriate to most Prolog implementations.
So we think you should get sufficient precision without modifying
our programs in that respect.
The listings of the programs follow below. For each simple program
we try to give the characteristics of it and some remarks about
what it measures.

Note that "cputime" in C-Prolog on the VAX gives you the possibility
to measure runtime. This may be different in other Prolog systems.
All the rest of the programs should be portable to any other system
without any problem.


1.1. Program to test calls  (boresea).

This is the one you always dreamed to test! Like all benchmarks,
it uses a loop calling the actual benchmark program. The benchmark
program consists of a sequence of 200 predicates having no arguments,
no choice points, NOTHING. 200 is chosen to have sufficient accuracy
in measuring the execution time.

The results show the effect of pure calls, and the Klips performance
can be called the peak performance of the prolog system. Note
that the peak performance has very little significance to classify
the overall performance of a Prolog system.

---------------- cut here - beginning of program listing ----------

/* This program is called with the query "?-boresea(X)."         */
/* X is the number of loop iterations executed. It should be big */
/* enough to give significant results.                           */
/* suggested value for X: 100 for interpreted code*/
/*                       1000 for compiled code   */
/* average values for C-prolog interpreter:       */
/*       X=1000, Tloop=27.1 T.comp=1.0 Tnet=26.1 Klips=7.7 */

boresea(X)
     :- T1 is cputime,
        do_max_KLips(X),       /* calls the loop to execute the  */
        T2 is cputime,         /* sequence of 200 predicates     */
        compens_loop(X),       /* compensation loop              */
        T3 is cputime,
        print_times(T1,T2,T3,X,200)./*compute and print results  */


compens_loop(0).                    /* compensation loop         */
compens_loop(X) :- Y is X - 1, compens_loop(Y).

print_times(T1,T2,T3,X,I) :-        /* prints the results        */
        TT1 is T2 - T1,
        TT2 is T3 - T2,
        TT is TT1 - TT2,
        write('T overall loop:   '),write(TT1), nl,
        write('T compens loop:   '),write(TT2), nl,
        write('T net:            '),write(TT),nl,
        write('KLips:            '),
        Li is I * X,
        Lips is Li / TT,
        KLips is Lips / 1000,
        write(KLips),nl,nl.

do_max_KLips(0).            /* loop calling the actual benchmark */
do_max_KLips(X) :- lips1, Y is X - 1, do_max_KLips(Y).

/* predicates to test call */

lips1 :- lips2.
lips2 :- lips3.
lips3 :- lips4.
lips4 :- lips5.
lips5 :- lips6.
lips6 :- lips7.
lips7 :- lips8.
lips8 :- lips9.
lips9 :- lips10.
lips10 :- lips11.
lips11 :- lips12.
lips12 :- lips13.
lips13 :- lips14.
lips14 :- lips15.
lips15 :- lips16.
lips16 :- lips17.
lips17 :- lips18.
lips18 :- lips19.
lips19 :- lips20.
lips20 :- lips21.
lips21 :- lips22.
lips22 :- lips23.
lips23 :- lips24.
lips24 :- lips25.
lips25 :- lips26.
lips26 :- lips27.
lips27 :- lips28.
lips28 :- lips29.
lips29 :- lips30.
lips30 :- lips31.
lips31 :- lips32.
lips32 :- lips33.
lips33 :- lips34.
lips34 :- lips35.
lips35 :- lips36.
lips36 :- lips37.
lips37 :- lips38.
lips38 :- lips39.
lips39 :- lips40.
lips40 :- lips41.
lips41 :- lips42.
lips42 :- lips43.
lips43 :- lips44.
lips44 :- lips45.
lips45 :- lips46.
lips46 :- lips47.
lips47 :- lips48.
lips48 :- lips49.
lips49 :- lips50.
lips50 :- lips51.
lips51 :- lips52.
lips52 :- lips53.
lips53 :- lips54.
lips54 :- lips55.
lips55 :- lips56.
lips56 :- lips57.
lips57 :- lips58.
lips58 :- lips59.
lips59 :- lips60.
lips60 :- lips61.
lips61 :- lips62.
lips62 :- lips63.
lips63 :- lips64.
lips64 :- lips65.
lips65 :- lips66.
lips66 :- lips67.
lips67 :- lips68.
lips68 :- lips69.
lips69 :- lips70.
lips70 :- lips71.
lips71 :- lips72.
lips72 :- lips73.
lips73 :- lips74.
lips74 :- lips75.
lips75 :- lips76.
lips76 :- lips77.
lips77 :- lips78.
lips78 :- lips79.
lips79 :- lips80.
lips80 :- lips81.
lips81 :- lips82.
lips82 :- lips83.
lips83 :- lips84.
lips84 :- lips85.
lips85 :- lips86.
lips86 :- lips87.
lips87 :- lips88.
lips88 :- lips89.
lips89 :- lips90.
lips90 :- lips91.
lips91 :- lips92.
lips92 :- lips93.
lips93 :- lips94.
lips94 :- lips95.
lips95 :- lips96.
lips96 :- lips97.
lips97 :- lips98.
lips98 :- lips99.
lips99 :- lips100.
lips100:- lips101.
lips101 :- lips102.
lips102 :- lips103.
lips103 :- lips104.
lips104 :- lips105.
lips105 :- lips106.
lips106 :- lips107.
lips107 :- lips108.
lips108 :- lips109.
lips109 :- lips110.
lips110 :- lips111.
lips111 :- lips112.
lips112 :- lips113.
lips113 :- lips114.
lips114 :- lips115.
lips115 :- lips116.
lips116 :- lips117.
lips117 :- lips118.
lips118 :- lips119.
lips119 :- lips120.
lips120 :- lips121.
lips121 :- lips122.
lips122 :- lips123.
lips123 :- lips124.
lips124 :- lips125.
lips125 :- lips126.
lips126 :- lips127.
lips127 :- lips128.
lips128 :- lips129.
lips129 :- lips130.
lips130 :- lips131.
lips131 :- lips132.
lips132 :- lips133.
lips133 :- lips134.
lips134 :- lips135.
lips135 :- lips136.
lips136 :- lips137.
lips137 :- lips138.
lips138 :- lips139.
lips139 :- lips140.
lips140 :- lips141.
lips141 :- lips142.
lips142 :- lips143.
lips143 :- lips144.
lips144 :- lips145.
lips145 :- lips146.
lips146 :- lips147.
lips147 :- lips148.
lips148 :- lips149.
lips149 :- lips150.
lips150 :- lips151.
lips151 :- lips152.
lips152 :- lips153.
lips153 :- lips154.
lips154 :- lips155.
lips155 :- lips156.
lips156 :- lips157.
lips157 :- lips158.
lips158 :- lips159.
lips159 :- lips160.
lips160 :- lips161.
lips161 :- lips162.
lips162 :- lips163.
lips163 :- lips164.
lips164 :- lips165.
lips165 :- lips166.
lips166 :- lips167.
lips167 :- lips168.
lips168 :- lips169.
lips169 :- lips170.
lips170 :- lips171.
lips171 :- lips172.
lips172 :- lips173.
lips173 :- lips174.
lips174 :- lips175.
lips175 :- lips176.
lips176 :- lips177.
lips177 :- lips178.
lips178 :- lips179.
lips179 :- lips180.
lips180 :- lips181.
lips181 :- lips182.
lips182 :- lips183.
lips183 :- lips184.
lips184 :- lips185.
lips185 :- lips186.
lips186 :- lips187.
lips187 :- lips188.
lips188 :- lips189.
lips189 :- lips190.
lips190 :- lips191.
lips191 :- lips192.
lips192 :- lips193.
lips193 :- lips194.
lips194 :- lips195.
lips195 :- lips196.
lips196 :- lips197.
lips197 :- lips198.
lips198 :- lips199.
lips199 :- lips200.
lips200.

--------------------cut here - end of program listing------------


1.2. Program to test non deterministic behaviour

This program contains a series of 3 different benchmark predicates.

The predicate "choice_point(N)" tests calls invoking the creation
of a choice point, i.e. a branch point where the execution
will possibly come back to in case of backtracking. It
does NOT backtrack. Two versions are proposed, one with and the
other without arguments.

We then present two predicates to evaluate the mechanism of
backtracking during execution. Both predicates create one
choice_point and then backtrack 20 times on every loop iteration
step. "baktrak1(N)" exhibits a kind of backtracking called "deep",
while "baktrak2(N)" deals with "shallow" backtracking.
Both are worth being tried, whatever your particular Prolog
System is.

------------cut here - beginning of program listing----------------
/* The predicates are called:                             */

/*    o  "choice_point(N)"    - creation of choice points    */
/*    o  "choice_point0ar(N)  - same, with 0 arg             */
/*    o  "baktrak1(N)"        - deep backtracking            */
/*    o  "baktrak2(N)"        - shallow backtracking         */

/*  N is the number of loop iterations executed  */


/* predicate to test creation of choice points without
   baktracking */

/* suggested value for N: 1000 */
/* results for  Cprolog N=1000 */
/* Tloop=5.95 Tcompens=0.98 Tnet=4.97 Klips=4.02 */

choice_point(N):-T1 is cputime,
        cre_CP(N), T2 is cputime,
        compens_loop(N), T3 is cputime,
        print_times(T1,T2,T3,N,20).

/* predicate choice_point, but with zero argument  */
/* suggested value for N: 1000 */
/* results for Cprolog: N=1000 */
/* Tloop=3.55 Tcompens=0.98 Tnet=2.57 Klips=7.7  */


choice_point0ar(N):-T1 is cputime,
        cre_CP0ar(N), T2 is cputime,
        compens_loop(N), T3 is cputime,
        print_times(T1,T2,T3,N,20).
/* Predicate to test the (deep) backtracking mechanism. */
/* suggested value for N: 1000 (interp), 2000(comp) */
/* results for Cprolog: N=1000  */
/* Tloop=9.63 Tcomp=1 Tnet=8.63 Klips=2.32  */

baktrak1(N)
     :- T1 is cputime,
        deep_back(N),
        T2 is cputime,
        compens_loop(N),
        T3 is cputime,
        print_times(T1,T2,T3,N,20).


/* Predicate to test the (shallow) backtracking mechanism */
/* suggested value for N: 1000 (interp), 2000 (comp) */
/* results for Cprolog: N=1000  */
/* Tloop=3.63  Tcomp=0.95 Tnet=2.68 Klips=7.45 */

baktrak2(X)
     :- T1 is cputime,
        shallow_back(X), T2 is cputime,
        compens_loop(X), T3 is cputime,
        print_times(T1,T2,T3,X,20).


/* compensation loop, used to measure the time spent in
   the loop  */

compens_loop(0).
compens_loop(X) :- Y is X - 1, compens_loop(Y).

/* loop to test choice point creation   */
cre_CP(0).
cre_CP(N):-M is N-1, ccp1(0,0,0), cre_CP(M).

cre_CP0ar(0).
cre_CP0ar(N):-M is N-1, ccp1, cre_CP0ar(M).

/* loop to test deep backtracking       */
deep_back(0).
deep_back(X) :- pd(_,_,_), Y is X - 1, deep_back(Y).

/* loop to test shallow backtracking */
shallow_back(0).
shallow_back(X) :- ps(_,a,b), Y is X - 1, shallow_back(Y).


print_times(T1,T2,T3,X,I) :-        /* prints the results */
        TT1 is T2 - T1,
        TT2 is T3 - T2,
        TT is TT1 - TT2,
        write('T overall loop:   '),write(TT1), nl,
        write('T compens loop:   '),write(TT2), nl,
        write('T net:            '),write(TT),nl,
        write('KLips:            '),
        Li is I * X,
        Lips is Li / TT,
        KLips is Lips / 1000,
        write(KLips),nl,nl.

/*  ccp1 creates 20 choice points */
/* ccp1 is the beginning of a set of predicates  */
/* composed of 2 clauses each. Every invokation of nd0 will create */
/* a sequence of 20 choice points. The body of the clauses are     */
/* limited to one goal, thus avoiding a creation of environment    */
/* when the clause is activated. nd0, and its successors, have     */
/*   three arguments to comply with our average static analysis    */
/*   results made on more than 30 real Prolog programs.            */
/* ccpXX exists with 3 arguments, and 0 args. */

ccp1(X,Y,Z):-ccp2(X,Y,Z).
ccp1(X,Y,Z).
ccp2(X,Y,Z):-ccp3(X,Y,Z).
ccp2(X,Y,Z).
ccp3(X,Y,Z):-ccp4(X,Y,Z).
ccp3(X,Y,Z).
ccp4(X,Y,Z):-ccp5(X,Y,Z).
ccp4(X,Y,Z).
ccp5(X,Y,Z):-ccp6(X,Y,Z).
ccp5(X,Y,Z).
ccp6(X,Y,Z):-ccp7(X,Y,Z).
ccp6(X,Y,Z).
ccp7(X,Y,Z):-ccp8(X,Y,Z).
ccp7(X,Y,Z).
ccp8(X,Y,Z):-ccp9(X,Y,Z).
ccp8(X,Y,Z).
ccp9(X,Y,Z):-ccp10(X,Y,Z).
ccp9(X,Y,Z).
ccp10(X,Y,Z):-ccp11(X,Y,Z).
ccp10(X,Y,Z).
ccp11(X,Y,Z):-ccp12(X,Y,Z).
ccp11(X,Y,Z).
ccp12(X,Y,Z):-ccp13(X,Y,Z).
ccp12(X,Y,Z).
ccp13(X,Y,Z):-ccp14(X,Y,Z).
ccp13(X,Y,Z).
ccp14(X,Y,Z):-ccp15(X,Y,Z).
ccp14(X,Y,Z).
ccp15(X,Y,Z):-ccp16(X,Y,Z).
ccp15(X,Y,Z).
ccp16(X,Y,Z):-ccp17(X,Y,Z).
ccp16(X,Y,Z).
ccp17(X,Y,Z):-ccp18(X,Y,Z).
ccp17(X,Y,Z).
ccp18(X,Y,Z):-ccp19(X,Y,Z).
ccp18(X,Y,Z).
ccp19(X,Y,Z):-ccp20(X,Y,Z).
ccp19(X,Y,Z).

ccp20(X,Y,Z).
ccp20(X,Y,Z).

ccp1:-ccp2.
ccp1.
ccp2:-ccp3.
ccp2.
ccp3:-ccp4.
ccp3.
ccp4:-ccp5.
ccp4.
ccp5:-ccp6.
ccp5.
ccp6:-ccp7.
ccp6.
ccp7:-ccp8.
ccp7.
ccp8:-ccp9.
ccp8.
ccp9:-ccp10.
ccp9.
ccp10:-ccp11.
ccp10.
ccp11:-ccp12.
ccp11.
ccp12:-ccp13.
ccp12.
ccp13:-ccp14.
ccp13.
ccp14:-ccp15.
ccp14.
ccp15:-ccp16.
ccp15.
ccp16:-ccp17.
ccp16.
ccp17:-ccp18.
ccp17.
ccp18:-ccp19.
ccp18.
ccp19:-ccp20.
ccp19.

ccp20.
ccp20.


/*  deep backtracking */
/*  The call to pd creates a choice point, and invokes a      */
/*  call to q. It will fail and there will be a backtracking  */
/*  step  to try the next clause defining pd. pd has 21       */
/*  clauses,thus failure                                      */
/*  occurs 20 times                                           */

pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_) :- q(X1,X2,a).
pd(X1,X2,_).

q(X1,X2,b).


/* shallow backtracking */
/* The ps predicate fails 20 times. The shallow backtracking   */
/* will not restore all current state registers in Prolog      */
/* systems which perform this optimisation, while others will. */

ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,X,X).
ps(_,_,_).

---------------------cut here - end of program listing-------------

------------------------------

End of PROLOG Digest
********************