[comp.lang.c] Proof that Prolog can be faster than C

vanroy@pisces.uucp (06/13/90)

Benchmark timings:

Benchmark	Prolog  	C (no opt.)	C (opt. 4)

tak(24,16,8)    1.2		2.1		1.6

All three timings are in user seconds (with 'time') measured on a 25 MHz MIPS
processor.  The Prolog version is compiled with the Aquarius Prolog compiler
under development at Berkeley.  The C versions are compiled with the MIPS C
compiler, with no optimization and full optimization (level 4).

Disclaimer: this benchmark has a particularly easy translation to C.
The result does not necessarily hold for other programs.  But don't
you believe it anymore that Prolog is slow!

	Peter Van Roy

-------------------------------------------------------------------------------

/* C version of tak benchmark */

#include <stdio.h>

int tak(x,y,z)
int x, y, z;
{
  int a1, a2, a3;
  if (x <= y) return z;
  a1 = tak(x-1,y,z);
  a2 = tak(y-1,z,x);
  a3 = tak(z-1,x,y);
  return tak(a1,a2,a3);
}

main()
{
  printf("%d\n", tak(24, 16, 8));
}

-------------------------------------------------------------------------------

/* Prolog version of tak benchmark */

main :- tak(24,16,8,X), write(X), nl.

tak(X,Y,Z,A) :- X =< Y, Z = A.
tak(X,Y,Z,A) :- X > Y,
        X1 is X - 1, tak(X1,Y,Z,A1),
        Y1 is Y - 1, tak(Y1,Z,X,A2),
        Z1 is Z - 1, tak(Z1,X,Y,A3),
        tak(A1,A2,A3,A).

-------------------------------------------------------------------------------

christer@cs.umu.se (Christer Ericson) (06/13/90)

In article <36986@ucbvax.BERKELEY.EDU> vanroy@pisces.uucp () writes:
>
>[Stuff about some particular Prolog implementation being faster that some 
>other particular C implementation deleted]
>
>Disclaimer: this benchmark has a particularly easy translation to C.
>The result does not necessarily hold for other programs.  But don't
>you believe it anymore that Prolog is slow!
>

This is silly! Haven't you learned yet that no language is faster/slower than
another language - it's the implementation that's faster/slower than that
other implementation. The worst thing I know is people saying things like
C is faster than Pascal, because it isn't. Most C compilers generate faster
code than most Pascal compilers, true, but that's only because C heavily
depends upon (read allows) the programmer to optimize the code while the
Pascal compiler has to figure it all out by itself.

>	Peter Van Roy

/Christer

| Christer Ericson            Internet: christer@cs.umu.se [130.239.1.101] |
| Department of Computer Science, University of Umea, S-90187 UMEA, Sweden |
|                            "I bully sheep. I claim God doesn't exist..." |

richard@aiai.ed.ac.uk (Richard Tobin) (06/14/90)

>This is silly! Haven't you learned yet that no language is faster/slower than
>another language - it's the implementation that's faster/slower than that
>other implementation. 

I think Peter knows all about that.  There is a widespread belief that
languages like prolog can never be as fast as languages like C
(especially for numerical applications); the figures quoted are a
counterexample to that.

I would be interested to know if there are any Lisp compilers that
can produce comparable or better results. 

-- Richard
-- 
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,           ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                UUCP:  ...!ukc!ed.ac.uk!R.Tobin

hawley@icot32.icot.or.jp (David John Hawley) (06/14/90)

In article <1990Jun13.100341.7148@cs.umu.se> christer@cs.umu.se (Christer Ericson) writes:
>This is silly! Haven't you learned yet that no language is faster/slower than
>another language - it's the implementation that's faster/slower than that

Well, of course the point is that Prolog is a much higher-level
language than C type languages, but we don't want to pay for all that
expressiveness all the time. I.e. the speed of our programs should
relate to the algorithms, not to the languages in which they are
expressed. So Prolog needs GREAT compilers and implementations, and
THEN we can write SHORT, SIMPLE, and FAST programs.

Well, I tried the given benchmark on my SUN/?:
	C version       	  - 8.7u
	SICstus Prolog(compiled)  - mem ovflw after > 1 min of real time

So how did he do it? Inquiring minds (and inquisitive noses) want to know.

---------------------------
David Hawley, ICOT, 4th Lab
csnet: hawley%icot.jp@relay.cs.net uucp:{enea,inria,mit-eddie,ukc}!icot!hawley
ICOT, 1-4-28 Mita, Minato-ku, Tokyo 108 JAPAN. TEL/FAX {81-3-456-}2514/1618

jwmills@iuvax.cs.indiana.edu (Jonathan Mills) (06/14/90)

From Peter Vanroy:

>Benchmark timings:
>
>Benchmark       Prolog          C (no opt.)     C (opt. 4)
>
>tak(24,16,8)    1.2             2.1             1.6
>fib(30)         1.5             2.0             1.6
>
>All three timings are in user seconds (with 'time') measured on a 25
>MHz MIPS processor.  The Prolog version is compiled with the
>Aquarius Prolog compiler under development at Berkeley.  The C
>versions are compiled with the MIPS C compiler, with no optimization
>and full optimization (level 4).
>
>Because older implementations of Prolog have been relatively slow,
>it is sometimes believed that this is inherent in the language.  The
>purpose of my original message was to dispel that notion by
>providing measurements of an optimizing Prolog compiler.



These are encouraging benchmarks, but should be generalized
cautiously.  First, to place the performance of the Aquarius Prolog
compiler in perspective, the times for the call:

	tak(18,12,6)

should be posted.  This would allow the direct comparison of the
times with those reported by by Richard P. Gabriel in "Performance
and Evaluation of Lisp Systems", MIT Press, 1974.  Comparing Prolog
and Lisp implementations is not unreasonable because of the
similarities between implementation techniques, particularly for
simple programs.



From Richard O'Keefe:

>This particular benchmark result tells us precisely two things:
>(1) _one_ Prolog system on that machine has cheaper procedure calls
>    than _one_ C compiler.  Good work!  But not very surprising
>     given that C compiler writers tend to worry more about other
>     things.
>
>(2) that particular Prolog system handles integer arithmetic better
>    than a lot of other Prolog systems.



Gabriel reports that the "significant Lisp-level operations" for
tak(18,12,6) are composed of:

     Calls to TAK                    63,609
     1-'s (decrements)               47,706

Since the Prolog code listed by Vanroy is a direct translation of the
TAK benchmark, these numbers and their significance _probably_ apply. 
The prevalence of function calls and decrements in this benchmark
limits its general significance.

The fib/1 benchmark is similarly likely to be a limited measure of
general performance since it is also composed primarily of function
calls (to FIB) and decrements (N - 1 and N - 2).



From Richard O'Keefe:

>It doesn't tell us how C and Prolog would compare on realistic
>applications, and there the pertinent question tends to be how easy
>it is to handle a good data structure.  I've an example where I
>speeded a program up quite substantially by translating from Fortran
>to Prolog, the speedup really came from using a much better data
>structure, but it was much easier to use that data structure in
>Prolog.



Gabriel has other benchmarks that are better suited to evaluate the
general performance of the Aquarius Compiler.  Two that appear well
suited are the BOYER and BROWSE benchmarks.

BOYER is a tiny theorem prover intended to give plausible estimates
of the performance of the Boyer-Moore theorem prover on various Lisp
systems.  BROWSE is a compendium of operations that an expert system
might use.  Both of these benchmarks could be translated to advantage
in Prolog, since they have a substantial amount of list manipulation
and pattern matching.

Ross Overbeek has a theorem prover whose performance in a variety of
Prologs was compared to C at a benchmarker's workshop held at The
Aerospace Corporation in 1988.  Carl Kesselman sponsored the
workshop, and may be able to provide source for a test.



From Richard Tobin:

>I would be interested to know if there are any Lisp compilers that
>can produce comparable or better results.



Gabriel has no results for a 25MHz MIPS machine (wasn't available),
but does list some other interesting results for TAK:

IMPLEMENTATION            CPU             RAW TIME (sec)
--------------            ---------       --------------
Portable Standard Lisp    Cray-1           0.044
Symbolics                 3600+IFU         0.43
Assembler                 68000            0.7
Franz Lisp                VAX 780          1.09
Common Lisp               VAX 785          1.18
Portable Standard Lisp    Sun/2 (68010)    1.44
Common Lisp               VAX 780          1.83
C                         68000            1.9
Franz Lisp                68000            2.37
INTERLISP                 Dolphin         11.12


It would also be interesting to compare the performance of commercial
Prolog compilers (ALS, BIM, Quintus) on various machines to see how
they handle TAK.  While older Prolog implementations were slow, newer
implementations have moved to native code, and impressive speeds. A
recent claim for ALS Prolog (using that hoary chestnut, NREV, sorry :-)
gave in excess of one megaLIPS on the 88000, and 100+ KLIPS on a Mac IIfx.
Perhaps they would be willing to key in TAK and post the timings.

In closing, designing and interpreting benchmarks offer tempting
targets for implementors and their critics.  Gabriel's book suggests
some strategies to follow (and avoid);  there are many other articles
out there also (Jack Dongarra has published advice for designers of
benchmarks for supercomputers, for example).

Vanroy's results are encouraging, and (hopefully) will lead to even
better comparisons.  Let's support his work:  I'd like to suggest
that anyone with additional "points" on the graph post their results
to the net.  This would help Prolog implementors today just as
Gabriel's work goaded Lisp implementors in the 1980's.

kdq@demott.COM (Kevin D. Quitt) (06/15/90)

In article <47658@iuvax.cs.indiana.edu> jwmills@iuvax.cs.indiana.edu (Jonathan Mills) writes:
>but does list some other interesting results for TAK:
>
>IMPLEMENTATION            CPU             RAW TIME (sec)
>--------------            ---------       --------------
>Portable Standard Lisp    Cray-1           0.044
>Symbolics                 3600+IFU         0.43
>Assembler                 68000            0.7
>Franz Lisp                VAX 780          1.09
>Common Lisp               VAX 785          1.18
>Portable Standard Lisp    Sun/2 (68010)    1.44
>Common Lisp               VAX 780          1.83
>C                         68000            1.9
>Franz Lisp                68000            2.37
>INTERLISP                 Dolphin         11.12
>

    Just for fun, on a Motorola Delta 3608 (25MHz 68030),

cc:	gnucc:

0.6	0.15	cc tak.c
0.3	0.3	cc -O tak.c


    and on a TwinHead 386/25, Microsoft C 6.0

0.28	/Ox /Gs
0.22	/Ox /Gsr

-- 
 _
Kevin D. Quitt         demott!kdq   kdq@demott.com
DeMott Electronics Co. 14707 Keswick St.   Van Nuys, CA 91405-1266
VOICE (818) 988-4975   FAX (818) 997-1190  MODEM (818) 997-4496 PEP last

                96.37% of all statistics are made up.

dankg@volcano.Berkeley.EDU (Dan KoGai) (06/16/90)

In article <36986@ucbvax.BERKELEY.EDU> vanroy@pisces.uucp () writes:
> [benchmark omitted]
>
>/* C version of tak benchmark */
>
>#include <stdio.h>
>
>int tak(x,y,z)
>int x, y, z;
>{
>  int a1, a2, a3;
>  if (x <= y) return z;
>  a1 = tak(x-1,y,z);
>  a2 = tak(y-1,z,x);
>  a3 = tak(z-1,x,y);
>  return tak(a1,a2,a3);
>}

	But this C code is hardly optimum.  1st of all you don't need
a1, a2, a3 to waste space and time. try the following.

int tak(int x, int y, int z)
{
	return (x <= y) ? z : tak(tak(--x, y, z),tak(--y, z, x),tak(--z, x, y))

}

> [Prolog version omitted]

	And it's also unfair to compare 2 versions without posting the
assembly codes both versions made.  Both C and Prolog allow programmers
to write various codes doing the same thing.

----------------
____  __  __    + Dan The Walking C Obfscunator
    ||__||__|   + E-mail:       dankg@ocf.berkeley.edu
____| ______    + Voice:        +1 415-549-6111
|     |__|__|   + USnail:       1730 Laloma Berkeley, CA 94709 U.S.A
|___  |__|__|   +
    |____|____  + "Unix is not insecure. It's people who are"
  \_|    |      + "Unix doesn't have pain.  It's people who do"

vanroy@pisces (Peter Van Roy) (06/16/90)

In article <1990Jun15.205242.16314@agate.berkeley.edu> dankg@volcano.Berkeley.EDU (Dan KoGai) writes:

>	But this C code is hardly optimum.  1st of all you don't need
>a1, a2, a3 to waste space and time. try the following.
>
>int tak(int x, int y, int z)
>{
>	return (x <= y) ? z : tak(tak(--x, y, z),tak(--y, z, x),tak(--z, x, y))
>
>}
>
>> [Prolog version omitted]
>
>____  __  __    + Dan The Walking C Obfscunator

Did you try to run this program?
It behaves very strangely:
(after adding a semicolon to the return statement)

	mammoth:~/Peter <20> cc -g taktak.c
	mammoth:~/Peter <21> dbx a.out
	dbx version 2.0
	Copyright 1988 MIPS Computer Systems Inc.
	Type 'help' for help.
	reading symbolic information ...
	main:  15  printf("%d\n", tak(24, 16, 8));
	(dbx) run
	^C				<- interrupted after ~10 seconds
	Interrupt [tak:10 +0x28,0x400220]
        	tak(--z, x, y));
	(dbx) print x,y,z
	-44896 -44897 -44898		<- strange values

From page 186 of K&R:
(note that predecrement does a side-effect)

	[Regarding function evaluation] The order of evaluation of arguments
	is undefined by the language; take note that the various compilers
	differ.

By the way, I did try a different version of tak.c without the temporary
variables a1, a2, and a3.  It ran *slower* unoptimized and at exactly the
same speed optimized.

Please think before you post.
Sincerely,
	Peter Van Roy

grover@brahmand.Eng.Sun.COM (Vinod Grover) (06/16/90)

In article <1990Jun15.205242.16314@agate.berkeley.edu> dankg@volcano.Berkeley.EDU (Dan KoGai) writes:
>In article <36986@ucbvax.BERKELEY.EDU> vanroy@pisces.uucp () writes:
>a1, a2, a3 to waste space and time. try the following.
>
>int tak(int x, int y, int z)
>{
>  return (x <= y) ? z : tak(tak(--x, y, z),tak(--y, z, x),tak(--z, x, y))
>}
This program has a missing semicolon, and runs into an infinite loop. 
C does not require the side-effect in --x to happen immediately. You must
force the subtraction. Furthermore, eliminating explicit temp variables does
not mean that the compiler won't put them back in. However, declaring those
explicit temps to be register variables *may* improve the program, but is
not guarenteed to do so. 

Vinod Grover