[comp.lang.prolog] Bogus!

merlyn@iwarp.intel.com (Randal Schwartz) (06/13/90)

In article <36986@ucbvax.BERKELEY.EDU>, vanroy@pisces writes:
| 
| Benchmark timings:
| 
| Benchmark	Prolog  	C (no opt.)	C (opt. 4)
| 
| tak(24,16,8)    1.2		2.1		1.6

Bogus.

Comparing apples and oranges dude.

The prolog version most likely saves results from tak(a,b,c) so that
they can do a quick lookup the next time around.  I jimmied up one
in perl that does the same thing...

$ cat tak
#!/local/usr/bin/perl

sub tak {
	$takcalled++;
	local($x,$y,$z) = @_;
	return $z if $x <= $y;
	if (defined $tak{$x,$y,$z}) {
		$takrecalled++;
		# print "recomputing $x $y $z\n";
		return $tak{$x,$y,$z};
	}
	local($a1,$a2,$a3);
	$a1 = &tak($x-1,$y+0,$z+0);
	$a2 = &tak($y-1,$z+0,$x+0);
	$a3 = &tak($z-1,$x+0,$y+0);
	return $tak{$x,$y,$z} = &tak($a1+0,$a2+0,$a3+0);
}

$start = (times)[0];

$result = &tak(24,16,8);

$finish = (times)[0];

$seconds = $finish - $start;

print	"result $result, ",
	"called $takcalled, ",
	"recalled $takrecalled, ",
	"time $seconds secs\n";
$ ./tak
result 9, called 941, recalled 194, time 0.91666666666666662966 secs

Now, tell me if 0.917 seconds of user time isn't better than either C
or prolog!  (This is on a sparc..., so there's some more apples and
oranges for ya :-)

Notice how many *recalled* short-circuits I took.  The C code would
have had to compute all of that fully.  But once prolog "knew" the
fact of tak(11,14,17,96) (or whatever), it'd never compute it again.

Stop comparing apples and oranges, unless you are willing to throw in
a few Perls. :-)

(Something with a little less smoke and mirrors is to compare timings
on a factorial written with recursion.  Trace it through slowly, and
you'll see what I mean.)

Just another Perl hacker,
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/

debray@cs.arizona.edu (Saumya K. Debray) (06/13/90)

merlyn@iwarp.intel.com (Randal Schwartz) writes:
> In article <36986@ucbvax.BERKELEY.EDU>, vanroy@pisces writes:
> | 
> | Benchmark timings:
> | 
> | Benchmark	Prolog  	C (no opt.)	C (opt. 4)
> | 
> | tak(24,16,8)    1.2		2.1		1.6
> 
> Bogus.
> 
> Comparing apples and oranges dude.
> 
> The prolog version most likely saves results from tak(a,b,c) so that
> they can do a quick lookup the next time around.

Look -- a program either does something, or it doesn't [*].  To say that
it "most likely" does something is basically the same as saying 
"I {didn't bother to | couldn't} understand the code, but here's a
wild guess", except that it's not quite as honest.

For the benchmark in question, Van Roy posted both the C and the Prolog
sources, so it's easy to verify whether or not there's any table lookup
going on.  There isn't.
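
For reference, the usual source-level Prolog rendering of tak looks
something like the sketch below (not necessarily character-for-character
what Van Roy posted); note that it contains no assert, no recorded
database, no table of any kind:

% plain recursive tak/4: every subgoal is recomputed
tak(X,Y,Z,A) :- X =< Y, Z = A.
tak(X,Y,Z,A) :- X > Y,
        X1 is X - 1, tak(X1,Y,Z,A1),
        Y1 is Y - 1, tak(Y1,Z,X,A2),
        Z1 is Z - 1, tak(Z1,X,Y,A3),
        tak(A1,A2,A3,A).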

[*] Unless, I suppose, we're talking about randomization or true
    nondeterminism.  In this case, we aren't.

-- 
Saumya Debray		CS Department, University of Arizona, Tucson

     internet:   debray@cs.arizona.edu
     uucp:       uunet!arizona!debray

vanroy@pisces (Peter Van Roy) (06/14/90)

In article <1990Jun12.212525.25551@iwarp.intel.com> merlyn@iwarp.intel.com (Randal Schwartz) writes:
>
>Bogus.
>
>Comparing apples and oranges dude.
>
>The prolog version most likely saves results from tak(a,b,c) so that
>they can do a quick lookup the next time around.  I jimmied up one
>in perl that does the same thing...
>
>Just another Perl hacker,
>/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\

Please do not jump to conclusions.  The Prolog implementation does *NOT* save
intermediate results from tak(a,b,c).  It is doing *ALL* recursive calls.
(I wish the compiler were intelligent enough to do what you say it does :-).
The speed is due to straightforward compiler optimization.  By the way, the
behavior of tak is not an isolated case.  For example (times in user seconds,
as before):

Benchmark       Prolog          C (no opt.)     C (opt. 4)

fib(30)         1.5             2.0             1.6

This program is doing all recursive calls too.

Because older implementations of Prolog have been relatively slow, it is
sometimes believed that this is inherent in the language.  The purpose of
my original message was to dispel that notion by providing measurements of
an optimizing Prolog compiler.

Recent work in Prolog implementation has significantly improved performance.
You might want to look at "Fast Prolog with an Extended General Purpose
Architecture" by Bruce Holmer et al in the 17th International Symposium on
Computer Architecture and "LIPS on a MIPS: Results from a Prolog Compiler for
a RISC" by Andrew Taylor in the 1990 International Conference on Logic
Programming.

Sincerely,
  Peter Van Roy

-------------------------------------------------------------------------------

/* C version of fib benchmark */

#include <stdio.h>

int fib(x)
int x;
{
  if (x <= 1) return 1;
  return (fib(x-1)+fib(x-2));
}

main()
{
  printf("%d\n", fib(30));
}

-------------------------------------------------------------------------------

/* Prolog version of fib benchmark */

main :- fib(30,N), write(N), nl.

fib(N,F) :- N =< 1, F = 1.
fib(N,F) :- N > 1,
        N1 is N - 1, fib(N1,F1),
        N2 is N - 2, fib(N2,F2),
        F is F1 + F2.

-------------------------------------------------------------------------------

grover@brahmand.Eng.Sun.COM (Vinod Grover) (06/14/90)

In article <37011@ucbvax.BERKELEY.EDU> vanroy@pisces (Peter Van Roy) writes:
>Please do not jump to conclusions.  The Prolog implementation does *NOT* save
>intermediate results from tak(a,b,c).  It is doing *ALL* recursive calls.
>(I wish the compiler were intelligent enough to do what you say it does :-).
>The speed is due to straightforward compiler optimization.  By the way, the

Could you tell us what optimization is being performed that is lacking in C?

Thanks

Vinod Grover

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/14/90)

In article <1990Jun12.212525.25551@iwarp.intel.com>, merlyn@iwarp.intel.com (Randal Schwartz) writes:
> In article <36986@ucbvax.BERKELEY.EDU>, vanroy@pisces writes:
> | Benchmark	Prolog  	C (no opt.)	C (opt. 4)
> | tak(24,16,8)    1.2		2.1		1.6
> Bogus.
> The Prolog version most likely saves results from tak(a,b,c) so that
> they can do a quick lookup the next time around.

Now *that* claim is bogus.  It was specifically stated that this was
a *Prolog* implementation, which means that it *mustn't* cache results.
(If van Roy had meant that something like extension tables was involved,
no doubt he would have said so.)

The explanation is much much simpler.  Everything in Prolog involves
procedure calls, so a good Prolog implementation has had a _lot_ of
thought put into making procedure calls _really_ cheap.  Procedure calls
are important in C, but not important enough for VAX implementations
to bypass the 'CALLS' instruction and use 'JSR'.

This particular benchmark result tells us precisely two things:
(1) _one_ Prolog system on that machine has cheaper procedure calls
    than _one_ C compiler.  Good work!  But not very surprising given
    that C compiler writers tend to worry more about other things.
(2) that particular Prolog system handles integer arithmetic better
    than a lot of other Prolog systems.

It doesn't tell us how C and Prolog would compare on realistic applications,
and there the pertinent question tends to be how easy it is to handle a
good data structure.  I've an example where I speeded a program up quite
substantially by translating it from Fortran to Prolog; the speedup really
came from using a much better data structure, but it was much easier to
use that data structure in Prolog.  With all due respect to Perl (The
TECO of the 90s), this is one respect where Perl 1.x was not as good as
it might have been (miles better than awk, though).
-- 
"A 7th class of programs, correct in every way, is believed to exist by a
few computer scientists.  However, no example could be found to include here."

richard@aiai.ed.ac.uk (Richard Tobin) (06/14/90)

In article <1990Jun12.212525.25551@iwarp.intel.com> merlyn@iwarp.intel.com (Randal Schwartz) writes:
>Bogus.

>The prolog version most likely saves results from tak(a,b,c) so that
>they can do a quick lookup the next time around.  

I assume Peter van Roy would have told us if this were the case.

Peter, could you post the generated code?

-- Richard



-- 
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,           ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                UUCP:  ...!ukc!ed.ac.uk!R.Tobin

sehr@sp20.csrd.uiuc.edu (David C. Sehr) (06/14/90)

Pardon my impertinence, Dr. O'Keefe, but how does goal caching make
an implementation no longer ``a Prolog implementation''?  It seems to
me that whatever implementation observes the externally observable
language semantics is ``a Prolog implementation''.  If the approach you
seem to suggest is taken, then any FORTRAN compiler that does loop 
invariant floating is not a FORTRAN implementation.  That surely is not
the approach taken by the FORTRAN community.

Since the Tak benchmark is side effect free with only one solution, to all 
external appearances the execution is the same with or without goal caching.
Is there some subtle semantic point I've missed somewhere? I know that 
tracing/debugging would be different, but that doesn't seem to be at issue
here.
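
For concreteness, explicit goal caching for tak at the source level
would look something like the sketch below, using assert (this is
purely hypothetical -- it is not anything in Van Roy's posting, and the
dynamic declaration syntax varies from system to system):

:- dynamic tak_memo/4.

tak(X,Y,Z,A) :- tak_memo(X,Y,Z,A), !.   % cached answer: whole call tree pruned
tak(X,Y,Z,A) :- X =< Y, !, A = Z.
tak(X,Y,Z,A) :-
        X1 is X - 1, tak(X1,Y,Z,A1),
        Y1 is Y - 1, tak(Y1,Z,X,A2),
        Z1 is Z - 1, tak(Z1,X,Y,A3),
        tak(A1,A2,A3,A),
        assertz(tak_memo(X,Y,Z,A)).     % remember the answer for next time

Since tak is determinate and side effect free, this version computes the
same answers as the plain one; only the timing (and the trace) differs.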

Now the fact that there's not enough redundancy in Tak to account for 
that much of a speedup is more proof that there's no goal caching going
on in the example at hand.  Of course, the fibonacci example cited in
a subsequent posting is much more suspect for this, although I'll take
Dr. van Roy at his word on both examples.

David

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/15/90)

In article <1990Jun14.145424.29053@csrd.uiuc.edu>, sehr@sp20.csrd.uiuc.edu (David C. Sehr) writes:
> Pardon my impertinence, Dr. O'Keefe, but how does goal caching make
> an implementation no longer ``a Prolog implementation''?  It seems to
> me that whatever implementation observes the externally observable
				  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> language semantics is ``a Prolog implementation''.

Precisely.  Goal caching does *NOT* preserve the externally observable
language semantics.  (That's why extension tables in SB Prolog have to
be declared as such.)
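
To see why, it is enough to cache a goal whose truth depends on the
dynamic data base.  An illustrative sketch (nothing to do with Van Roy's
code):

:- dynamic limit/1.
limit(10).

small(X) :- limit(L), X < L.

% Suppose an implementation silently caches the answer to ?- small(5).
% If the program later does retract(limit(10)), assertz(limit(3)),
% then ?- small(5). should now fail, but a cached "yes" would still be
% returned.  Cut, side effects and I/O during the proof cause similar
% trouble, which is why such caching has to be asked for explicitly.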
-- 
"A 7th class of programs, correct in every way, is believed to exist by a
few computer scientists.  However, no example could be found to include here."

sehr@sp20.csrd.uiuc.edu (David C. Sehr) (06/15/90)

Well, aside from the miserable English in my first posting, maybe I didn't
make my point clear.  I agree that memo functions don't preserve the
semantics of the language in the general case.  What I still fail to see is
why they wouldn't preserve the semantics in this case (i.e. either tak or fib).
The overwhelming majority of ``optimizing'' transformations don't preserve 
language semantics in the most general case (e.g. invariant hoisting fails 
with zero-trip loops, constant folding fails with finite precision arithmetic,
do to doall parallelization fails in the presence of pi-blocks, etc.).  A part
of all optimization is determining the legality of the transformation before 
it is applied.  Is there something obvious about this specific problem that 
I've missed?

David

jeff@aiai.ed.ac.uk (Jeff Dalton) (06/15/90)

What was missing from the original article was an explanation of why
the Prolog result was faster.  Without that, we can't learn all that
much from it.  Two examples may serve to illustrate this point.

I have done Lisp vs C comparisons for TAK in the past.  Some results
showed Franz Lisp being faster than C.  Lisp arithmetic can be as fast
as C, but it would be unusual to find that it was faster.  However,
this was on a VAX, and a little thought suggested that what was really
being shown was that Franz implemented procedure calls more efficiently.
Indeed, "local" functions in Franz are called by JSB (or whatever the
VAX instruction is) while the C call was using CALLS.

Of course, C could use JSB, so my result did not show that the Franz
Lisp language was faster than C, just that certain implementations on
certain machines could produce certain results.

A more fun example is that Kyoto Common Lisp was faster than C
on a Sun3.  This cries out for explanation, because KCL compiles
Lisp by translating it into C.

So this Prolog vs C result doesn't show that this Prolog is faster
than that C on arithmetic.  It might be procedure calls or something
else that accounts for it.  I at least would be interested in knowing
what and whether C could do it too.  Perhaps Richard O'Keefe is
right to suggest that it's "_really_ cheap" procedure calls.

However, before going on I think we ought to bear in mind that the
original article said:

  Disclaimer: this benchmark has a particularly easy translation to C.
  The result does not necessarily hold for other programs.  But don't
  you believe it anymore that Prolog is slow!

Ok, maybe Prolog is still slow at everything else; but it's clear
that no one was really trying to claim otherwise.

Jeff Dalton,                      JANET: J.Dalton@uk.ac.ed             
AI Applications Institute,        ARPA:  J.Dalton%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.             UUCP:  ...!ukc!ed.ac.uk!J.Dalton

tim@proton.amd.com (Tim Olson) (06/16/90)

In article <22078@megaron.cs.arizona.edu> debray@cs.arizona.edu (Saumya K. Debray) writes:
| merlyn@iwarp.intel.com (Randal Schwartz) writes:
| > The prolog version most likely saves results from tak(a,b,c) so that
| > they can do a quick lookup the next time around.
| 
| Look -- a program either does something, or it doesn't [*].  To say that
| it "most likely" does something is basically the same as saying 
| "I {didn't bother to | couldn't} understand the code, but here's a
| wild guess", except that it's not quite as honest.
| 
| For the benchmark in question, Van Roy posted both the C and the Prolog
| sources, so it's easy to verify whether or not there's any table lookup
| going on.  There isn't.

No, there is no *explicit* table lookup in the prolog version of the
benchmark, but that doesn't mean that there is no table lookup of
cached results going on in the depths of the Prolog implementation.
It would be a valid thing to do in a Prolog implementation to speed up
unification.

It is quite easy to follow a C program and see what is going on, since
there is a good correspondence between the language and the hardware.
For Prolog, however, (like Smalltalk, ML, etc.) there is usually
quite a lot hidden in the implementation that prevents a direct
comparison.

The argument here is that "unification result caching" changes the
benchmark significantly -- entire call trees are pruned from the
execution.


	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)