MLWLG@CUNYVM.CUNY.EDU (11/23/88)
DEAR READER, I WOULD LIKE INFORMATION ON MEASURING THE LIPS OF A SYSTEM, I.E., LOGICAL INFERENCES PER SECOND. FOR EXAMPLE I HEAR THAT A PSI COMPUTER CAN RUN, SAY, AT 100K LIPS. IS THERE SOME STANDARD PROGRAM (I.E., PROLOG) THAT IS USED? ALSO, HOW DO LIPS COMPARE TO FLOPS? THANK YOU IN ADVANCE. SINCERELY, LARRY.
ok@quintus.uucp (Richard A. O'Keefe) (11/26/88)
In article <1740MLWLG@CUNYVM> MLWLG@CUNYVM.CUNY.EDU writes: >I WOULD LIKE INFORMATION ON MEASURING THE LIPS OF A SYSTEM, I.E., >LOGICAL INFERENCES PER SECOND. FOR EXAMPLE I HEAR THAT A PSI COMPUTER >CAN RUN, SAY, AT 100K LIPS. IS THERE SOME STANDARD PROGRAM (I.E., PROLOG) >THAT IS USED? Logical Inferences Per Second is a property of a _Prolog_ implementation, not of any other sort of system, and is defined by the Naive Reverse benchmark _only_. Divide 496 by the time in seconds to reverse a 30- element list the hard way, and you get LI/s. It has been fairly widely agreed for about the last 10 years that much better benchmarks are needed. A Motorola MC68020 running at Sun-3/50 speeds can get 150kLI/s (using native code). Surprise! the benchmark fits entirely into the on-chip cache... Two popular semi-realistic benchmarks use CHAT-80 and the Berkeley PLM compiler. >ALSO, HOW DO LIPS COMPARE TO FLOPS? About like apples and eggplants. You can have a high LI/s rating and a low FlOp/s rating (e.g. a 68020 with software floating point). You can have a low LI/s rating and a high FlOp/s rating (a Cray running a poor interpreter). The LI/s rating, for a good Prolog system, depends on memory bandwidth, integer operations (like bit tests and comparisons), and branches.
reiter@babbage.harvard.edu (Ehud Reiter) (11/29/88)
In article <1740MLWLG@CUNYVM> MLWLG@CUNYVM.CUNY.EDU writes: >I WOULD LIKE INFORMATION ON MEASURING THE LIPS OF A SYSTEM, I.E., >LOGICAL INFERENCES PER SECOND From a technical point of view, "LIPS" is a term used by PROLOG types to refer to the number of unifications per second their systems can perform when running a trivial program such as naive reverse. See, for example, Sterling&Shapiro, THE ART OF PROLOG, pg 48 and 203. I must say, though, that calling such a unification a "logical inference" ranks pretty high on my list of misleading names. If what you're really interested in is evaluating how good a computer system is at performing symbolic computations, I suggest you start by looking at Gabriel's book on benchmarking LISP, which contains a lot of thoughtful analysis on what is necessary for good performance on symbolic (and numeric) programs written in LISP. Ehud Reiter reiter@harvard (ARPA,BITNET,UUCP) reiter@harvard.harvard.EDU (new ARPA)
s8504867@mqcomp.oz (John Gardner) (11/30/88)
In article <746@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes: >In article <1740MLWLG@CUNYVM> MLWLG@CUNYVM.CUNY.EDU writes: >>THAT IS USED? > >Logical Inferences Per Second is a property of a _Prolog_ implementation, > Come off the grass. That's like saying MIPS are a property of C only. While it is nice to do benchmarks by only varying what you want to compare, it is certainly valid to calculate LIPS using any theorem prover, not just prolog. Do you think that all other theorem provers are incapable of logical inferences ? Do you think prolog is the only langauge availible for this sort of work ? ------------------------------------------------------------------------------ John Gardner ACSnet,Ean,CSnet: s8504867@mqcomp.mq.oz Computing Discipline Arpa: s8504867%mqcomp.mq.oz@UUNET.UU.NET Macquarie University 2109 Janet: s8504867%mqcomp.mq.oz@UK.AC.UKC Australia UUCP: {uunet,mcvax}!munnari!mqcomp.mq.oz!s8504867 [ Macquarie University - Where you are not just a number .. you are a letter followed by seven numbers. ]
rang@cpsin3.cps.msu.edu (Anton Rang) (12/02/88)
In article <595@mqcomp.oz>, John Gardner (s8504867@mqcomp.mq.oz) writes: >In article <746@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes: >>In article <1740MLWLG@CUNYVM> MLWLG@CUNYVM.CUNY.EDU writes: >>>THAT IS USED? >> >>Logical Inferences Per Second is a property of a _Prolog_ implementation, >> > >Come off the grass. That's like saying MIPS are a property of C >only. While it is nice to do benchmarks by only varying what you want to >compare, it is certainly valid to calculate LIPS using any theorem prover, >not just prolog. It's not valid to calculate and compare them using any theorem prover. The term "LIPS" as it is usually used refers to the internal workings of PROLOG (unification, etc.). While it would be possible to find LIPS with any theorem prover, the results couldn't be compared against PROLOG ones. The basic problem is the definition of a "logical inference". PROLOG has one; other languages would have others. MIPS is a property of a system, not a language, because you can always look at the system and count the number of instructions it's running! You can't look and just say "Oh, that's...346 logical inferences" for any given proof, because there are a very large number of ways to handle it. Even PROLOG people don't always agree on the LIPS count for a program. The "LIPS" term is misleading, but it *is* a property of PROLOG (at least, as it is currently used). +---------------------------+------------------------+----------------------+ | Anton Rang (grad student) | "VMS Forever!" | "Do worry...be SAD!" | | Michigan State University | rang@cpswh.cps.msu.edu | | +---------------------------+------------------------+----------------------+
ok@quintus.uucp (Richard A. O'Keefe) (12/02/88)
In article <595@mqcomp.oz> s8504867@mqcomp.mq.oz (John Gardner) writes: >In article <746@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes: >>Logical Inferences Per Second is a property of a _Prolog_ implementation, >Come off the grass. That's like saying MIPS are a property of C >only. While it is nice to do benchmarks by only varying what you want to >compare, it is certainly valid to calculate LIPS using any theorem prover, >not just prolog. Do you think that all other theorem provers are incapable >of logical inferences ? Do you think prolog is the only langauge availible >for this sort of work ? Stop knocking down straw men; you'll only get straw in your hair and then what will the neighbours think? It is no more "valid to calculate LIPS using any theorem prover" than it is to calculate Dhrystones "using any program". Remember: the meaning of a compound noun is *not* a simple composition of the meanings of the words it is made from. Is a foot-hill a hill made of feet? Is a benchmark a mark on a bench? (Not now.) Yes, other programs are theorem provers capable of drawing logical inferences. Prolog is a pretty weak theorem prover, that's _why_ it is usable as a programming language. If you compare the number of resolutions per second in a good theorem prover (say Markgraf Karl, or ITP) with the kind of LIPS rating a good Prolog should get (a) the theorem prover would look _terrible_, and (b) you would learn nothing of interest. In fact, you don't learn a whole lot comparing the LIPS rating of two Prolog systems, either. Run some tests and you can get quite a few more procedure calls per second than the LIPS rating; run some others and you can get far fewer. We should be trying to get rid of "LIPS", not trying to spread the disease. For comparing theorem provers in general, there's a book of examples from one of the Argonne crowd which might be useful for benchmarking. Cpu and wall time in seconds to solve each of those problems would be much more illuminating than a single figure which favours depth-first search.
mac3n@babbage.acc.virginia.edu (Alex Colvin) (12/02/88)
In various articles > The "LIPS" term is misleading, but it *is* a property of PROLOG (at > least, as it is currently used). it's true, it's even pretty much restricted to nai:ve reverse. think of LIPS as comparable to Dhrystones rather than MIPS. bad terminology. should've been called something like NRPS or Lhipstones.
jec@iuvax.cs.indiana.edu (James E. Conley) (12/03/88)
MIPS and LIPS have a similar problem-- that you may not be counting the same sorts of instructions or logical inferences. The VAX instruction to add two integers and the MIPS instruction really don't do the same thing and similarly a logical inference has the same sort of problem-- even if you are dealing with the same system (Prolog in this case). This gets worse if you start comparing other systems which do their inferences in a different manner (production systems like YAPS and OPS5 come to mind). As usual, benchmarking using LIPS will get you a number with no solid meaning (as with MIPS). III Usenet: iuvax!jec UUU I UUU ARPANet: jec@iuvax.cs.indiana.edu U I U Phone: (812) 855-7729 U I U U.S. Mail: James E. Conley U I U Indiana University UUUIUUU Dept. of Computer Science I 004 Lindley Hall III Bloomington, IN. 47405
jwmills@iuvax.cs.indiana.edu (Jonathan Mills) (12/03/88)
To expand a little on what Richard has said (and I agree with him) LIPS depend heavily on the ability to perform the following fragment of Warren abstract machine code (the inner loop of list concatenate - see An Abstract Prolog Instruction Set by D.H.D. Warren, SRI Tech Note 309, October 1983): conc/3: switch_on_term C1a, C1, C2, fail <variable & nil code omitted> C2: get_list A1 % conc( [ unify_variable X4 % X| unify_variable A1 % L1], get_variable X2,A2 % L2, <omit because X2 = A2> get_list A3 % [ unify_value X4 % X| unify_variable A3 % L3] ) :- execute conc/3 % conc( L1, L2, L3 ). Because the functionality of this instruction sequence could be duplicated by writing it in assembler, Pascal, C, LISP, or whatever, the "LIPS" for ANY language or machine can be obtained... and be just as meaningful as you like. Or as meaningless. LIPS bear little relation to inferences in the rest of the system for other reasons: smart Prolog compilers look for concatenate-like operations, and produce code optimized for it (so LIPS may vary between NREV and any other program by a factor of 2 to 10). And if one is willing to accept as LIPS the rate at which a Prolog can do procedure calls, unbelievably high (and non-representative) speeds can be claimed. (For example, benchmarking "a :- a", which reduces to "a: goto a" in assembly language produces a LIPS rate *approximately* equal to the MIPS rate of your CPU). And I am still embarrassed to recall a version of NREV on a DPS-88 that fit both the program and the list into cache... and ran at 2.7 MegaLIPS...and dropped to 700 KLIPS when the list was made too large for the cache. Live and learn. Theorem provers and Prologs can be rated for LIPS, but again, this has little relation to the power of the system. Proving Sam's Lemma takes ITP (an interactive theorem prover written in Pascal running on a VAX 11-780) 16,000 seconds, ALS Prolog (running on an 88000 ANGELFIRE 1) 150 seconds, and OTTER (a batch theorem prover written in C running on a VAX 8800) 29 seconds. All use the set of support strategy, and generate roughly the same number of clauses, so the number of logical inferences are similar - but this tells little about what makes the programs' execution times vary (aside from the CPU, of course). Because rewriting operations (subsumption, demodulation, paramodulation), clause selection (indexing), unification, and clause integration (asserts) all interact to determine how many clauses are generated (i.e., how many inferences are *completed*), it is not an easy task to give a single "inference rate". Indeed, it would be NAIVE to do so - LIPS are now part of the *Prolog* mythos, and should be left there, as firmly embedded as is TAK in the Lisp community. If anyone out there would undertake Prolog benchmarks analogous to Gabriel's benchmarks for Lisp, we would all benefit. At least we could attempt to justify our prejudices with better numbers! (:-) And when I have better numbers for the behavior of various theorem provers under various constraints over a wide variety of domains, I'll post them. Don't expect to see anything before 1990. (ITP is available from NAG (Numerical Algorithms Group), and OTTER is available from Argonne National Lab, try "mccune@mcs.anl.gov". I can e-mail a compiled version of Sam's Lemma (useful as a benchmark) to interested folks, but the FOL compiler itself is a ways off yet.)