[net.ai] Efficiency of LISP

sasaki@harvard.ARPA (Marty Sasaki) (03/02/85)

There was an informal experiment done in the early-mid seventies at MIT
that compared MACLISP with other programming languages. Basically a
translator was written to translate (say) FORTRAN into MACLISP. Both
versions were compiled, and the programs were run and the results
compared. The translators were simple and did no real optimizing of the
code; all optimization was left to the Lisp compiler.

In every case the lisp versions ran faster and took up less memory. The
experiment was done on either a KA-10 or a KL-10. I remember being
amazed at the FORTRAN results. The programs being used were purely
computational ones, things like matrix handling and iterative modeling
simulations.

I don't remember much more. Could any of the principals shed further light?
-- 
			Marty Sasaki
			Harvard University Science Center
			sasaki@harvard.{arpa,uucp}
			617-495-1270

macrakis@harvard.ARPA (Stavros Macrakis) (03/12/85)

> > > an informal experiment ... at MIT ... compared MACLISP with other
> > > programming languages [by] translat[ing] FORTRAN into MACLISP. ...
> > > the lisp versions ran faster and took up less memory.
> [If] both compilers were doing their best at optimizing the generated
> code, it's clear that [that] MACLISP compiler was better than [that]
> FORTRAN compiler....  This [doesn't mean] LISP is better than FORTRAN
> for numerical work, [just] that LISP may not be as bad as some
> might think.  -- 	Guy Harris {seismo,ihnp4,allegra}!rlgvax!guy

I didn't make this measurement, but I was there.  The reason the DEC
Fortran code was worse than the Maclisp code was simply that the DEC
standard calling conventions (which Maclisp didn't use) were expensive.
Otherwise, the code was very similar.  Neither compiler produced
impressive code.  There is, of course, no inherent reason that the
inline code produced for numerical calculations in Lisp (assuming
machine arithmetic, not bignums) should be much worse than that produced
for Fortran, C, Pascal, or Basic.  Some technical reasons that a good
Maclisp compiler has to do a bit worse than a good Fortran compiler are
in the Appendix (below).

The BIG concession you make to get the efficiency, of course, is
programming with fixed-type operators (and/or declared variables) and,
to my taste, stepping out of the elegant core of Lisp (not that the
semantics of Lisp numbers was ever terribly clean).  Note, by the
the way, that the MacLisp system's consistency checking is rather
patchwork:  declarations are not processed at all interpretively, and
separately compiled modules are not checked for consistent declarations.
In other words, like many Lisp extensions, numeric processing is
pragmatically useful, but semantically a mess.

The other issue is whether we are talking about `Lisp' at all.  You
could equally well add such features to Snobol and claim Snobol did
efficient numerical processing.  The problem in saying that `Lisp' does
something is that if it doesn't, someone will add it and still call it
`Lisp'!  You might as well call Ada, `Algol'.  I'm all for improvement
over time, but it is unfair to pretend that Common Lisp '85 = MacLisp
'77 = Lisp 1.5.
	-s

Appendix: Some Technical Details

The two restrictions on numerical efficiency in the PDP-10 Maclisp
implementation are 1) that only 5 registers are available for numbers,
because the rest are stack pointers or garbage-collectable and 2) that
arrays can only be referenced indirectly, because they may be relocated
between any two instructions (Maclisp allows interrupts at arbitrary
points).  Consider a loop to take an inner product of two arrays, A and
B, declared as fixed length.  A good Fortran compiler would produce
machine code like the following:
  float a[len], b[len];
  register float *cura = &(a[0]), *curb = &(b[0]);
  register float sum = 0.0; register int count = len;
  while (count--) sum += *(cura++) * *(curb++);
while Maclisp is obligated to produce (at best) something like:
  float *(a,len), *(b,len);
  register int index = 0; register float sum = 0.0;
  while (index < len) { sum += *(a,index) * *(b,index); index++; }
where *(a,index) has the effect of a[index] but is indivisible--a is
an indirect word.

In DEC Fortran's case, its main efficiency problem seems to be that it
stashes all registers (even ones it doesn't use) over subroutine calls.
A better compiler could fix this without doing any damage to the
definition of Fortran.  I speculate that the callee-saves discipline was
chosen so that small machine-language library routines (sqrt etc.) could
be hand-tuned to save only the registers they care about while not
requiring that a table be kept of which routines clobber what registers.