[comp.arch] Workstations for Lisp

ram@wb1.cs.cmu.edu (Rob MacLachlan) (06/11/89)

I am a Lisp systems programmer for the CMU Common Lisp project.  I am
currently working on a new Common Lisp compiler.  I have several comments to
make about evaluating a system for Lisp, but I generally won't come down on
one side or the other of the SPARC/MIPS question, since I really don't know
the important system details.

The first thing I will say, and I believe the most important, is *consider
the non-processor issues*.  I say this up front, since I suspect this sort
of thing will get short shrift in the following flamage.  The biggest
determinant of performance of many Lisp systems is memory system performance:
 -- How much memory can you put in?  If not 16 meg+, forget it for serious
    work.
 -- How much memory can you afford to put in?
 -- How fast are the disks?  Even with lots of memory you will page.
 -- How big is the cache?


As to the specific RISC vs. CISC question for Lisp, I strongly favor a RISC
approach.  In addition to the usual RISC arguments, there are several
Lisp-specific factors that favor RISC:
 -- Complex addressing modes are difficult or impossible to use due to the
    way Lisp data structures are laid out (with indirections everywhere.)
 -- Complex call instructions not designed for Lisp are rarely useful for
    Lisp.  Generally Lisp call sequences for CISCs ignore hairy call features
    and use multiple simple instructions.

The general idea is that the complex features of widely available CISC
architectures aren't designed to support Lisp, and therefore are rarely
useful for Lisp.  You end up using the subset of the instruction set that
resembles a RISC.
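
To make the "indirections everywhere" point concrete, here is a minimal
sketch of list traversal over tagged pointers.  The tag values, the NIL
encoding, and the toy word-array heap are all assumptions made for
illustration; they are not the layout of CMU Common Lisp or any other real
system.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed representation: 32-bit words, low two bits are a type tag. */
typedef uint32_t lispobj;
#define TAG_MASK 0x3u
#define TAG_CONS 0x1u   /* hypothetical tag for a pointer to a cons cell */
#define NIL      0x3u   /* hypothetical end-of-list marker (not a cons) */

static uint32_t heap[64];   /* toy heap, addressed by byte offset */
static uint32_t heap_top;   /* next free byte offset, always a multiple of 8 */

/* Allocate a two-word cell and return its byte offset with the cons tag. */
static lispobj cons(lispobj car_val, lispobj cdr_val)
{
    assert(heap_top + 8 <= sizeof heap);
    uint32_t addr = heap_top;
    heap[addr / 4] = car_val;
    heap[addr / 4 + 1] = cdr_val;
    heap_top += 8;
    return addr | TAG_CONS;
}

/* Each accessor is a single load at a small constant offset from the
   tagged pointer (offset -1 for the car, +3 for the cdr). */
static lispobj car(lispobj x)
{
    assert((x & TAG_MASK) == TAG_CONS);
    return heap[(x - TAG_CONS) / 4];
}

static lispobj cdr(lispobj x)
{
    assert((x & TAG_MASK) == TAG_CONS);
    return heap[(x - TAG_CONS) / 4 + 1];
}

/* Every step must test the tag of the word just loaded before the next
   load can be issued. */
static unsigned list_length(lispobj x)
{
    unsigned n = 0;
    while ((x & TAG_MASK) == TAG_CONS) {
        n++;
        x = cdr(x);
    }
    return n;
}
```

In this sketch every access is base-plus-small-displacement followed by a
tag test, which is the kind of code that gets no benefit from multi-level
indirect or scaled-index addressing modes.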

As to SPARC vs. MIPS:

Pro SPARC:
 -- Has special tagged +/- instructions for short integer arithmetic.  This
    can speed up these operations in the absence of appropriate FIXNUM
    declarations.  Integer arithmetic is important in many (but not all)
    symbolic applications for vector indices, loop counters, etc.
    Even if your program doesn't use integers much, the underlying Lisp
    run-time system uses them for things such as I/O, hashtables, etc.  Of
    course, the system code *should* have the right declarations, so tagged
    operations shouldn't have a big effect on system performance.
 -- Has register windows, which *potentially* are a win for Lisp
    call/return, since the main competing technology (global register
    allocation) works poorly with Lisp's run-time function linkage.
    In practice, I don't know how well SPARC's windows mesh with Lisp
    call-sequence requirements.
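
As a rough illustration of what a tagged add saves, the sketch below spells
out in C the checks that generic fixnum addition needs when no FIXNUM
declaration is present.  The representation (32-bit words, low two bits zero
for a fixnum) and all of the names are assumptions for the example.  As I
understand it, the SPARC tagged add performs the tag test and the overflow
test as part of the add itself, so the explicit tests here stand for the
extra instructions a machine without such an operation would execute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed representation: fixnum n is stored as n << 2, so tag bits are 00. */
typedef uint32_t lispobj;
#define TAG_MASK 0x3u

static lispobj make_fixnum(int32_t n)
{
    assert(n >= -(1 << 29) && n < (1 << 29));   /* 30-bit signed range */
    return (lispobj)n << 2;                     /* unsigned shift, well defined */
}

static int32_t fixnum_value(lispobj x)
{
    /* Reinterpret as signed without relying on implementation-defined
       conversions; the value is a multiple of 4, so the division is exact. */
    int64_t v = (x & 0x80000000u) ? (int64_t)x - 4294967296LL : (int64_t)x;
    return (int32_t)(v / 4);
}

/* Returns true and stores the sum if both operands are fixnums and the
   result fits; returns false so the caller can fall back to generic
   (bignum, float, error) arithmetic. */
static bool tagged_add(lispobj a, lispobj b, lispobj *sum)
{
    if (((a | b) & TAG_MASK) != 0)
        return false;                 /* some operand is not a fixnum */
    uint32_t r = a + b;               /* wraps modulo 2^32 */
    /* Signed overflow: operands have the same sign, result has the other. */
    if (((a ^ r) & (b ^ r)) & 0x80000000u)
        return false;
    *sum = r;
    return true;
}
```

Because the tag bits are zero, adding two tagged fixnums directly yields the
correctly tagged sum; only the two checks are overhead.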

Pro MIPS:
 -- Has lots of registers, so:
    - It is easier to successfully allocate locals in registers.
    - Important globals of the run-time system can be kept in registers.  A
      Lisp system may have three stack pointers, several heap pointers, a
      couple registers for frame pointers, etc.

I am primarily a compiler writer, so perhaps it is not surprising that I
somewhat favor MIPS: it gives me more room (registers) to play with, without
preempting any design decisions.  Also, I am interested in global register
allocation algorithms and "block compilation" of programs (resolving
function references at compile time).

On the other hand, if I were designing an architecture for Lisp, I would
certainly make checked fixnum arithmetic "free", and I would also seriously
consider using register windows.  Note that I am not convinced register
windows are necessarily a good thing for Lisp.  Studies of non-Lisp program
performance are not very relevant, since Lisp functions are clearly
statistically different in some ways:
 -- Functions tend to be smaller.
 -- Recursion is more common.

I suspect that small or variable-sized windows would be a win for Lisp.
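
One way to explore how call depth interacts with window count is a toy
simulation like the one below.  It is only a sketch under simplifying
assumptions: fixed-size windows, one window per frame, one window spilled or
refilled per trap, and no window reserved for the trap handler.  The trace
encoding and the names are made up for the example and do not model SPARC in
detail.

```c
#include <assert.h>
#include <stddef.h>

struct window_stats {
    unsigned overflows;    /* traps taken on call: oldest window spilled */
    unsigned underflows;   /* traps taken on return: caller's window refilled */
};

/* trace[i] > 0 is a call, trace[i] < 0 is a return, 0 is ignored. */
static struct window_stats simulate_windows(const int *trace, size_t n,
                                            int nwindows)
{
    struct window_stats st = {0, 0};
    int depth = 0;    /* current frame number */
    int lowest = 0;   /* shallowest frame still held in the register file */

    assert(nwindows >= 1);
    for (size_t i = 0; i < n; i++) {
        if (trace[i] > 0) {
            depth++;
            if (depth - lowest >= nwindows) {   /* no free window left */
                st.overflows++;
                lowest++;
            }
        } else if (trace[i] < 0) {
            assert(depth > 0);
            depth--;
            if (depth < lowest) {               /* caller was spilled earlier */
                st.underflows++;
                lowest--;
            }
        }
    }
    return st;
}
```

In this model a program that oscillates within a few frames never traps,
while a recursion that runs deeper than the window count pays one trap per
extra frame in each direction.  Feeding it call traces from real Lisp
programs would be the way to test the guess about small windows.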

  Rob
--

dpm@cs.cmu.edu (David Maynard) (06/12/89)

ram@wb1.cs.cmu.edu (Rob MacLachlan) writes:
>  -- How fast are the disks?  Even with lots of memory you will page.

I don't have direct experience, but recent Sun-Spots (comp.sys.sun)
articles have indicated that the SparcStation is significantly better
at disk I/O than the DECStation 3100.  (Gee, could that DMA be paying
off?)

However, these same articles indicate that the 3100 can be
significantly better for number crunching.  I suspect that part of the
difference here is the "quality" of the math libraries.  It seems to
take more work to make a Sun run math codes faster.  I'm sure the 3100
does have a raw speed advantage, I'm just not sure it is as great as
some people have reported.

I believe that more AI software is currently available for the Suns.
Sun has had longer to build an AI base.  It is likely that DEC will
try to close this gap though.

 ---
 David P. Maynard (dpm@cs.cmu.edu)
 Dept. of Electrical and Computer Engineering, Carnegie Mellon University
 ---
 These are my opinions.  I haven't asked CMU what our official opinion is.

khb@chiba.Sun.COM (chiba) (06/13/89)

In article <EYYf=ky00jcqNYcW4L@cs.cmu.edu> dpm@cs.cmu.edu (David Maynard) writes:
>ram@wb1.cs.cmu.edu (Rob MacLachlan) writes:
>>  -- How fast are the disks?  Even with lots of memory you will page.
>
>I don't have direct experience, but recent Sun-Spots (comp.sys.sun)
>articles have indicated that the SparcStation is significantly better
>at disk I/O than the DECStation 3100.  (Gee, could that DMA be paying
>off?)
>
>However, these same articles indicate that the 3100 can be
>significantly better for number crunching.  I suspect that part of the
>difference here is the "quality" of the math libraries.  It seems to
>take more work to make a Sun run math codes faster.  I'm sure the 3100
>does have a raw speed advantage, I'm just not sure it is as great as
>some people have reported.

Math performance is partially due to design choices in the libraries.
One of the key questions (for C) is which standard ... ANSI, SVID,
K&R, Posix ... all require slightly different answers to certain
cases. Sun's arithmetic group is (perhaps) too concerned with getting
the correct answer ... seymour has proved that folks prefer fast to
accurate :>

>
>I believe that more AI software is currently available for the Suns.
>Sun has had longer to build an AI base.  It is likely that DEC will
>try to close this gap though.

When there is a DEC sponsored LISP, we can try to determine real
performance figures. I am not very AI oriented, but in my limited AI
experience IO dominates in most "real" systems (as pointed out above)
so I expect the SS330 to perform better on some reasonable set of
application sized codes.

After IO, the next performance "feature" of AI codes is probably
memory subsystem speed. The current MIPS vs. SPARC implementation key
difference is cycles for ld/sto ... MIPS is faster; but this results
in many stalls on the 3100 (vs. say, the MIPS M2000 with its 4-deep
buffered write thru cache) ... the SS330 has enough buffering that
stores tend not to lock up ... as best I can remember, the DS3100 has
write thru, no buffering ... so loads should be faster, stores slower.

After those, the issues of tagged arithmetic, and register windows vs.
shared pool probably kick in.

My experience is that IO vastly dominates these second order effects,
on "real AI applications".

> ---
> These are my opinions.  I haven't asked CMU what our official opinion is.

ditto. I haven't even asked Sun what our opinion is.

Keith H. Bierman      |*My thoughts are my own. Only my work belongs to Sun*
It's Not My Fault     |	Marketing Technical Specialist    ! kbierman@sun.com
I Voted for Bill &    |   Languages and Performance Tools. 
Opus  (* strange as it may seem, I do more engineering now     *)

mash@mips.COM (John Mashey) (06/13/89)

In article <109577@sun.Eng.Sun.COM> khb@sun.UUCP (chiba) writes:

>After IO, the next performance "feature" of AI codes is probably
>memory subsystem speed. The current MIPS vs. SPARC implementation key
>difference is cycles for ld/sto ... MIPS is faster; but this results
>in many stalls on the 3100 (vs. say, the MIPS M2000 with its 4-deep
>buffered write thru cache) ... the SS330 has enough buffering that
>stores tend not to lock up ... as best I can remember, the DS3100 has
>write thru, no buffering ... so loads should be faster, stores slower.

Most R2000 or R3000 systems, DS3100 included, use 4-deep write-buffers,
often built with R2020 Write Buffers.
Unless I misread the Sun info (which I don't have handy), both SS1 and
SS330 use write-thru caches with a 1-deep write buffer: please correct
if this is wrong.
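
For readers who want to see why buffer depth matters with a write-through
cache, here is a toy model.  It assumes one instruction issues per cycle,
memory retires buffered writes one at a time, and each write occupies memory
for a fixed number of cycles.  The 4-cycle figure used below is an arbitrary
assumption, not a measurement of any of the machines discussed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 16

/* Returns the cycles the processor spends stalled because the write buffer
   is full.  is_store[i] is nonzero if instruction i is a store. */
static unsigned long store_stall_cycles(const unsigned char *is_store,
                                        size_t n, unsigned depth,
                                        unsigned write_cycles)
{
    unsigned long done[MAX_DEPTH];   /* completion times, circular queue */
    unsigned head = 0, count = 0;
    unsigned long now = 0, last_done = 0, stalls = 0;

    assert(depth >= 1 && depth <= MAX_DEPTH);
    for (size_t i = 0; i < n; i++) {
        now++;                                   /* one cycle per instruction */
        if (!is_store[i])
            continue;
        /* Drop entries that memory has already finished writing. */
        while (count > 0 && done[head] <= now) {
            head = (head + 1) % MAX_DEPTH;
            count--;
        }
        if (count == depth) {                    /* full: wait for the oldest */
            stalls += done[head] - now;
            now = done[head];
            head = (head + 1) % MAX_DEPTH;
            count--;
        }
        /* The new write starts once memory is free. */
        unsigned long start = last_done > now ? last_done : now;
        last_done = start + write_cycles;
        done[(head + count) % MAX_DEPTH] = last_done;
        count++;
    }
    return stalls;
}
```

With four back-to-back stores and a 4-cycle write, this model stalls for
9 cycles with a 1-deep buffer and not at all with a 4-deep buffer.  A longer
burst eventually fills any buffer, so depth only hides bursts, not sustained
store bandwidth.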
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

khb@chiba.Sun.COM (Keith Bierman - SPD Languages Marketing -- MTS) (06/14/89)

In article <21525@winchester.mips.COM> mash@mips.COM (John Mashey) writes:

>
>Most R2000 or R3000 systems, DS3100 included, use 4-deep write-buffers,
>often built with R2020 Write Buffers.

I stand corrected. Now all I have to do is understand why our 3100
performance does not scale trivially to our M2000 .... suggestions ?
Memory system had seemed so attractive as an explanation ....

>Unless I misread the Sun info (which I don't have handy), both SS1 and
>SS330 use write-thru caches with a 1-deep write buffer: please correct
>if this is wrong.

Reverse engineering (i.e. looking at the performance on the same
codes) one can see the SS1 stalling much more often than stingray.
Clearly my memory is faulty (or I would not have misquoted the
MIPS/DEC lit :>).. but my recollection is that 4/330 has a double word
of buffering.

Sad to say, I must confess to not having the lit handy. I will check
and if no one beats me to the correction, I will post one (if needed). :>

cheers


mash@mips.COM (John Mashey) (06/14/89)

In article <109849@sun.Eng.Sun.COM> khb@sun.UUCP (Keith Bierman - SPD Languages Marketing -- MTS) writes:
>In article <21525@winchester.mips.COM> mash@mips.COM (John Mashey) writes:
>>Most R2000 or R3000 systems, DS3100 included, use 4-deep write-buffers,
>>often built with R2020 Write Buffers.

>I stand corrected. Now all I have to do is understand why our 3100
>performance does not scale trivially to our M2000 .... suggestions ?
>Memory system had seemed so attractive as an explanation ....
The memory systems are very different.  R2000-based machines use the
simplest-possible cache, i.e., write-thru with 1-word refills on miss,
and cache-word invalidates on partial-word writes.
R3000s are much more complicated, as they can do anywhere from 1 to 32
words/block burst refill, instruction streaming, direct drive of the 
cache rams, optional read-modify-write instead of invalidate for
partial-word writes; they typically have memory systems with more
interleaving, page-mode DRAMs, etc, etc.

>Reverse engineering (i.e. looking at the performance on the same
>codes) one can see the SS1 stalling much more often than stingray.
>Clearly my memory is faulty (or I would not have misquoted the
>MIPS/DEC lit :>).. but my recollection is that 4/330 has a double word
>of buffering.
Oops, I recall seeing that also, but I think I was thinking it was
1 doubleword, rather than 2 32-bit words. Now, I don't know which.

khb@chiba.Sun.COM (chiba) (06/14/89)

In article <21641@winchester.mips.COM> mash@mips.COM (John Mashey) writes:

>>MIPS/DEC lit :>).. but my recollection is that 4/330 has a double word
>>of buffering.
>Oops, I recall seeing that also, but I think I was thinking it was
>1 doubleword, rather than 2 32-bit words. Now, I don't know which.
>-- 

Works either way. The SPARC store-double instruction "fits", as do two
single stores.
