[comp.sys.m88k] DG's use of m88k

despoix@imag.imag.fr (Frederic DESPOIX) (02/20/90)

Newsgroups: comp.sys.m88k
Subject: DG's use of m88k
Reply-To: despoix@imag.fr (Frederic DESPOIX)
Distribution: world
Organization: IMAG-LGI, University of Grenoble, France

	I am sure most of you have read Alan LOVEJOY's (alan@oz.nm.paradyne.com)
latest posting of SPEC benchmark results (see comp.arch, "Latest SPECmarks").
Since I will soon be using a DG 4000, I would like DG specialists to give me
reasons why the DG AV 5010 server has such BAD results compared with the MIPS
RC3240, for example, and with other m88k-based machines like the Motorola
Delta 8864SP.
One reason someone gave us here was the VERY SMALL cache (2 x 16 KB). I must
admit I can't understand why it isn't larger. Any other reasons?

Allo, rtp ?

fred.
-------------------------------------------------------------------------------
From: alan@oz.nm.paradyne.com (Alan Lovejoy)
Newsgroups: comp.arch
Subject: Latest SPECmarks
Message-ID: <7377@pdn.paradyne.com>
Date: 12 Feb 90 16:23:51 GMT
References: <8859@portia.Stanford.EDU> <5190@convex.convex.com> <1850@cbnewsi.ATT.COM> <2938@oakhill.UUCP> <3085@rtmvax.UUCP>
Organization: AT&T Paradyne, Largo, Florida

In article <3085@rtmvax.UUCP> wbeebe@rtmvax.UUCP (Bill Beebe) writes:
>In article <2938@oakhill.UUCP> davet@oakhill.UUCP (David Trissel) writes:
>Something else that's interesting. In the February 7th Microprocessor
>Report, page 4, under new SPEC numbers, a Moto system with a 33 MHz 88K came
>up with a 17.8 SPECmark. Congratulations. However, the article goes on to
>note that the 88K SPECmark was only 1% over the SPARC's 17.6 SPECmark (as
>well as the MIPS). I would be most interested to see SPECmarks for the
>Heurikon board (or any other system) running the 040 at 25 MHz or even 33
>MHz.

                        SPECmark gcc   espresso spice doduc nasa7 li
Moto Delta Model 8612   17.8     18.3  23.0     14.8  12.2  17.5  23.9
MIPS M/2000             17.6     19.1  18.3     12.1  17.6  18.4  23.8
SPARCserver 490         17.6     21.4  16.5     16.4  14.0  19.9  19.5
MIPS RC3260             17.3     18.5  18.0     11.9  17.3  18.2  23.3
Solbourne 5/801         16.3     19.5  16.3     14.9  11.6  16.1  18.2
MIPS RC3240             16.0     15.5  17.7     12.1  15.9  18.1  20.4
Moto Delta 8864SP
 Departmental Comp.     15.2     17.5  19.4     12.5  10.1  15.2  20.7
HP Apollo DN 10010      13.9     12.5  12.9     11.8  23.0  20.4  6.7
DG AV 6200 Server       12.7     13.4  17.1     11.0  9.2   11.9  17.5
Moto Delta 8864SP       12.2     14.0  15.5     10.0  8.1   12.2  16.5
SPARCstation 330        11.8     13.8  11.6     11.4  9.5   14.0  11.2
DECsystem 5400          11.3     10.9  13.4     8.9   9.7   12.6  13.3
DG AV 5010 Server       10.1     10.9  13.3     8.7   7.1   9.5   13.8
DG AV 310 WKS           9.7      9.9   13.1     8.3   6.9   9.3   13.5
DEC VAX 6000 M450       9.2      5.1   6.5      6.2   6.9   29.1  7.2
SPARCstation 1          8.4      10.7  8.9      8.2   5.0   10.2  9.0
DEC VAX 6000 M410       6.8      5.1   6.5      6.2   6.9   8.2   7.2


			SPECmark eqntott matrix300 fpppp tomcatv  CPU/MHz
Moto Delta Model 8612   17.8     20.7    21.5      15.3  14.9     88000/33MHz
MIPS M/2000             17.6     18.3    13.3      20.4  17.7     R3000/25MHz
SPARCserver 490         17.6     17.6    22.5      18.8  12.3     SPARC/33MHz
MIPS RC3260             17.3     17.9    13.1      20.0  17.3     R3000/25MHz
Solbourne 5/801         16.3     17.2    22.6      17.9  11.8     SPARC/33MHz
MIPS RC3240             16.0     17.1    13.8      17.8  13.9     R3000/??
Moto Delta 8864SP
 Departmental Comp.     15.2     16.0    18.4      14.7  11.6     88000/??
HP Apollo DN 10010      13.9     7.8     9.4       31.4  19.9     PRISM/??
DG AV 6200 Server       12.7     13.6    14.1      10.9  11.0     88000/??
Moto Delta 8864SP       12.2     12.8    14.7      11.7  9.3      88000/??
SPARCstation 330        11.8     12.6    14.7      13.1  8.2      SPARC/??
DECsystem 5400          11.3     12.8    10.1      12.5  10.1     R?000/??
DG AV 5010 Server       10.1     10.7    11.1      8.7   8.9      88000/??
DG AV 310 WKS           9.7      10.5    10.9      8.3   8.3      88000/??
DEC VAX 6000 M450       9.2      6.7     13.3      7.5   20.9     VAX/??
SPARCstation 1          8.4      9.7     11.0      7.8   6.0      SPARC/??
DEC VAX 6000 M410       6.8      6.7     6.5       7.5   7.4      VAX/??


Benchmark numbers are relative to VAX 11/780 = 1.0, so higher numbers are
better.  The programs gcc, espresso, spice, doduc, nasa7, li, eqntott,
matrix300, fpppp and tomcatv are real programs, not synthetic or toy codes.
The SPECmark is the geometric mean of the ten benchmark results (the 10th
root of the product of the ten ratios).
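
For concreteness, here is a small C sketch (my own illustration, not SPEC's
tooling) that computes a SPECmark as the geometric mean of ten ratios.  Fed
the Moto Delta Model 8612 row from the tables above, it prints 17.8:

    #include <math.h>
    #include <stdio.h>

    /* Geometric mean of n benchmark ratios: the nth root of their
       product, computed via logarithms to avoid overflow. */
    double specmark(const double *r, int n)
    {
        double logsum = 0.0;
        int i;
        for (i = 0; i < n; i++)
            logsum += log(r[i]);
        return exp(logsum / n);
    }

    int main(void)
    {
        /* The ten ratios for the Moto Delta Model 8612 (tables above). */
        double r[10] = { 18.3, 23.0, 14.8, 12.2, 17.5, 23.9,
                         20.7, 21.5, 15.3, 14.9 };
        printf("SPECmark = %.1f\n", specmark(r, 10));
        return 0;
    }

(Compile with "cc specmark.c -lm".)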

An examination of the numbers would lead one to conclude that the benchmarks
are heavily affected by the system as a whole, not just the processor.  Such
things as cache size/speed, memory speed and disk access speed must contribute
significantly to the results.

Also, the 88k systems typically have 1/4 as much cache as some of the competing
Rx000 and SPARC systems (32 KB versus 128 KB), even though the 88k is quite
capable of supporting 128 KB of cache.

If someone could supply more information about each benchmark, and each
system (cache size, CPU clock rate, etc), that might prove enlightening.

____"Congress shall have the power to prohibit speech offensive to Congress"____
Alan Lovejoy; alan@pdn; 813-530-2211; AT&T Paradyne: 8550 Ulmerton, Largo, FL.
Disclaimer: I do not speak for AT&T Paradyne.  They do not speak for me. 
Mottos:  << Many are cold, but few are frozen. >>     << Frigido, ergo sum. >>
___________________________________________________________________________
 DESPOIX Frederic,                              despoix@imag.fr
 LGI-DU, B.P. 53x                               despoix@imag
 38041 GRENOBLE cedex, FRANCE                   uunet.uu.net!imag!despoix
 Tel : (33)76.51.46.00 ext. 51.43           /------------------------------
 Fax : (33)76.44.66.75                      | NB : Usual disclaimer applies

tom@ssd.csd.harris.com (Tom Horsley) (02/22/90)

In article <7521@imag.imag.fr> despoix@imag.imag.fr (Frederic DESPOIX) writes:
>  Benchmark numbers are relative to VAX 11/780 = 1.0, so higher numbers are
>  better.  The programs gcc, espresso, spice, doduc, nasa7, li, eqntott,
>  matrix300, fpppp and tomcatv are real programs, not synthetic or toy codes.
>  The SPECmark is the geometric mean of the ten benchmark results (the 10th
>  root of the product of the ten ratios).

If you believe that all these benchmarks are real programs and not toys, you
have been reading SPEC literature, not the benchmark source.  Matrix300
spends 99.99% of its cycles in a single line of code; even Whetstone is a
better benchmark than this. Espresso is not much better, spending 90% of its
time in the compare routine it passes to qsort().
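
For anyone who has not profiled this kind of code, here is a minimal,
hypothetical C sketch (not the benchmark's actual source) of the pattern
being described: qsort() invokes the user-supplied comparison routine on
the order of n log n times, so on a large input a profiler charges nearly
all of the sort's time to that one small function.

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical record type, for illustration only. */
    struct rec {
        int key[8];
    };

    /* This is where the cycles go: qsort() calls it once per
       comparison, i.e. roughly n log n times in total. */
    static int cmp_rec(const void *a, const void *b)
    {
        const struct rec *ra = a;
        const struct rec *rb = b;
        return memcmp(ra->key, rb->key, sizeof ra->key);
    }

    void sort_recs(struct rec *r, size_t n)
    {
        qsort(r, n, sizeof *r, cmp_rec);
    }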

>  If someone could supply more information about each benchmark, and each
>  system (cache size, CPU clock rate, etc), that might prove enlightening.

There are lots of different clock rates/cache sizes/compilers out there.
This is the information you are supposed to pay hundreds of dollars to get
from SPEC.

One thing that is worth noting is that ALL of the floating point benchmarks
are strictly double precision, and double precision is somewhat slow on the
88100 implementation of the architecture because its internal data paths are
only 32 bits wide.

There are quite a lot of application areas that do not need double
precision (graphics processing is often a good example: when you only need
to map images onto a screen 1000 pixels wide, double precision is overkill).
The SPEC benchmark suite needs to have at least one or two single precision
benchmarks added to give it a better balance. (I am quite aware that there
are also applications requiring double precision; I just want to see some
single precision numbers in SPEC, so don't tell me I claimed no one needs
double precision.)
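
To make the point concrete, here is a small C sketch (my illustration, not
SPEC code) of the same kernel in both precisions.  On a part like the 88100,
which moves operands over 32-bit internal paths, the double version has
twice as many bits to move per element:

    /* Double-precision kernel: the style of inner loop the SPEC
       floating point codes spend their time in. */
    void daxpy(double *y, const double *x, double a, int n)
    {
        int i;
        for (i = 0; i < n; i++)
            y[i] += a * x[i];
    }

    /* Single-precision variant: ample range and precision for, say,
       mapping coordinates onto a screen 1000 pixels wide. */
    void saxpy(float *y, const float *x, float a, int n)
    {
        int i;
        for (i = 0; i < n; i++)
            y[i] += a * x[i];
    }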
--
=====================================================================
domain: tahorsley@ssd.csd.harris.com  USMail: Tom Horsley
  uucp: ...!novavax!hcx1!tahorsley            511 Kingbird Circle
      or  ...!uunet!hcx1!tahorsley            Delray Beach, FL  33444
======================== Aging: Just say no! ========================

tom@ssd.csd.harris.com (Tom Horsley) (02/22/90)

In article <TOM.90Feb21110800@hcx2.ssd.csd.harris.com> tom@ssd.csd.harris.com (Tom Horsley) writes:
>                              Espresso is not much better, spending 90% of its
                               ^^^^^^^^
>  time in the compare routine it passes to qsort().

Stop. Don't send me mail, I know. Espresso doesn't call qsort(). I meant to
write that *eqntott* spends 90% of its time in the compare routine.

While I am babbling, I might as well throw in a complaint about the xlisp
benchmark as well. This would be a much better benchmark (at least for
evaluating the compiler used to build xlisp) if the lisp program the
benchmark executes exercised more of the xlisp interpreter. Xlisp is indeed
a real program, but the benchmark only executes small pieces of it because
the input data is a toy lisp program.
--
=====================================================================
domain: tahorsley@ssd.csd.harris.com  USMail: Tom Horsley
  uucp: ...!novavax!hcx1!tahorsley            511 Kingbird Circle
      or  ...!uunet!hcx1!tahorsley            Delray Beach, FL  33444
======================== Aging: Just say no! ========================

mash@mips.COM (John Mashey) (02/27/90)

In article <TOM.90Feb21110800@hcx2.ssd.csd.harris.com> tom@ssd.csd.harris.com (Tom Horsley) writes:
>In article <7521@imag.imag.fr> despoix@imag.imag.fr (Frederic DESPOIX) writes:
>>  Benchmark numbers are relative to VAX 11/780 = 1.0, so higher numbers are
>>  better.  The programs gcc, espresso, spice, doduc, nasa7, li, eqntott,
>>  matrix300, fpppp and tomcatv are real programs, not synthetic or toy codes.
>>  The SPECmark is the geometric mean of the ten benchmark results (the 10th
>>  root of the product of the ten ratios).

>If you believe that all these benchmarks are real programs and not toys, you
>have been reading SPEC literature, not the benchmark source.  Matrix300
>spends 99.99% of its cycles in a single line of code; even Whetstone is a
>better benchmark than this. Espresso is not much better, spending 90% of its
>time in the compare routine it passes to qsort().


>>  If someone could supply more information about each benchmark, and each
>>  system (cache size, CPU clock rate, etc), that might prove enlightening.
>
>There are lots of different clock rates/cache sizes/compilers out there.
>This is the information you are supposed to pay hundreds of dollars to get
>from SPEC.
>
>One thing that is worth noting is that ALL of the floating point benchmarks
>are strictly double precision, and double precision is somewhat slow on the
>88100 implementation of the architecture because its internal data paths are
>only 32 bits wide.
>
>There are quite a lot of application areas that do not need double
>precision (graphics processing is often a good example: when you only need
>to map images onto a screen 1000 pixels wide, double precision is overkill).
>The SPEC benchmark suite needs to have at least one or two single precision
>benchmarks added to give it a better balance. (I am quite aware that there
>are also applications requiring double precision; I just want to see some
>single precision numbers in SPEC, so don't tell me I claimed no one needs
>double precision.)

All these criticisms are well-taken, although the programs are either
real ones or kernels derived from real ones where the bulk of the time is
spent in small loops (that's the way many scientific codes really are).

There is no doubt that some of the ones that are there will drop out sooner
or later as we find better ones; the point of this exercise was to
start getting some real data out into the open that people could pound away at,
get a good process in place, and start making progress.

With regard to single precision: we'd love to have some, and I suspect
there are some on the list of candidates; nobody could find one fast enough
for the first round that could make it thru all the rest of the criteria,
whereas we had plenty of 64-bit ones. We welcome more people joining and
doing work.....
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

rfg@ics.uci.edu (Ronald Guilmette) (02/27/90)

In article <36461@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>
>All these criticisms are well-taken, althought the programs are either
>real ones, or derived kernels from real ones where the bulk of the time is
>spent in small loops (that's the way many scientific codes really are).

If individual programs are measuring "scientific" performance, they
should probably be labeled as such.

Now that I've got that off my chest, I'd like to ask a question about the
GCC member of the SPEC suite.

It is my understanding that GCC spends a great deal of its time doing
code generation and/or optimization.  Obviously, the actual amount of
time spent doing these things can (potentially) vary a great deal depending
upon the target machine that you are having GCC generate code for.  Therefore,
it would seem that in order to obtain a really "fair" apples-to-apples
comparison of performance using GCC as a test case, you would have to run
GCC on host machine X and have it compile some code for target machine Z
and then run GCC hosted on machine Y and again have it produce code targeted
for the same target machine (Z) as in previous tests.

I'd just like to know if this is in fact what happens when the GCC "benchmark"
is evaluated as part of SPEC benchmarking.  Is that how it is done, or
do the benchmarkers run GCC hosted on X and targeted for X and then run
GCC hosted on Y and targeted for Y?  If so, that seems to be a possible source
of misleading information.

Of course, some people might say that the cost of compiling for a particular
machine *should* be factored in as a part of the overall "performance"
of the machine itself.  Perhaps that's true, but again it may be a question
of properly "labeling" the results so that people who read the SPEC numbers
fully understand that compilation speed FOR THAT MACHINE was factored in.
(People who do no software development will not care about this factor
at all, but conversely, professional software developers might care about
that particular performance factor above all else.)

// Ron Guilmette (rfg@ics.uci.edu)
// C++ Entomologist
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

mash@mips.COM (John Mashey) (02/27/90)

In article <25EA4069.17611@paris.ics.uci.edu> rfg@ics.uci.edu (Ronald Guilmette) writes:
>In article <36461@mips.mips.COM> mash@mips.COM (John Mashey) writes:

>If individual programs are measuring "scientific" performance, they
>should probably be labeled as such.
In the SPEC newsletter, each of the benchmarks is described in at least moderate
detail, and the scientific ones are certainly identifiable.

>Now that I've got that off my chest, I'd like to ask a question about the
>GCC member of the SPEC suite.
>
>It is my understanding that GCC spends a great deal of its time doing
>code generation and/or optimization.  Obviously, the actual amount of
>time spent doing these things can (potentially) vary a great deal depending
>upon the target machine that you are having GCC generate code for.  Therefore,
>it would seem that in order to obtain a really "fair" apples-to-apples
>comparison of performance using GCC as a test case, you would have to run
>GCC on host machine X and have it compile some code for target machine Z
>and then run GCC hosted on machine Y and again have it produce code targeted
>for the same target machine (Z) as in previous tests.
That's what it does: the target is always the same.

>Of course, some people might say that the cost of compiling for a particular
>machine *should* be factored in as a part of the overall "performance"
>of the machine itself.  Perhaps that's true, but again it may be a question
>of properly "labeling" the results so that people who read the SPEC numbers
>fully understand that compilation speed FOR THAT MACHINE was factored in.
>(People who do no software development will not care about this factor
>at all, but conversely, professional software developers might care about
>that particular performance factor above all else.)
There may sometime be such a benchmark, but the gcc one isn't it.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086