[comp.arch] missing Dhrystone 2.1

ari@cunixc.columbia.edu (Ari "Juan" Shamash) (07/10/88)

[dinner time..]

Can somebody please send me the third part of Dhrystone 2.1 (part 3 of 3)..
I think that is the part that contained the pascal source for the
program.  For some reason or another, it never made it here..

Thank you.
Ari Shamash
Columbia User Services.



-- 
BITNET:  ashus@cuvma.bitnet          SNAIL:  Ari Shamash
ARPANET: ari@cunixc.cc.columbia.edu          Columbia U. CUCCA User Services
USENET:  ...!rutgers!columbia!cunixc!ari     801 Watson, 612 W115th St.
PHONE:   (212) 280-8555 (e-mail preferred)   New York, NY 10025

howardl@wb3ffv.UUCP (Howard Leadmon ) (07/12/88)

 
 Part 3 of 3 of the Dhrystone 2.1 benchmark never made it here either. I would
appreciate it if somebody would repost it for us, or at least mail me a copy!!


-------------------------------------------------------------------------------
UUCP/SMTP : howardl@wb3ffv		|	Howard D. Leadmon
PACKET    : wb3ffv@w3itm-9		|	Fast Computer Service, Inc.
IP Address: 44.60.0.1			|	P.O. Box  171 
Telephone : (301)-335-2206		|	Chase, MD  21027-0171

schein@cbmvax.UUCP (Dan Schein GUEST) (07/12/88)

In article <788@cunixc.columbia.edu> ari@cunixc.cc.columbia.edu (Ari "Juan" Shamash) writes:
>[dinner time..]
>
>Can somebody please send me the third part of Dhrystone 2.1 (part 3 of 3)..
>I think that is the part that contained the pascal source for the
>program.  For some reason or another, it never made it here..
>
>Ari Shamash

  We never received part 1 or 3 (only 2). Could someone either repost them or
 forward them via mail please.

  Thanx!

-- 
 Dan "Sneakers" Schein	      Guest of Commodore International Ltd. and cbmvax
 2455 McKinley Ave
 West Lawn PA 19609	 uucp: {ihnp4|allegra|burdvax|rutgers}!cbmvax!sneakers
+-----------------------------------------------------------------------------+
    Call BERKS AMIGA BBS - 24 Hrs - 3/12/2400 Baud - 40Meg - 215/678-7691
+-----------------------------------------------------------------------------+
    Quote: Those who worked the hardest        Gary Ward - Oklahoma State
 	    are the last to surrender                      baseball coach

igp@camcon.uucp (Ian Phillipps) (07/13/88)

From article <788@cunixc.columbia.edu>, by ari@cunixc.columbia.edu (Ari "Juan" Shamash):
> Can somebody please send me the third part of Dhrystone 2.1 (part 3 of 3)..
> I think that is the part that contained the pascal source for the
> program.  For some reason or another, it never made it here..

Nor here - is a repost called for?

-- 
UUCP:  ...!ukc!camcon!igp | Cambridge Consultants Ltd  |  Ian Phillipps
or:    igp@camcon.uucp    | Science Park, Milton Road  |-----------------
Phone: +44 223 358855     | Cambridge CB4 4DW, England |

gillies@p.cs.uiuc.edu (07/14/88)

I certainly find it hard to believe that the top of the line Amdahl
machine achieves 90,000+ Dhrystones with an experimental gnu C
compiler.  It nearly doubles the performance of the best Cray compiler
reported (admittedly, compiling C for a Cray is probably hard, but
Crays are very decent scalar machines!  sheesh).

Maybe BYTE's new benchmarks are more fair.  What we need is a
MEGASTONE or perhaps an ANSIIstone benchmark that tests nearly all the
standard i/o libraries and all the language functions of a compiler.
That would make it difficult to cheat, since the benchmark would be so
large, that it would be fruitless to concentrate on optimizations for
certain coding sequences, or fruitless to optimize just a handful of
library subroutines.

Dhrystone has never been a test of processor MIPS -- this month's ACM
SIGPLAN states that it is intended to measure:

	The performance of "real" Operating System code.

I think it's about time we started benchmarking the architecture +
compiler together.  Compiler researchers have got their sh*t together,
so we may as well throw them into the performance rat-race!


Don Gillies, Dept. of Computer Science, University of Illinois
1304 W. Springfield, Urbana, Ill 61801      
ARPA: gillies@cs.uiuc.edu   UUCP: {uunet,ihnp4,harvard}!uiucdcs!gillies

henry@utzoo.uucp (Henry Spencer) (07/19/88)

In article <76700035@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:
>I certainly find it hard to believe that the top of the line Amdahl
>machine achieves 90,000+ Dhrystones...
> It nearly doubles the performance of the best Cray compiler
>reported (admittedly, compiling C for a Cray is probably hard, but
>Crays are very decent scalar machines!  sheesh)...

Ah, but Crays are *not* good character machines, and Dhrystone is known
to be excessively string-intensive.
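
One commonly cited reason character handling is awkward on a Cray is that
memory is addressed in 64-bit words rather than bytes, so fetching a single
character means loading a whole word and then shifting and masking. The
sketch below only models that idea; the packing order (first character in
the high-order byte) and the function names are assumptions made for
illustration, not a description of any particular Cray compiler.

```python
# Sketch of character access on a word-addressed machine with 64-bit words.
# Assumption for illustration: 8 characters per word, first character in
# the high-order byte. Function names here are hypothetical.
CHARS_PER_WORD = 8

def pack_words(text):
    """Pack a bytes string into a list of 64-bit integers, zero-padded."""
    words = []
    for start in range(0, len(text), CHARS_PER_WORD):
        chunk = text[start:start + CHARS_PER_WORD].ljust(CHARS_PER_WORD, b"\0")
        words.append(int.from_bytes(chunk, "big"))
    return words

def char_at(words, index):
    """Fetch one character: pick the word, then shift and mask."""
    word = words[index // CHARS_PER_WORD]
    shift = 8 * (CHARS_PER_WORD - 1 - index % CHARS_PER_WORD)
    return (word >> shift) & 0xFF

text = b"DHRYSTONE PROGRAM"
words = pack_words(text)
unpacked = bytes(char_at(words, i) for i in range(len(text)))
print(unpacked == text)
```

Every character fetch costs an index division, a shift and a mask on top of
the word load, which is the kind of overhead a byte-addressed machine avoids.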
-- 
Anyone who buys Wisconsin cheese is|  Henry Spencer at U of Toronto Zoology
a traitor to mankind.  --Pournelle |uunet!mnetor!utzoo! henry @zoo.toronto.edu

tim@amdcad.AMD.COM (Tim Olson) (07/20/88)

In article <1988Jul18.231331.19575@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
| In article <76700035@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:
| >I certainly find it hard to believe that the top of the line Amdahl
| >machine achieves 90,000+ Dhrystones...
| > It nearly doubles the performance of the best Cray compiler
| >reported (admittedly, compiling C for a Cray is probably hard, but
| >Crays are very decent scalar machines!  sheesh)...
| 
| Ah, but Crays are *not* good character machines, and Dhrystone is known
| to be excessively string-intensive.

And the Amdahl machine referenced is a dual-processor model -- I assume
that this was 45K Dhrystones per processor...

-- 
	-- Tim Olson
	Advanced Micro Devices
	(tim@delirun.amd.com)

mash@mips.COM (John Mashey) (07/20/88)

In article <22406@amdcad.AMD.COM> tim@delirun.amd.com (Tim Olson) writes:
>In article <1988Jul18.231331.19575@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>| In article <76700035@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:
>| >I certainly find it hard to believe that the top of the line Amdahl
>| >machine achieves 90,000+ Dhrystones......

>And the Amdahl machine referenced is a dual-processor model -- I assume
>that this was 45K Dhrystones per processor...

I believe the original statement was correct, i.e., I don't think the
Amdahl guys were aggregating 2 CPUs' Dhrystones.

Note that a 20-vups CPU does over 40K Dhrystones,
and one would expect each (50? 45?)-vup 5990 CPU to do 90K-100K Dhrystones.
This is reminiscent of the "Brash Micros vs Big Iron" discussion of
several months back, i.e., Big Iron is still ahead....
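
The estimate above is a linear scaling from one data point. A minimal
sketch of that arithmetic follows, assuming Dhrystones per second grow in
proportion to the vup rating (a rough rule of thumb at best) and treating
"over 40K" as exactly 40,000. The function name is illustrative.

```python
# Rough linear scaling, assuming Dhrystones/second grow in proportion to vups.
# The reference point (20 vups -> a bit over 40K Dhrystones) is from the post;
# rounding it to exactly 40,000 is a simplification.
REF_VUPS = 20
REF_DHRYSTONES = 40_000

def estimate_dhrystones(vups):
    """Scale the reference point linearly to a machine rated at `vups`."""
    return vups * REF_DHRYSTONES / REF_VUPS

for vups in (45, 50):
    print(f"{vups} vups -> about {estimate_dhrystones(vups):,.0f} Dhrystones/s")
```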
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

littauer@amdahl.uts.amdahl.com (Tom Littauer) (07/20/88)

In article <22406@amdcad.AMD.COM> tim@delirun.amd.com (Tim Olson) writes:
>In article <1988Jul18.231331.19575@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>| In article <76700035@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:
>| >I certainly find it hard to believe that the top of the line Amdahl
>| >machine achieves 90,000+ Dhrystones...
>| > It nearly doubles the performance of the best Cray compiler
>| >reported (admittedly, compiling C for a Cray is probably hard, but
>| >Crays are very decent scalar machines!  sheesh)...
>| 
>| Ah, but Crays are *not* good character machines, and Dhrystone is known
>| to be excessively string-intensive.
>
>And the Amdahl machine referenced is a dual-processor model -- I assume
>that this was 45K Dhrystones per processor...

No, 91K per each of 2 in a 5990-700 and 4 in a -1400... To be fair, that's
not the released compiler (it was GNU cc). Does GNU do Dhrystone
fakery? Anyway, the released compiler was 74K per each and it isn't
optimized for Dhrystone. Our processor guys bitch that Dhrystone doesn't
show our processors to be as fast as they really are, but that's an
entirely different discussion.
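
For scale, the two per-processor figures quoted here can be compared
directly. This is only arithmetic on the numbers in the post; the variable
names are illustrative.

```python
# Figures quoted in the thread, in Dhrystones per second per processor.
GNU_CC_PER_CPU = 91_000       # experimental GNU cc port
RELEASED_CC_PER_CPU = 74_000  # released compiler

speedup = GNU_CC_PER_CPU / RELEASED_CC_PER_CPU
print(f"GNU cc vs released compiler: {speedup:.2f}x")

# If 91K had instead been a total over two CPUs, the per-processor
# figure would be only half of it.
misread_per_cpu = GNU_CC_PER_CPU / 2
print(f"Per-CPU figure if 91K were a 2-CPU aggregate: {misread_per_cpu:,.0f}")
```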
-- 
UUCP:  littauer@amdahl.amdahl.com
  or:  {sun,decwrl,hplabs,pyramid,ames,uunet}!amdahl!littauer
DDD:   (408) 737-5056
USPS:  Amdahl Corp.  M/S 337,  1250 E. Arques Av,  Sunnyvale, CA 94086

I'll tell you when I'm giving you the party line. The rest of the time
it's my very own ravings (accept no substitutes).

tim@amdcad.AMD.COM (Tim Olson) (07/20/88)

In article <9amsvb52K11010cyawo@amdahl.uts.amdahl.com> littauer@amdahl.uts.amdahl.com (Tom Littauer) writes:
| No, 91K per each of 2 in a 5990-700 and 4 in a -1400... To be fair, that's
| not the released compiler (it was GNU cc). Does GNU do Dhrystone
| fakery? Anyway, the released compiler was 74K per each and it isn't
| optimized for Dhrystone. Our processor guys bitch that Dhrystone doesn't
| show our processors to be as fast as they really are, but that's an
| entirely different discussion.

Whoops!  My mistake.  I had edited Mr Richardson's posting and removed
(what I *thought* were superfluous!) columns to get it on an 80col
display -- one that I had removed explicitly stated that the values were
per each CPU in the system.  Thanks for clarifying this.  Gee, now we
have to aim even higher! ;-)

-- 
	-- Tim Olson
	Advanced Micro Devices
	(tim@delirun.amd.com)

chuck@amdahl.uts.amdahl.com (Charles Simmons) (07/20/88)

In article <22406@amdcad.AMD.COM> tim@delirun.amd.com (Tim Olson) writes:
>In article <1988Jul18.231331.19575@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>| In article <76700035@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:
>| >I certainly find it hard to believe that the top of the line Amdahl
>| >machine achieves 90,000+ Dhrystones...
>| > It nearly doubles the performance of the best Cray compiler
>| >reported (admittedly, compiling C for a Cray is probably hard, but
>| >Crays are very decent scalar machines!  sheesh)...
>| 
>| Ah, but Crays are *not* good character machines, and Dhrystone is known
>| to be excessively string-intensive.
>
>And the Amdahl machine referenced is a dual-processor model -- I assume
>that this was 45K Dhrystones per processor...
>
>-- 
>	-- Tim Olson
>	Advanced Micro Devices
>	(tim@delirun.amd.com)

Sorry Tim.  Yes, it is a dual-processor model.  But the measurement
was 90K per processor.

-- Chuck

chuck@amdahl.uts.amdahl.com (Charles Simmons) (07/20/88)

In article <9amsvb52K11010cyawo@amdahl.uts.amdahl.com> littauer@amdahl.uts.amdahl.com (Tom Littauer) writes:
>In article <22406@amdcad.AMD.COM> tim@delirun.amd.com (Tim Olson) writes:
>>In article <1988Jul18.231331.19575@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>>| In article <76700035@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:
>>| >I certainly find it hard to believe that the top of the line Amdahl
>>| >machine achieves 90,000+ Dhrystones...
>>| > It nearly doubles the performance of the best Cray compiler
>>| >reported (admittedly, compiling C for a Cray is probably hard, but
>>| >Crays are very decent scalar machines!  sheesh)...
>>| 
>>| Ah, but Crays are *not* good character machines, and Dhrystone is known
>>| to be excessively string-intensive.
>>
>>And the Amdahl machine referenced is a dual-processor model -- I assume
>>that this was 45K Dhrystones per processor...
>
>No, 91K per each of 2 in a 5990-700 and 4 in a -1400... To be fair, that's
>not the released compiler (it was GNU cc). Does GNU do Dhrystone
>fakery? Anyway, the released compiler was 74K per each and it isn't
>optimized for Dhrystone. Our processor guys bitch that Dhrystone doesn't
>show our processors to be as fast as they really are, but that's an
>entirely different discussion.

Since there has been some discussion recently on how an Amdahl
machine with the GNU cc compiler can achieve 91K dhrystones per
processor, I thought that I'd go into a little bit of a discussion
of just what the GNU cc compiler does that achieves this performance.
I'll then let you all decide if GNU does "Dhrystone fakery".

First, I like Henry Spencer's comment very much about Crays not
being good character processing machines.  Both the Cray and the
Amdahl machines use very up-to-date technology, and there is no
good reason for believing that a multi-million dollar Amdahl
machine can't execute the Dhrystone benchmark faster than a
multi-million dollar Cray.

Cray machines are optimized for programs that require lots of
memory (say 1 Gbyte or so), floating point computations, and vector
computations.  Amdahl machines are optimized for smaller amounts
of memory (say 256 Mbytes), and scalar processing that is
non-floating-point intensive.  (These are my personal beliefs until
I'm corrected by someone who knows more.)

The GNU compiler achieves extremely good Dhrystone results as
compared to the current pcc based compiler primarily through
two mechanisms.  First, the GNU compiler performs reasonably good
register allocation.  Since the Dhrystones spend much of their
time in relatively short routines that use relatively few registers,
the GNU compiler can frequently keep all of the values that are
needed for a routine in registers that do not need to be saved on
the stack.  In addition, I have a special case optimization that I
perform so that when a subroutine does not call any other subroutine,
much of the procedure set up code is optimized away (e.g. I don't
allocate a stack frame if I don't need one).

Thus, for Dhrystones, GCC performs good register allocation, and it
generates code that keeps the overhead of calling a subroutine to
a minimum.
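
The condition behind the special case ("the subroutine does not call any
other subroutine") can be sketched as a check over a call graph. This is a
toy model in Python, not the GCC modification itself: the routine names are
made up, and the extra "spills registers" condition is an assumption about
when a frame would still be needed.

```python
# Hypothetical call graph: routine name -> routines it calls.
# These names are invented for illustration; they are not Dhrystone's routines.
CALL_GRAPH = {
    "main": ["copy_record", "compare_chars"],
    "copy_record": ["add_offsets"],
    "compare_chars": [],
    "add_offsets": [],
}

def leaf_routines(call_graph):
    """Return the routines that call no other routine, sorted by name."""
    return sorted(name for name, callees in call_graph.items() if not callees)

def needs_stack_frame(name, call_graph, spills_registers=False):
    """Toy rule: a frame is needed if the routine makes calls or spills."""
    return bool(call_graph[name]) or spills_registers

print(leaf_routines(CALL_GRAPH))
for name in CALL_GRAPH:
    print(name, "needs frame:", needs_stack_frame(name, CALL_GRAPH))
```

In this model a short leaf routine whose values all fit in registers pays
for neither a frame nor register saves, which is the case described above.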

We do not use in-line subroutines (I tried, but GCC generated incorrect
code), nor do we do anything remotely resembling link-time register
allocation.  (Rest assured that if I do figure out ways to do things
like this, I will report this type of optimization with any results
that I publish.)

I would only make the following conclusions from these GCC results:

1)  An Amdahl mainframe is lots faster than a Vax 750.

2)  For some applications, an Amdahl mainframe may outperform a Cray.

3)  The GNU C compiler does a fairly good job of register allocation
(especially in small routines that use single word registers).

4)  The GNU C compiler is easily modified to make special case
optimizations (such as allocating stack frames only when they
are needed).

In summary, I would not be offended if anyone decided that the 91K
figure that I've published should be considered a research result, and
not something that can be realistically attained using the production
compiler that we supply.  The 91K figure should be viewed as a
theoretical upper bound (a goal to shoot for), and an indication of the
performance levels that can be achieved in hand-coded assembler.

Still, 74K Dhrystones per processor head isn't too shabby.

(To follow up on the last comment of Tom's...  To really benchmark
an Amdahl mainframe against other types of machines, we would prefer
a benchmark that used over 100Mbytes of memory and which did lots
of I/O.  It would be fun to publish numbers that show a 68020 based
machine thrashing on the benchmark for many hours (days?) while the
mainframe completes the job in a couple minutes.)

-- Chuck

mash@mips.COM (John Mashey) (07/21/88)

In article <9a0K/cbluk1010IHSPc@amdahl.uts.amdahl.com> chuck@amdahl.uts.amdahl.com (Charles Simmons) writes:
...
>Cray machines are optimized for programs that require lots of
>memory (say 1 Gbyte or so), floating point computations, and vector
>computations.  Amdahl machines are optimized for smaller amounts
>of memory (say 256 Mbytes), and scalar processing that is
>non-floating-point intensive...

You're being modest.  I thought Amdahls were designed for general
purpose performance across a wide range of programs, certainly including
integer and scalar floating point, not that vector FP is
particularly shabby.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

drew@sdeggo.UUCP (Drew Dean) (07/21/88)

In article <9a0K/cbluk1010IHSPc@amdahl.uts.amdahl.com>, chuck@amdahl.uts.amdahl.com (Charles Simmons) writes:
> 
> (To follow up on the last comment of Tom's...  To really benchmark
> an Amdahl mainframe against other types of machines, we would prefer
> a benchmark that used over 100Mbytes of memory and which did lots
> of I/O.  It would be fun to publish numbers that show a 68020 based
> machine thrashing on the benchmark for many hours (days?) while the
> mainframe completes the job in a couple minutes.)
> 
> -- Chuck

Why would a 68020 thrash on a 100 Mbyte benchmark, assuming the system has
enough RAM?  I thought I'd seen ads for 680x0 machines with 128 Mbyte limits
on system RAM; am I just dreaming wishfully?  Given that Cray disks are
18 ms (although they do transfer 10 Mbytes/sec), one can buy 18 ms, 320 Mbyte
Priam (and also Maxtor, etc.) ESDI disks, so access time isn't a major
factor.  (Aside:  Does anyone know how big Core's 10 ms disks come?  I know
they make a 40 Mbyte drive, and drives as large as 250 Mbytes, but do they
make a 250 Mbyte, 10 ms drive?)
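
To put the quoted disk figures in perspective, here is a back-of-envelope
model using the 18 ms access time and 10 Mbytes/sec transfer rate mentioned
above. The 4 Kbyte page size and the one-seek-per-page access pattern are
assumptions for illustration, as are the function names.

```python
# Figures from the post; everything else below is an assumed, simplified model.
ACCESS_TIME_S = 0.018        # 18 ms per access
TRANSFER_MB_PER_S = 10.0     # 10 Mbytes/sec sustained transfer

def sequential_seconds(total_mbytes):
    """One access, then stream the whole data set."""
    return ACCESS_TIME_S + total_mbytes / TRANSFER_MB_PER_S

def random_seconds(total_mbytes, page_kbytes):
    """One access per page, as a crude stand-in for paging traffic."""
    pages = total_mbytes * 1024 / page_kbytes
    per_page = ACCESS_TIME_S + (page_kbytes / 1024) / TRANSFER_MB_PER_S
    return pages * per_page

print(f"sequential 100 Mbytes: {sequential_seconds(100):.1f} s")
print(f"random 4K pages, 100 Mbytes: {random_seconds(100, 4):.0f} s")
```

With one seek per page the access time dominates the total, so in this
model two disks with the same 18 ms access time behave much the same under
paging regardless of their transfer rates.
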
While we're discussing 680x0 benchmarks, does anyone have any numbers for
the new Apollo machines, the 3500/4500 (25 and 33 MHz 68030s with
equal-speed 68882 FPUs)?

Drew Dean
drew@sdeggo.UUCP
FROM Disclaimers IMPORT StandardDisclaimer;

seibel@cgl.ucsf.edu (George Seibel) (07/21/88)

In article <9a0K/cbluk1010IHSPc@amdahl.uts.amdahl.com> chuck@amdahl.uts.amdahl.com (Charles Simmons) writes:

>Cray machines are optimized for programs that require lots of
>memory (say 1 Gbyte or so), floating point computations, and vector
             [???]                 [yes]                     [yes]
>computations.  Amdahl machines are optimized for smaller amounts
>of memory (say 256 Mbytes), and scalar processing that is
>non-floating-point intensive.  (These are my personal beliefs until
>I'm corrected by someone who knows more.)

What kind of Cray are we talking about here?  There are still some antique
Cray 1's out there with a grand total of ONE Mword!  (8 Mbyte)  The primary
use for these machines today is benchmarking by minisuper vendors ( :-))
Most Crays are XMPs with 2-8 Mword, a few 16 Mword.  (Most of the 8-16 Meg
X's have 4 processors, so there's not much memory per processor)  There are
a handful of Cray 2s with GOBS of memory - they didn't go over very well
compared to the X's.  The YMP is the latest unit - 8 processors, 32 Mwords.
The Cray 3 should be out soon, and will be a huge memory model.  But for the
time being, the typical Cray in the field is pretty strapped for memory.
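
The word counts above convert to bytes at 8 bytes per 64-bit word, the same
ratio as the "ONE Mword (8 Mbyte)" figure. A small sketch of the conversion
follows; the table labels are shorthand for the configurations mentioned in
this post, not official model names.

```python
BYTES_PER_WORD = 8  # 64-bit Cray word, matching "ONE Mword (8 Mbyte)"

def mwords_to_mbytes(mwords):
    """Convert a memory size in Mwords to Mbytes."""
    return mwords * BYTES_PER_WORD

# (label, Mwords, processors) taken from the figures in the post.
configs = [
    ("Cray 1", 1, 1),
    ("X-MP, 16 Mword, 4 CPUs", 16, 4),
    ("Y-MP, 32 Mword, 8 CPUs", 32, 8),
]

for label, mwords, cpus in configs:
    total = mwords_to_mbytes(mwords)
    print(f"{label}: {total} Mbytes total, {total / cpus:.0f} Mbytes per CPU")
```

By this arithmetic the 32 Mword Y-MP comes to 256 Mbytes, the same figure
quoted above as the "smaller" memory size for Amdahl machines.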

>1)  An Amdahl mainframe is lots faster than a Vax 750.

I love this line.

George Seibel, UCSF
seibel@cgl.ucsf.edu