notes@iuvax.cs.indiana.edu (08/08/89)
Here is an item that should be of interest to this group:
From: George Cybenko <gc@uicsrd.csrd.uiuc.edu>
Date: Fri, 21 Jul 89 21:07:15 CDT
Subject: 1989 Bell Award for Perfect Benchmarks
Gordon Bell is sponsoring a new award for high performance scien-
tific computing that consists of five categories. Contestants can
compete in any number of the categories described below.
1989 Bell Award for Perfect Benchmark Rules
Four of the five new award categories are based on the Per-
fect Club benchmarks, 13 Fortran codes from a range of scientific
and engineering applications, including fluid dynamics, signal
processing, physical/chemical computation, and engineering
design. The codes have been collected and ported to a number of
computer systems by a group of applications experts from industry
and academia.
The $2,500 prize fund will be distributed appropriately, at
the discretion of the judges, among the winning entries in the
following five categories:
(1) Sixteen or Fewer Processors: The measure is the fastest
wall clock time (including I/O) for the entire Perfect suite
on any computer system that contains no more than 16 proces-
sors. The programs must be executed as a sequential job
stream, i.e., only one of the benchmarks may be executing at
any moment. "Computer system" includes distributed systems.
There are no constraints on modifications that may be made
to the codes to obtain the results, as long as the solutions
are sufficiently close to those produced by the unmodified
benchmark codes.
(2) More than 16 Processors: The measure is the same as in 1.,
except that the computer system has more than 16 processors
and all processors must participate in the execution of each
benchmark.
(3) Perfect Suite Cost-effectiveness: The measure is the most
cost-effective run of the Perfect suite, where
cost-effectiveness is defined as 1 divided by the product of
running time and cost; running time is defined as in 1., and
cost is the list price of the computer system and software
at the time of the run. This disqualifies noncommercial
machines from competing, unless they are a combination of
commercially available computer systems. (A brief sketch
illustrating this measure, with made-up numbers, follows the
list of categories.)
(4) Algorithms Cost-effectiveness: The measure is the maximum
cost-effectiveness for the total running time of four algo-
rithms (not whole benchmarks) chosen each year. The same
rules apply here as in 3., except that the current list
price is based on the minimum configuration required to run
the algorithm. (Write to the address below for more details for
1989.)
(5) Perfect Subset: The measure is the minimal running time,
defined as in 1., but with no restriction on the number of
processors, for two codes to be selected annually from the
Perfect suite. (Write to the address below for more details for
1989.)
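
To make the cost-effectiveness measure in categories (3) and (4)
concrete, here is a minimal sketch in C, assuming only the
definition given above (1 divided by the product of running time
and list price); the system names, times, and prices are invented
for illustration and are not actual entries or rule text.

    /* Hedged sketch, not part of the official rules: evaluates the
     * category (3)/(4) figure of merit, 1 / (running time * list
     * price), for two hypothetical entries.  All names, times, and
     * prices below are made up. */
    #include <stdio.h>

    struct entry {
        char  *name;     /* hypothetical system name               */
        double seconds;  /* wall-clock time for the full suite (s) */
        double dollars;  /* list price of system plus software ($) */
    };

    int main(void)
    {
        struct entry e[] = {
            { "system A", 10000.0, 2.0e6 },   /* slower but cheaper  */
            { "system B",  4000.0, 8.0e6 },   /* faster but costlier */
        };
        int i;

        for (i = 0; i < 2; i++) {
            double merit = 1.0 / (e[i].seconds * e[i].dollars);
            printf("%-8s  %8.0f s  $%10.0f  merit %.3e\n",
                   e[i].name, e[i].seconds, e[i].dollars, merit);
        }
        return 0;
    }

With these invented numbers, system A scores 5.000e-11 and system B
3.125e-11, so the slower but much cheaper machine wins on this
measure even though it takes 2.5 times as long to run the suite.
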
Processor Definition
The processor divisions in 1. and 2., although somewhat
arbitrary, are intended to reflect the broad classes of extant
parallel systems: current systems range from small numbers of
powerful processors to large numbers of extremely simple proces-
sors. The division at 16 is expected to move to a larger number over time.
The number of processors is defined as the number of simultaneous
program execution streams, i.e., in effect the number of program
counters in simultaneous operation. For example, the Cray Y-MP in
operation today has 8 processors, and the number is projected to
grow to 16 and 64 for the Cray 3 and 4. Similarly, the Thinking
Machines Corp. CM2 has up to 4 processors, each with 16K processing
elements, or can be run as a uniprocessor with 64K processing
elements.
The Perfect Club Benchmark Applications
The Perfect Club was formed with the purpose of developing
and applying a scientific methodology for the performance evalua-
tion of supercomputers. Club members were drawn from industry
and academic sectors, and an initial suite of 13 Fortran codes
was designated as the "Perfect" Benchmark programs. These codes
were selected because they solve fundamental problems across a
variety of applications requiring supercomputing performance:
fluid dynamics, signal processing, physical/chemical computations,
and engineering design. See [BCKK89] for more information about
the Perfect Club codes.
[BCKK89] Berry, M., Chen, D., Koss, P., Kuck, D., Lo, S., Pang,
Y., Pointer, L., Roloff, R., Sameh, A., Clementi, E.,
Chin, S., Schneider, D., Fox, G., Messina, P., Walker,
D., Hsiung, C., Schwarzmeier, J., Lue, K., Orszag, S.,
Seidl, F., Johnson, O., Swanson, G., Goodrum, R.,
Martin, J., The Perfect Club Benchmarks: Effective
Performance Evaluation of Supercomputers, CSRD Report
No. 827, 1988. (To appear in the International Journal
of Supercomputing Applications, 1989.)
The deadline for contest submissions is December 31, 1989.
For more information about the Perfect Benchmark Suite and
the Bell Award for the Perfect Benchmarks, write to:
Bell Awards for Perfect Benchmarks
Center for Supercomputing Research and Development
University of Illinois
Urbana, IL 61801
USA
Contact person: lpointer@uicsrd.csrd.uiuc.edu
NOTE: IEEE Software administers a separate prize sponsored by
Gordon Bell. Contact the IEEE Software office for information
about that prize.

dinucci@cse.ogc.edu (David C. DiNucci) (08/09/89)
In article <6202@hubcap.clemson.edu> notes@iuvax.cs.indiana.edu writes:
>Here is an item that should be of interest to this group:
>
>          1989 Bell Award for Perfect Benchmark Rules
>
>(2) More than 16 Processors: The measure is the same as in 1.,
>    except that the computer system has more than 16 processors
>    and all processors must participate in the execution of each
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>    benchmark.
     ^^^^^^^^^

This rule is nonsensical, and seems to be due to a belief in
parallelism for its own sake, instead of as a means to solve a
problem.  A processor is a resource of a computing system, just as
memory is.  It is safe to say that most readers would find the above
rule atrocious if it referred to memory (e.g. "all 4 MB of memory
must be used in the execution of each benchmark").

In fact, if there is insufficient parallelism in some of the
benchmarks, a smart programmer would let some of the processors do
senseless work, simply to meet the letter of the law.  Is this
somehow guarded against?

Perhaps someone can explain a rationale behind the above rule?

-Dave
--
David C. DiNucci           UUCP:  ..ucbvax!tektronix!ogccse!dinucci
Oregon Graduate Center     CSNET: dinucci@cse.ogc.edu
Beaverton, Oregon
eugene@eos.arc.nasa.gov (Eugene Miya) (08/09/89)
>Perhaps someone can explain a rationale behind the above rule?
He who pays the piper, calls the tune.
Rather than second guess:
My suggestion is to send mail to Gordon Bell.  Post his response.

dinucci@ogccse.ogc.edu (David C. DiNucci) (08/18/89)
In article <6215@hubcap.clemson.edu> dinucci@cse.ogc.edu (David C. DiNucci) writes:
>In article <6202@hubcap.clemson.edu> notes@iuvax.cs.indiana.edu writes:
>>Here is an item that should be of interest to this group:
>>
>>          1989 Bell Award for Perfect Benchmark Rules
>>
>>(2) More than 16 Processors: The measure is the same as in 1.,
>>    except that the computer system has more than 16 processors
>>    and all processors must participate in the execution of each
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>    benchmark.
>     ^^^^^^^^^
>
>This rule is nonsensical, and seems to be due to a belief in
>parallelism for its own sake, instead of as a means to solve a
>problem.  A processor is a resource of a computing system, just as
>memory is.  It is safe to say that most readers would find the above
>rule atrocious if it referred to memory (e.g. "all 4 MB of memory
>must be used in the execution of each benchmark").
>
>In fact, if there is insufficient parallelism in some of the
>benchmarks, a smart programmer would let some of the processors do
>senseless work, simply to meet the letter of the law.  Is this
>somehow guarded against?
>
>Perhaps someone can explain a rationale behind the above rule?

Well, along with some agreement from others, I did receive one
personal response from someone who helped with the development of
the rules.  I post it here with his permission, followed by some
comments of my own which have already been seen by Dr. Cybenko.

=======================================================================
Date: Tue, 15 Aug 89 12:41:38 CDT
From: gc@s16.csrd.uiuc.edu (George Cybenko)
Subject: Re: Bell Award

Dave:

If you want to post an "official" response to your question, that
will be difficult because quite a few people were involved in
drafting the "rules".  I can only offer my reasons for thinking that
the part in question was appropriate.  You can post the following as
a personal response.

*********************************************

A number of people have raised questions about Category (2)'s
requirement that "all processors must participate" in the execution
of each benchmark.  As someone involved in putting those rules
together, my thinking was to prevent calling a supercomputer
networked with 16 idle PCs a 17-processor distributed system.  One
should interpret the rule as "at least 16 processors must
meaningfully participate" in the execution of each benchmark.

It raises the larger question of why have two categories in the
first place.  Certainly, if the time with more than 16 processors
beats the time with fewer than 16 processors, it wouldn't be
interesting to have two categories.  However, since most people feel
that it will be a few years before that happens, having separate
categories on the basis of processor count allows more people to
compete and helps gauge the progress being made in the application
of parallel computing to scientific problems.

In the end, no set of rules can replace common sense, consensus and
fair play.

George Cybenko
gc@uicsrd.csrd.uiuc.edu
=======================================================================

My [Dave D's] followup comments:

While this seems to suggest that some crumbs are being tossed to the
parallel processors, in fact it is certainly making them appear
worse than they would if they could be used in a more logical manner
- i.e. using only the processors that are needed to accomplish the
task(s) at hand - and as benchmarks, I would assume that they are
indeed intended to reflect the real world accurately.

(In other words, I suppose my own personal view is that a Cray on a
network with IBM PCs should, in fact, win if it is faster than any
parallel processor.)

I'm afraid that the overall results will promote unwarranted
comparisons between (1) and (2), leading to cries of the poor state
of parallel processing.  But, then again, if the field is gaining
strength, perhaps the results in the "cost effectiveness" categories
will counteract this sufficiently.

Dave

Disclaimers: I have not seen the Perfect Suite, and therefore do not
know how much parallelism is present therein.  Also, my interest is
purely academic, since I do not have the time to participate in the
contest.  Also, as usual, I speak for myself, not OGC.
--
David C. DiNucci           UUCP:  ..ucbvax!tektronix!ogccse!dinucci
Oregon Graduate Center     CSNET: dinucci@cse.ogc.edu
Beaverton, Oregon