[comp.arch] Frequency of elementary functions

bron@bronze.wpd.sgi.com (Bron Campbell Nelson) (05/18/91)

In article <9105171235.AA20139@ucbvax.Berkeley.EDU>, jbs@WATSON.IBM.COM writes:
> 
>          I agree that easier communication between the units would help
> in computing elementary functions.  However the gains would not be
> spectacular (I guess 10% or so) and elementary functions themselves don't
> seem all that important (anybody have numbers on this?).

In my experience, even for floating point intensive scientific codes,
the only functions that really use any time are square root and
exponentiation.  Square root will often account for between 5% and 15% of
the cpu time (shading to the high side), and exponentiation between 1% and
5% (shading to the low side).  (Exponentiation would be higher except that
the vast majority of exponents are "2", and many optimizers convert x**2
into x*x.)
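
For illustration, a minimal C sketch of that conversion (the codes in
question would be Fortran, where the source form is x**2, but the point is
the same): with a constant exponent of 2 the general exponentiation routine
is never called.

    #include <math.h>

    double square_by_pow(double x) { return pow(x, 2.0); }  /* general routine      */
    double square_by_mul(double x) { return x * x; }        /* what optimizers emit */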

Now of course, a *few* codes make much more extensive use of things like
sine, cosine, etc.  But I have not found them to be "typical", even for
scientific codes (which, arguably, are already in and of themselves not
typical of codes in general).  Your mileage may vary.

The upshot of this is that if you took the transistors and the design
time that you used to implement these instructions, and instead devoted
them to, say, making floating point multiplication run 1 clock faster,
you would buy a much larger performance gain over a much wider range
of programs.


---------------------------------------------------------------
Bron Campbell Nelson       | "The usual approach is to pick one
Silicon Graphics, Inc.     | of several revolting kludges."
2011 N. Shoreline Blvd.    |              Henry Spencer
Mtn. View, CA  94039       |___________________________________
bron@sgi.com
These statements are my own, not those of Silicon Graphics.

hrubin@pop.stat.purdue.edu (Herman Rubin) (05/18/91)

In article <104811@sgi.sgi.com>, bron@bronze.wpd.sgi.com (Bron Campbell Nelson) writes:
> In article <9105171235.AA20139@ucbvax.Berkeley.EDU>, jbs@WATSON.IBM.COM writes:
> > 
> >          I agree that easier communication between the units would help
> > in computing elementary functions.  However the gains would not be
> > spectacular (I guess 10% or so) and elementary functions themselves don't
> > seem all that important (anybody have numbers on this?).
> 
> In my experience, even for floating point intensive scientific codes,
> the only functions that really use any time are square root and
> exponentiation.  Square root will often account for between 5% and 15% of
> the cpu time (shading to the high side), and exponentiation between 1% and
> 5% (shading to the low side).  (Exponentiation would be higher except that
> the vast majority of exponents are "2", and many optimizers convert x**2
> into x*x.)

If square root takes that much of the time, it must be a software square root.
If the hardware has floating division, a simple modification of the divide
unit gives square root as well; I agree that too many implementations leave
it out, even though the added hardware is extremely small.
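
For illustration, a restoring digit-by-digit square root uses essentially the
same shift/subtract/compare loop as a hardware divider, which is why adding it
to a divide unit is cheap.  A minimal C sketch, shown for a 32-bit integer
rather than a floating-point significand (the name isqrt is just for this
example):

    #include <stdint.h>

    uint32_t isqrt(uint32_t n)
    {
        uint32_t root = 0;
        uint32_t bit  = 1u << 30;          /* highest power of 4 in 32 bits   */

        while (bit > n)                    /* scale down to the operand       */
            bit >>= 2;

        while (bit != 0) {
            if (n >= root + bit) {         /* trial subtract, as in division  */
                n    -= root + bit;
                root  = (root >> 1) + bit; /* develop one result bit per pass */
            } else {
                root >>= 1;                /* keep the partial remainder      */
            }
            bit >>= 2;
        }
        return root;                       /* floor(sqrt(original n))         */
    }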

As for the other elementary functions, the only hardware algorithms I know of,
other than microcode, would require a lot of read-only memory.  But much
exponentiation is to small integer powers, and here both integer and floating
exponentiation should be optimized for fixed exponents (although I would be
somewhat hesitant about allowing an optimizer to replace a function call).
I suspect that hardware for raising to non-negative integer powers
without computing transcendental functions, even as microcode, would pay.
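
For illustration, raising to a non-negative integer power without any
transcendental functions is just binary (square-and-multiply) exponentiation,
about 2*log2(n) multiplies in the worst case; a hardware or microcode version
would walk the exponent bits the same way.  A minimal C sketch (the name powi
is just for this example):

    double powi(double x, unsigned n)
    {
        double result = 1.0;
        while (n != 0) {
            if (n & 1)        /* this bit of the exponent is set */
                result *= x;
            x *= x;           /* square for the next bit         */
            n >>= 1;
        }
        return result;
    }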

The original thread, as cited, was about the communication between integer and
floating units.  Even such simple things as multiplication by variable
integers are very likely as common as exponentiation.  Another situation
is Boolean operations on both fixed-point and floating numbers; fixed point
is not restricted to integers.  Also, the availability of hardware operations
leads to their increased use; on a machine with no communication between the
units, a knowledgeable programmer would avoid the need, for example by using
some other algorithm.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)

bengtl@maths.lth.se (Bengt Larsson) (05/19/91)

In article <12488@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
>
>The original thread, as cited, was about the communication between integer and
>floating units.  Even such simple things as multiplication by variable
>integers are very likely as common as exponentiation.  

Give some qualified reason for it being likely common. 

>Another situation
>is Boolean operations on both fixed-point and floating numbers; fixed point
>is not restricted to integers.  

In what algorithms?

>Also, the availability of hardware operations
>leads to their increased use; on a machine with no communication between the
>units, a knowledgeable programmer would  avoid the need by using some other
>algorithm, for example.

Apparently not, if the analyses of various (old) CISC machines count.
There were many operations which were very seldom used, even though
they existed.

Bengt Larsson.
-- 
Bengt Larsson - Dep. of Math. Statistics, Lund University, Sweden
Internet: bengtl@maths.lth.se             SUNET:    TYCHE::BENGT_L

hrubin@pop.stat.purdue.edu (Herman Rubin) (05/19/91)

In article <1991May19.080051.29716@lth.se>, bengtl@maths.lth.se (Bengt Larsson) writes:
> In article <12488@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
> >
> >The original thread, as cited, was about the communication between integer and
> >floating units.  Even such simple things as multiplication by variable
> >integers are very likely as common as exponentiation.  
> 
> Give some qualified reason for it being likely common. 

Other than having personally used it quite a bit, I cannot give you examples
immediately.  And again we have the "chicken and the egg" phenomenon; those
who understand the weaknesses of the hardware will try to avoid running into
the problems.  The use of the index of a do loop for floating point operations
internally is not unusual.

Also, how would you know if it is being used?  On some machines, converting
an integer to floating is done by a normalized addition of 0.  On the CRAYs,
there is a very odd instruction used for the purpose, but the instruction 
may be used once for converting many integers to floating.  It COULD be
detected, but not that easily.
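
For illustration, a related trick on an IEEE 754 machine (the CRAY format is
different): plant the integer in the low mantissa bits of 2^52 and let one
normalized floating subtract finish the conversion.  A hedged C sketch, with
the function name chosen for this example only:

    #include <stdint.h>
    #include <string.h>

    double uint_to_double(uint32_t n)
    {
        /* 0x433... is 2^52 as a double bit pattern; OR-ing n into the low
           mantissa bits gives the exact value 2^52 + n.                    */
        uint64_t bits = 0x4330000000000000ULL | (uint64_t)n;
        double d;
        memcpy(&d, &bits, sizeof d);       /* reinterpret the bits as a double */
        return d - 4503599627370496.0;     /* subtract 2^52, leaving (double)n */
    }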

> >Another situation
> >is Boolean operations on both fixed-point and floating numbers; fixed point
> >is not restricted to integers.  
> 
> In what algorithms?

Many function programs use table lookup.  This requires extracting the
relevant bits from the floating point number, and frequently obtaining
the difference.  Consider the sheer number of operations required on
hardware without the communications capabilities.  Pipelining can reduce
the effect of operation time, but not the number of sequential instructions
required to do a simple task.
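
For illustration, a hedged C sketch of that kind of bit extraction, assuming
IEEE 754 doubles and a normal (non-denormal) argument; the function and table
layout are invented for this example, not taken from any particular library:

    #include <stdint.h>
    #include <string.h>

    /* Split x = 2^e * m for a table-driven function: the top 7 mantissa
       bits pick a table entry, and the caller works on the small
       difference between the significand and that breakpoint.            */
    void split_for_lookup(double x, int *exponent, int *index, double *residual)
    {
        uint64_t bits;
        memcpy(&bits, &x, sizeof bits);                   /* Boolean access   */

        *exponent = (int)((bits >> 52) & 0x7FF) - 1023;   /* unbiased exponent */
        *index    = (int)((bits >> 45) & 0x7F);           /* top 7 mantissa bits */

        uint64_t sig = (bits & 0x000FFFFFFFFFFFFFULL) | 0x3FF0000000000000ULL;
        double m;
        memcpy(&m, &sig, sizeof m);                       /* m in [1, 2)      */
        *residual = m - (1.0 + *index / 128.0);           /* the "difference" */
    }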

Fixed point arithmetic is little used now because the hardware to support
it reasonably well does not exist.  The situation is worse than that of
floating point before hardware floating arithmetic, especially if floating is
automatically normalized.  THAT feature of "modern" architectures is, in my
opinion, a sheer horror.

In the early FP computers, much function calculation was done in fixed point,
to get increased accuracy at little cost.

> >Also, the availability of hardware operations
> >leads to their increased use; on a machine with no communication between the
> >units, a knowledgeable programmer would  avoid the need by using some other
> >algorithm, for example.
> 
> Apparently not, if the analyses of various (old) CISC machines count.
> There were many operations which were very seldom used, even though
> they existed.

How do you expect users who do not even know of the existence of the operations
to use them?  

There are many more algorithms than are in the philosophy of software, and
especially hardware, designers.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)

bengtl@maths.lth.se (Bengt Larsson) (05/21/91)

In article <12505@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
>In article <1991May19.080051.29716@lth.se>, bengtl@maths.lth.se (Bengt Larsson) writes:
>> In article <12488@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
>> >The original thread, as cited, was about the communication between integer and
>> >floating units.  Even such simple things as multiplication by variable
>> >integers are very likely as common as exponentiation.  
>> 
>> Give some qualified reason for it being likely common. 
>
>Other than having personally used it quite a bit, I cannot give you examples
>immediately.  

Figures.

>The use of the index of a do loop for floating point operations
>internally is not unusual.

"not unusual"? Prove it.

>Many function programs use table lookup.  This requires extracting the
>relevant bits from the floating point number, and frequently obtaining
>the difference.

Would be interesting to hear from someone who has _written_ such
"function programs" (assuming you mean subroutine libraries).

>Consider the sheer number of operations required on
>hardware without the communications capabilities.  

How big a number is "sheer", Mr. Rubin?

>How do you expect users who do not even know of the existence of the operations
>to use them?

By using a compiler which knows about them.  More cost- and people-effective
to do it that way, generally.

Bengt Larsson.
-- 
Bengt Larsson - Dep. of Math. Statistics, Lund University, Sweden
Internet: bengtl@maths.lth.se             SUNET:    TYCHE::BENGT_L

jlg@cochiti.lanl.gov (Jim Giles) (05/21/91)

In article <1991May21.010928.20316@lth.se>, bengtl@maths.lth.se (Bengt Larsson) writes:
|> In article <12505@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
|> [...]
|> >The use of the index of a do loop for floating point operations
|> >internally is not unusual.
|> 
|> "not unusual"? Prove it.

I've been staying out of this thread until now, but this is ridiculous.
I suppose if Herman had said that it is not unusual to use the letter
'a' as the first character of a variable name, you'd have asked him
to prove that too?  _OF COURSE_ it's common to use integer iteration
variables to drive floating point calculations - it is the _recommended_
way to do so!

     for (i=0;i<10000;i++){            for (x=0.0;x<100.0;x+=0.01){
        x=i*0.01;                         ...
        ...                            }
     }

The left loop is more accurate than the right one.  The right one may not
even take the right number of steps on some machines (and this becomes more
likely as the number of steps increases).
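
A small C demonstration of the difference, assuming IEEE 754 arithmetic
(the printed values will vary with the floating-point format):

    #include <stdio.h>

    int main(void)
    {
        double x = 0.0, y;
        int i, steps = 0;

        for (i = 0; i < 10000; i++)
            x += 0.01;                        /* the right-hand loop's update */
        printf("accumulated: %.17f\n", x);    /* drifts away from 100.0       */
        printf("i*0.01 form: %.17f\n", 10000 * 0.01);

        for (y = 0.0; y < 100.0; y += 0.01)
            steps++;
        printf("steps taken: %d\n", steps);   /* whether this is the intended
                                                 10000 depends on the rounding */
        return 0;
    }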

|> [...]
|> >Many function programs use table lookup.  This requires extracting the
|> >relevant bits from the floating point number, and frequently obtaining
|> >the difference.
|> 
|> Would be interesting to hear from someone who has _written_ such
|> "function programs" (assuming you mean subroutine libraries).

I have.  Yes, you have to extract bits from the floating point values
and manipulate them as integer quantities (if you want to get the right
answer before hell freezes over).  Read any book on numerical methods.
Consider, for example "Methods and Programs for Mathematical Functions",
by Stephen Moshier, publ. Ellis Horwood Limited in England, publ. John
Wiley & Sons in USA and Canada, 1989 (the programs in the book are written
in C - _really_heavy_ use of frexp, especially in argument reduction).
Only people who don't do much numerical programming (or do it badly)
think this kind of thing is rare or unusual.
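
For a flavor of what that looks like, a hedged C sketch of frexp-style
argument reduction (not Moshier's code; log1p merely stands in for whatever
polynomial a real library would use on the reduced range):

    #include <math.h>

    double log_via_frexp(double x)             /* assumes x > 0               */
    {
        int e;
        double m = frexp(x, &e);               /* x = m * 2^e, 0.5 <= m < 1   */
        return log1p(m - 1.0)                  /* log of the reduced argument */
             + e * 0.69314718055994530942;     /* plus e * log(2)             */
    }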

Note:  Yes, this means that if you want to do _accurate_ and _efficient_
numerical programming, you often have to do machine dependent things.
That's the way it is.

J. Giles