[comp.arch] Round-off

sims@stsci.EDU (Jim Sims) (01/09/88)

I have been told and am seeking to confirm that the 309x (and others?) series
from IBM does HEX roundoff. Is this for real?
-- 
    Jim Sims     Space Telescope Science Institute  Baltimore, MD 21218
    UUCP:  {arizona,decvax,hao,ihnp4}!noao!stsci!sims
    SPAN:  {SCIVAX,KEPLER}::SIMS
    ARPA:  sims@stsci.edu

lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) (01/10/88)

In article <189@mithras> sims@stsci.EDU (Jim Sims) writes:
>I have been told and am seeking to confirm that the 309x (and others?) series
>from IBM does HEX roundoff. Is this for real?

Yes. The floating point format normalizes to the hex digit, not to the 
bit level. This means that a normalized number can still have one (or two
or three) leading zero bits.
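
For concreteness, here is a small decoder for the short (4-byte) format, written
from the published layout (sign bit, 7-bit excess-64 characteristic, 24-bit
fraction) - my own sketch, not IBM code.  It shows that the normalized
representation of 1.0 already carries three leading zero bits:

#include <stdio.h>
#include <stdint.h>
#include <math.h>

/* value = (-1)^sign * 0.F * 16^(characteristic - 64) */
static double hexfp_decode(uint32_t w)
{
    int    sign = (w >> 31) & 1;
    int    expo = (int)((w >> 24) & 0x7f) - 64;   /* excess-64 characteristic */
    double frac = (w & 0xffffff) / 16777216.0;    /* 24-bit fraction, 0 <= F < 1 */
    double v    = frac * pow(16.0, (double)expo);
    return sign ? -v : v;
}

int main(void)
{
    uint32_t one  = 0x41100000u;  /* 1.0: leading hex digit 1 = binary 0001,
                                     three leading zero bits, yet normalized */
    uint32_t half = 0x40800000u;  /* 0.5: leading hex digit 8, no zero bits  */
    printf("%g %g\n", hexfp_decode(one), hexfp_decode(half));   /* 1 0.5 */
    return 0;
}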

Gene Amdahl has claimed that he did this in the hope of saving hardware.
Apparently the saving wasn't worth much. The downside is mostly apparent
to numeric analysts, who can't characterize accumulated roundoff as well
as they'd like.
-- 
	Don		lindsay@k.gp.cs.cmu.edu    CMU Computer Science

hes@ecsvax.UUCP (Henry Schaffer) (01/10/88)

In article <614@PT.CS.CMU.EDU>, lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) writes:
> In article <189@mithras> sims@stsci.EDU (Jim Sims) writes:
> >I have been told and am seeking to confirm that the 309x (and others?) series
> >from IBM does HEX roundoff. Is this for real?
> 
> Yes. The floating point format normalizes to the hex digit, not to the 
> bit level. This means that a normalized number can still have one (or two
> or three) leading zero bits.
> 
> Gene Amdahl has claimed that he did this in the hope of saving hardware.
                                                          ^^^^^^^^^^^^^^^
This tradeoff goes back to the original 360 - and seems to have originated
with the desire to have a floating point word fit in 4 bytes, and so the
exponent was in 1 byte.  In order to have an acceptable range of
magnitude, the exponent had to shift more than binary, and HEX was chosen.
Of course with hex normalization, any hex digit could be most significant,
and so necessarily there could be 1-3 leading zero *bits*, but that is just
part of the leading non-zero hex digit.

Many of us were less than happy with the 4 byte floating point word -
feeling that it had less precision than desirable in the mantissa and
less range than desirable in the exponent.  (The 48 bit floating point
in the CDC 1604 had room for a range of 10^+-300 and about 13 decimal 
digits of precision.)  The answer of course was the doubleword REAL*8, but
it still had the limited dynamic range.  I believe the reason the short
precision was chosen as the default was to make the most of the small
memories of those days.  256K or 512K was considered *big*.
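
(For reference, my own arithmetic rather than anything from the postings: the
short format is a sign bit, a 7-bit excess-64 exponent, and a 24-bit fraction,
so the magnitude range is roughly 16^-65 to 16^63, i.e. about 10^-79 to 10^75,
and the precision is 21 to 24 significant bits, or about 6 to 7 decimal digits.
REAL*8 stretches the fraction to 56 bits - around 16 decimal digits - but keeps
the same 7-bit exponent, hence the limited dynamic range mentioned above.)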

> Apparently the saving wasn't worth much. The downside is mostly apparent
> to numeric analysts, who can't characterize accumulated roundoff as well
> as they'd like.
Why not?  Why can't they just treat it as hex roundoff - without worrying
that the first bit in hex 1-7 is a 0?
> -- 
> 	Don		lindsay@k.gp.cs.cmu.edu    CMU Computer Science
--henry schaffer  n c state univ

lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) (01/11/88)

In article <4404@ecsvax.UUCP> hes@ecsvax.UUCP (Henry Schaffer) writes:
>In article <614@PT.CS.CMU.EDU>, lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) writes:
>> The floating point format normalizes to the hex digit, not to the 
>> bit level. This means that a normalized number can still have one (or two
>> or three) leading zero bits.
>> Gene Amdahl has claimed that he did this in the hope of saving hardware.
>                                                          ^^^^^^^^^^^^^^^
>This tradeoff goes back to the original 360 ...

Yes, Amdahl was the designer of the 360. My statement was based on a
statement by one of the gentlemen who designed the IEEE format. He said
that he, personally, had gone into Amdahl's office and asked why. Amdahl's
answer was reportedly that he hoped to save logic in the normalization
section of FPU's.

If you read the original articles about the 360/91, you'll find mention
of the normalization hardware: IBM thought it was hot stuff.

The numeric analysis problem was actually even worse on some other machines.
For example, there was one which sometimes truncated, and sometimes
rounded - and the analyst couldn't predict which he'd get !
-- 
	Don		lindsay@k.gp.cs.cmu.edu    CMU Computer Science

jejones@mcrware.UUCP (James Jones) (01/11/88)

In article <4404@ecsvax.UUCP>, hes@ecsvax.UUCP (Henry Schaffer) writes:
> Why not?  Why can't [numerical analysts] just treat it as hex roundoff -
> without worrying that the first bit in hex 1-7 is a 0?

It turns out that there are some proofs of convergence that count on the
property

			(x * 2) / 2 = x

(aside from over/underflow), and using base 16 violates that property.
In fact, I recall that the ITF BASIC and PL/I exp() function once had
a bug that caused exp(1 +/- epsilon) to be zero for 0 < epsilon < some
(small) value x, and I believe that the cause of the bug was the 360/370's
use of base 16 floating-point arithmetic.
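
To see how base 16 can break it, here is a toy model of my own (six-hex-digit
fractions, truncation on renormalization - not the 360's actual data path).
Doubling a number whose leading hex digit is 8 or more forces a four-bit
renormalizing shift that throws away the low three bits, and halving cannot
bring them back:

#include <stdio.h>
#include <stdint.h>

struct hexfp { uint32_t f; int e; };    /* value = f * 16^e, 0 <= f < 2^24 */

static struct hexfp times2(struct hexfp x)
{
    uint32_t f = x.f << 1;
    int e = x.e;
    if (f >> 24) { f >>= 4; e++; }      /* renormalize: TRUNCATE four bits */
    struct hexfp r = { f, e };
    return r;
}

static struct hexfp div2(struct hexfp x)
{
    uint32_t f = x.f >> 1;              /* truncating halve */
    int e = x.e;
    if ((f >> 20) == 0) { f <<= 4; e--; }   /* leading hex digit became 0 */
    struct hexfp r = { f, e };
    return r;
}

int main(void)
{
    struct hexfp x = { 0x800001, 0 };   /* leading hex digit is 8 */
    struct hexfp y = div2(times2(x));
    printf("x = %06X*16^%d, (x*2)/2 = %06X*16^%d\n",
           (unsigned)x.f, x.e, (unsigned)y.f, y.e);
    /* prints 800001*16^0 vs 800000*16^0: the low bits are gone for good */
    return 0;
}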

		James Jones

hutchson@convexc.UUCP (01/12/88)

An example of the sort of behavior that makes numerical analysts run
screaming from hexadecimal floating point:
(a+b)/2 is NOT always in the closed interval [a,b]!
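
(One concrete case, using my own arithmetic with six-hex-digit fractions and
truncation: let a = b = .800001 x 16^1.  Then a+b = 1.000002 x 16^1, which
renormalizes to .100000 x 16^2 with the low bits truncated; halving gives
.080000 x 16^2 = .800000 x 16^1, strictly less than both a and b.  So the
computed midpoint falls outside [a,b] even when a = b.)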

hes@ecsvax.UUCP (Henry Schaffer) (01/12/88)

In article <589@mcrware.UUCP>, jejones@mcrware.UUCP (James Jones) writes:
> In article <4404@ecsvax.UUCP>, hes@ecsvax.UUCP (Henry Schaffer) writes:
> > Why not?  Why can't [numerical analysts] just treat it as hex roundoff -
> > without worrying that the first bit in hex 1-7 is a 0?
> 
> It turns out that there are some proofs of convergence that count on the
> property
> 
> 			(x * 2) / 2 = x
> 
> (aside from over/underflow), and using base 16 violates that property.
  Clearly this is always true for binary floating point, because the mult
and div only affect the exponent.  In hex it is not always true - but this
also seems (to me) to be the case for *any* base other than binary.
Does this mean that only computers with binary floating point (including
binary normalization) are acceptable for numerical analysis?

  If so, then I object.  The only reason we build computers with binary
circuitry is the conjunction of economics with engineering/physics.  If we
had good deci-stable devices (cf. bi-stable) then we could have computers
which were internally decimal - and that would be preferable.  I would hope
that numerical analysts could deal with such computers.   Humph!  :-)

>... 
> 		James Jones
--henry schaffer  n c state univ

ok@quintus.UUCP (Richard A. O'Keefe) (01/13/88)

In article <4412@ecsvax.UUCP>, hes@ecsvax.UUCP (Henry Schaffer) writes:
> circuitry is the conjunction of economics with engineering/physics.  If we
> had good deci-stable devices (cf. bi-stable) then we could have computers
> which were internally decimal - and that would be preferable.  I would hope
> that numerical analysts could deal with such computers.   Humph!  :-)

There was an article in the Australian Computer Journal sometime in the
early 70s which described a method where a BINARY computer could provide
DECIMAL floating point.  {The exponent and significand would be binary
numbers, but the base was 10.}  It was claimed that the hardware for this
would not be more complex than the hardware for base 8 (Burroughs main-
frames) or for base 16 (IBM).  The method was implemented in software
for a student BASIC system written at the University of Auckland.  The
big advantage of binary-but-base-10 floating point is that there is NO
conversion error due to change of base: if you write 0.1, you get 0.1
exactly.  Output conversion is also exact.
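
A toy version of the idea - just a sketch of the binary-fields-with-base-10
representation, not the scheme from the ACJ article - shows the point about
exact decimal fractions:

#include <stdio.h>
#include <stdint.h>

struct dfp { int64_t sig; int exp; };   /* value = sig * 10^exp */

/* add after aligning exponents; no rounding needed in this toy example */
static struct dfp dfp_add(struct dfp a, struct dfp b)
{
    while (a.exp > b.exp) { a.sig *= 10; a.exp--; }
    while (b.exp > a.exp) { b.sig *= 10; b.exp--; }
    struct dfp r = { a.sig + b.sig, a.exp };
    return r;
}

int main(void)
{
    struct dfp tenth = { 1, -1 };       /* exactly 0.1: no conversion error */
    struct dfp sum   = { 0, -1 };
    int i;
    for (i = 0; i < 10; i++)
        sum = dfp_add(sum, tenth);
    printf("base 10: %lld * 10^%d\n", (long long)sum.sig, sum.exp);
                                        /* 10 * 10^-1, i.e. exactly 1 */
    double d = 0.0;
    for (i = 0; i < 10; i++)
        d += 0.1;                       /* base 2: 0.1 is not representable */
    printf("base  2: %.17g\n", d);      /* typically 0.99999999999999989 */
    return 0;
}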

Didn't some models of the /360 truncate rather than rounding, or fail
to use guard digits, or something of that sort?  None of the numerical
analysis courses I ever took told me that (x*2)/2 had to equal x.

steve@hubcap.UUCP ("Steve" Stevenson) (01/15/88)

> Didn't some models of the /360 truncate rather than rounding, or fail
> to use guard digits, or something of that sort?  None of the numerical
> analysis courses I ever took told me that (x*2)/2 had to equal x.
 
 We tell 'em: it's ok to balance your checkbook on the 370. :-)
-- 
Steve (really "D. E.") Stevenson           steve@hubcap.clemson.edu
Department of Computer Science,            (803)656-5880.mabell
Clemson University, Clemson, SC 29634-1906

howard@cpocd2.UUCP (Howard A. Landman) (01/20/88)

In article <4404@ecsvax.UUCP> hes@ecsvax.UUCP (Henry Schaffer) writes:
>This tradeoff goes back to the original 360 - and seems to have originated
>with the desire to have a floating point word fit in 4 bytes, and so the
>exponent was in 1 byte.  In order to have an acceptable range of
>magnitude, the exponent had to shift more than binary, and HEX was chosen.
>Of course with hex normalization, any hex digit could be most significant,
>and so necessarily there could be 1-3 leading zero *bits*, but that is just
>part of the leading non-zero hex digit.

Given that hex rounding effectively trashes three bits of mantissa accuracy,
it would have been just as good to use an 11-bit exponent and a 21-bit
mantissa with normal rounding.  I can remember learning to program in PL/I
on an IBM and being REAL SURPRISED when simple calculations gave results that
were off by more than 1%.  Someone finally explained to me that you NEVER,
NEVER, NEVER use what he called IBM's "half precision" floating point format,
unless you only care about the first significant digit.

It was precisely the awfulness of IBM single precision that led Kernighan and
Ritchie to make it a required feature of the C language that all floating
point computations be done in double precision.

I seem to recall Gene Amdahl saying once that if he had it to do over again,
he would have done it differently.  Of course, once I heard someone ask him
whether he thought the IBM 360 architecture was really the best that could be
achieved (it was over 15 years old at the time), and he said something like
"Why don't you ask me if I thought it was the best that could be achieved
THEN?".  A yes answer would imply that it was dated, and a no answer would
imply that some of the decisions were compromises over which he had no control
and with which he didn't agree.

-- 
	Howard A. Landman
	{oliveb,hplabs}!intelca!mipos3!cpocd2!howard
	howard%cpocd2.intel.com@RELAY.CS.N

henry@utzoo.uucp (Henry Spencer) (01/23/88)

> It was precisely the awfulness of IBM single precision that led Kernighan and
> Ritchie to make it a required feature of the C language that all floating
> point computations be done in double precision.

Sorry, not correct.  Remember that C was originally a system implementation
language for the pdp11, not the IBM mainframes.  Dennis Ritchie has talked
about this very issue in the past.  The biggest reason for the rules about
floating-point arithmetic was that the floating-point box in the pdp11 does
not have separate 32-bit and 64-bit instructions.  Instead it has a mode
bit to select floating-point width.  This was a major headache for code
generation, so Dennis cheated by just setting the bit once (the first
instruction of all pdp11 Unix programs is SETD) and doing everything in
double precision.  Contributing reasons were the difficulty of making sure
that function arguments were the right length otherwise, and, yes, the
greater accuracy.  He undoubtedly knew about the IBM rounding problem, and
it may have had some influence, but saying that it was *the* reason is
not right.
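
For what it's worth, the rule is visible from C itself: under the original
definition, float operands are widened to double before any arithmetic, so
the sum below has type double.  (A small sketch of my own; an ANSI compiler
is allowed to keep the expression in float.)

#include <stdio.h>

int main(void)
{
    float a = 1.0, b = 2.0;
    /* Under the K&R rule, a + b is a double-precision add, so the
       expression's size equals sizeof (double); ANSI C may keep it at
       sizeof (float). */
    printf("sizeof (a + b) = %u, sizeof (double) = %u\n",
           (unsigned)sizeof(a + b), (unsigned)sizeof(double));
    return 0;
}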
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry