[net.arch] IBM 360 float architecture problems

johnl@ima.UUCP (07/15/85)

A while ago I said I'd send out a note on how the design of the floating
point unit on the System/360 was botched.  They ended up with a design
which is quite fast but produced poor answers, losing 3 bits of accuracy
in each operation.

Here's the sketch of what IBM did.  They seem to have thought that they could 
get faster floating point using hex rather than binary without losing 
precision, but they were wrong.  The analysis went sort of like this: we look 
at the high order digit of a hex floating point number.  Assuming that leading 
digits are uniformly distributed from 1 to F, on the average there will be one 
leading zero bit.  But since we use hex rather than binary exponents, we can 
make the exponent one bit smaller than if we had a binary exponent and the 
fraction one bit bigger, and you don't lose any precision, get better exponent 
range, and it's faster since you need fewer normalization steps to normalize 
hex rather than binary numbers.  

It turns out that leading digits are geometrically distributed so that on the 
average there are two leading zeros rather than one, so you lose one bit that 
way.  Also, when you use binary floating point you know that the high order 
bit of every normalized number is a 1 so you don't have to store it -- in hex 
this "hidden bit" trick doesn't work.  Third, the 360's floating point unit 
truncates results rather than rounding (again in the interest of speed) which 
is equivalent to keeping one less bit than if you rounded.  This means that
compared to a machine like the VAX, a 360 gets 3 bits less precision on each
floating point operation of the same word size.
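The leading-zero argument is easy to make concrete. Here is a minimal Python sketch (my own illustration, not anything of IBM's) counting how many of the 24 fraction bits in the 360 short format are wasted for each possible leading hex digit:

```python
# In the 360 short format the fraction is 6 hex digits (24 bits),
# normalized so the leading hex digit is 1..F.  Any leading zero *bits*
# inside that first digit are wasted precision.

def leading_zero_bits(digit):
    """Leading zero bits in a 4-bit leading hex digit (1..15)."""
    zeros = 0
    for bit in (8, 4, 2):          # examine the top three bits of the digit
        if digit & bit:
            break
        zeros += 1
    return zeros

for d in range(1, 16):
    sig = 24 - leading_zero_bits(d)
    print(f"leading digit {d:X}: {sig} significant bits")

# Worst case (leading digit 1) keeps only 21 of 24 bits; a binary format
# of the same width always keeps 24 (25 with a VAX-style hidden bit).
```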

The points about faster normalization were true, so the 360's floating point 
unit produces inaccurate results very quickly.  Originally, the floating 
point unit computed results only to the fraction width, keeping no guard digits, 
which produced incredibly wrong results.  IBM retrofitted guard digits 
shortly after the first 360s were shipped and customers started to complain.  
Their eventual precision fix has been to add quadruple precision floating 
point, since it's too late to change floating point formats now.  

John Levine, Javelin Software, Cambridge MA 617-494-1400
{ decvax!cca | think | ihnp4 | cbosgd }!ima!johnl, Levine@YALE.ARPA

ark@alice.UUCP (Andrew Koenig) (07/18/85)

John Levine argues in a rather lengthy article that the IBM floating
point format loses 3 bits compared with a binary format.  This isn't
quite true; he neglects to mention the gain of two bits because the
exponent is a power of 16 rather than a power of 2.  Of course, when
compared to the VAX, the IBM format chooses to realize that gain by
allowing numbers up to 7e75 rather than 1e38, but it's still a gain.

And chopped arithmetic has its advantages too.  For instance, conversion
from double to single cannot overflow, as it can on the VAX.

franka@mmintl.UUCP (Frank Adams) (07/18/85)

In article <36900010@ima.UUCP> johnl@ima.UUCP writes:
>Assuming that leading 
>digits are uniformly distributed from 1 to F, on the average there will be one
>leading zero bit.  But since we use hex rather than binary exponents, we can 
>make the exponent one bit smaller than if we had a binary exponent and the 
>fraction one bit bigger, and you don't lose any precision, get better
>exponent range, and it's faster since you need fewer normalization steps to
>normalize hex rather than binary numbers.  
>
>It turns out that leading digits are geometrically distributed so that on the 
>average there are two leading zeros rather than one, so you lose one bit that 
>way.

Actually, it's worse than that.  The proper measure of the precision is not
the average precision, but the worst precision -- the precision is what you
can count on in your results.  Thus you lose two bits, not one.

carter@masscomp.UUCP (Jeff Carter) (07/20/85)

In article <4006@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:
>John Levine argues in a rather lengthy article that the IBM floating
....etc..
>allowing numbers up to 7e75 rather than 1e38, but it's still a gain.
>
>And chopped arithmetic has its advantages too.  For instance, conversion
>from double to single cannot overflow, as it can on the VAX.

There seems to be a little confusion here concerning chopped
arithmetic/IBM format/VAX format for floating point.
The reason that you can't overflow from double -> single on some
hardware architectures relates to the width of the exponent field in
the floating point number. In the IEEE format, 8 bits are used
in single precision for the biased exponent, and 11 bits are used
in double precision. It does NOT matter if you round/chop/clear/set
the low order bits, the number will overflow if the exponent won't 
fit in the space allotted.  It's been a long time since I used 
360/370 family machines, but I seem to remember that there 
are indeed more exponent bits in double precision than single.
( I could be wrong on the IBM details; a lot of water's passed under
  the bridge since I last used an IBM.  I prefer computers that
  accurately reflect the calculations I intended to perform. )
This means that the IBM *will* overflow on conversions, totally
unrelated to the rounding mode.
The last machine I worked on that used the same number of bits
for the exponent in both SP and DP was a DG MV/8000 (and I presume the 4, 6, & 10).
The 3 saved bits are used as part of the mantissa, resulting in much 
more accurate (but smaller dynamic range) calculations.

Jeff Carter
MASSCOMP
1 Technology Park, Westford, MA 01886
UUCP: ....!{ihnp4|allegra|decvax}!masscomp!carter

ark@alice.UUCP (Andrew Koenig) (07/20/85)

Jeff Carter tried to correct a point I made with too little detail:

> The reason that you can't overflow from double -> single on some
> hardware architectures relates to the width of the exponent field in
> the floating point number.

This may well be true, but is unrelated to the point I made.

Let me be more specific.  The VAX floating-point format is
identical in single and double precisions except for the
number of mantissa bits.  However, overflow is still possible
on double->single conversion because the VAX rounds rather
than chops.  All you need to do is convert some number
with the maximum possible exponent, all the high-order
mantissa bits (i.e., the ones that correspond to short format)
turned on, and the next bit also turned on.  Then the rounding
that accompanies the conversion will cause an overflow.
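A toy model in Python (my own sketch; the field widths here are invented, not the VAX's) shows the mechanism:

```python
from fractions import Fraction

# Toy format: mantissa normalized to [1/2, 1), MBITS fraction bits in
# "single", at least one extra bit in "double", maximum exponent EMAX.
EMAX, MBITS = 3, 4                 # assumed toy values

def chop(mant, bits):
    """Truncate a mantissa to `bits` fraction bits."""
    return Fraction(int(mant * 2**bits), 2**bits)

def round_nearest(mant, bits):
    """Round a mantissa to the nearest multiple of 2**-bits."""
    return Fraction(round(mant * 2**bits), 2**bits)

largest_single = (1 - Fraction(1, 2**MBITS)) * 2**EMAX

# A double value with the maximum exponent, all single-format mantissa
# bits set, and the next bit set as well:
mant = 1 - Fraction(1, 2**(MBITS + 1))

print(chop(mant, MBITS) * 2**EMAX <= largest_single)           # True
print(round_nearest(mant, MBITS) * 2**EMAX <= largest_single)  # False
```

Chopping just drops the guard bit and the result still fits; rounding carries into the mantissa, renormalization bumps the exponent past EMAX, and the conversion overflows.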

Jeff goes on to say:

> It's been a long time since I used 
> 360/370 family machines, but I seem to remember that there 
> are indeed more exponent bits in double precision than single.
> ( I could be wrong on the IBM details; a lot of water's passed under
>   the bridge since I last used an IBM.  I prefer computers that
>   accurately reflect the calculations I intended to perform. )

He is indeed wrong; the 370 has 7 exponent bits in both single
and double.  Since the exponent represents a power of 16, the
largest magnitude is about 7.237e75.
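That figure is a one-line check, given the format as described (7-bit excess-64 exponent of 16, 6-hex-digit short fraction):

```python
# Largest short-format magnitude: the maximum exponent is 16**(127-64)
# = 16**63, and the largest normalized 6-hex-digit fraction is 1 - 16**-6.
largest = (1 - 16**-6) * 16.0**63
print(f"{largest:.4g}")            # roughly 7.237e+75
```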

herbie@watdcsu.UUCP (Herb Chong [DCS]) (07/20/85)

In article <741@masscomp.UUCP> carter@masscomp.UUCP (Jeff Carter) writes:
>It's been a long time since I used 
>360/370 family machines, but I seem to remember that there 
>are indeed more exponent bits in double precision than single.
>( I could be wrong on the IBM details; a lot of water's passed under
>  the bridge since I last used an IBM.  I prefer computers that
>  accurately reflect the calculations I intended to perform. )
>This means that the IBM *will* overflow on conversions, totally
>unrelated to the rounding mode.

The first eight bits are the sign bit and a 7-bit exponent in excess-64
notation to the base 16, in all three floating point formats on 360/370
machines.  I have the Principles of Operation open right in front of me
as I write this.  You get a maximum of 24, 56, and 112 bits of precision in
short, long, and extended floating point.

Herb Chong...

I'm user-friendly -- I don't byte, I nybble....

UUCP:  {decvax|utzoo|ihnp4|allegra|clyde}!watmath!water!watdcsu!herbie
CSNET: herbie%watdcsu@waterloo.csnet
ARPA:  herbie%watdcsu%waterloo.csnet@csnet-relay.arpa
NETNORTH, BITNET, EARN: herbie@watdcs, herbie@watdcsu

hes@ecsvax.UUCP (Henry Schaffer) (07/22/85)

Really Re: IBM 360 exponent length
  I believe that the exponent stays the same for single, double
 (and quadruple) precision.  That's a plus in changing precisions,
but a loss in that the higher precisions could well spare a few
bits to greatly extend the range of possible magnitudes.  An extra
two bits in the exponent would increase the 10E78 to 10E300 or
so.  In the types of scientific computing I've done that would make
a substantial improvement.  --henry schaffer

carter@masscomp.UUCP (Jeff Carter) (07/23/85)

My apologies to all those who noticed my error concerning
360 floating point format.  That should teach me to trust my 
memory.  But I still wonder: what is the great advantage of chopped
arithmetic?  That was the original subject of the article 
replied to.  First, the correction: the IBM floating point format
(as with several other vendors) uses the same number of exponent
bits for all precisions.  On to the other issue: One (alleged) advantage 
of chopped (as opposed to rounded) arithmetic is that you can never
overflow on a conversion of double -> single precision, given that
the number of exponent bits is the same.  The only time this could 
occur is at the very edges of the dynamic range of representable
numbers.  What you give up for this is one bit of precision on *all*
calculations.  Is this a fair trade?  I'm not entirely convinced.
If there are others who firmly believe that this is the best way to 
do it, I'm always glad to listen. 
I'm a firm believer in the IEEE representation.  As Henry
Schaffer pointed out, 10^300 is sometimes a realistic number, so
it needs to be provided for.
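The one-bit cost of chopping can be seen numerically; here is a small sketch (toy precision, not any real machine's format):

```python
import random

# Quantize to BITS fraction bits by chopping vs. rounding and compare
# worst-case errors: chopping can be off by almost a full ulp (unit in
# the last place), rounding by at most half an ulp -- i.e. chopping
# costs about one bit of precision on every operation.
BITS = 8
ULP = 2.0 ** -BITS

def chop(x):
    return int(x * 2**BITS) / 2**BITS

def round_nearest(x):
    return round(x * 2**BITS) / 2**BITS

random.seed(1)
xs = [random.random() for _ in range(100_000)]
worst_chop = max(abs(x - chop(x)) for x in xs)
worst_round = max(abs(x - round_nearest(x)) for x in xs)

print(worst_chop <= ULP and worst_chop > ULP / 2)   # chopping: up to ~1 ulp
print(worst_round <= ULP / 2)                       # rounding: at most 1/2 ulp
```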

thomson@uthub.UUCP (Brian Thomson) (07/25/85)

> One (alleged) advantage 
> of chopped (as opposed to rounded) arithmetic is that you can never
> overflow on a conversion of double -> single precision, given that
> the number of exponent bits is the same. The only time this could 
> occur is at the very edges of the dynamic range of representable
> numbers.

Since the rounded result is more accurate, one could argue that
in such a situation overflow is the preferred result.  The defence
of "it's not a bug, it's a feature" was probably invented by a
marketing department.
-- 
		    Brian Thomson,	    CSRI Univ. of Toronto
		    {linus,ihnp4,uw-beaver,floyd,utzoo}!utcsrgv!uthub!thomson